Applied Dimensionality

DMR’s pros and cons

Posted at — Mar 13, 2008
DMR’s pros and cons

DMR stands for Dimensionally Modelled Relational, a Cognos modelling technique allowing to present relational data sources as OLAP cubes (this meaning adding notion of dimensions with hierarchies and measures with various aggregation rules). All OLAP-style queries, roll-ups\drill-downs are then transformed into appropriate sql (group by’s, aggregations) by Cognos Server. It’s not a Cognos-only idea, see Microsoft UDM (Unified Dimensional Model) and Oracle BI EE works essentially the same way. The whole idea of dimensions\measures seems easy to understand (not by relationally bread-ones, though) so this modelling concept is gaining popularity.

And Cognos EP published data is represented by DMRs automatically )

Having played with DMR’s on recent projects I’d like make some conclusions and grievances.

I’ll start with complaints section — more annoyance, more desire to write it out. If you know any solutions to problems stated below — please mail me.

Complaints:

1) No Sorting of Dimension Elements

We’ve encountered a simple but more than annoying problem: dimension elements wouldn’t sort anyhow.

With help on cognoise.com got to the idea that business keys sorting (alphabetical) would work for all levels, except bottom one. Then cognos support answer revealed that we had to sort elements in underlying database, so we’ve added order by clause for dimension queries.

Then it appeared that order by is ignored due to the whole dimension query is packed in partition by’s with min and etc. Manually patching oracle sql generation helps, but breaks other queries.

Using pass-through sql helps, but only as an insult — sorting works, but no query optimization from that point, only select * from and filtering being done at cognos server. Totally performance no-go.

And there’s no fix – officially Cognos BI doesn’t sort DMR modelled dimensions for now.

2) Dimension element Hide toggles full table scan

Some insights on how cross-tabs (all, including DMRs) work in Cognos BI.

Suppose you want a simple product vs years sales cross tab.

Just like this: |Products|2005|2006| |—|—|—| |Product 1| | | |Product 2| | | |Product 3| | |

What happens when run such a cross-tab against your huge fact table?

  1. Cognos generates
select year,product, sum(sales)
from sales_fact
group by year,product

returning 6 rows

  1. Result of the query are transformed into a small “cube” (powercube like) in temp\ directory (called ***.dmc)
  2. This “cube” is used as source for cross-tab

Cool, eh?

Now what will happen if we want to hide “Product 2”, for example?

  1. Cognos will generate
select year,product, sales
from sales_fact

returning millions of rows

  1. then filter them for <>Product 2 and sum qty locally
  2. create a ‘small’ cube and show you hidden result (I couldn’t wait for that, to be sincere, the mere sight of gigabyte temporary files being downloaded from dwh stressed me too much).

That’s a problem, I don’t understand why simple where Product <> in sql wouldn’t do the job.

3) Bottom level search impossible.

Having a long Customer dimension (50k of elements) — search by customer name like %Smith&% is vital. But if you’ve got hierarchies in DMRs, you can only search the current level, ‘search all details’ option is unavailable. We’ve had to create powercubes to get this functionality (and performance)

4) Hard dimensional formula writing\debug.

It ain’t easy to write MDX from the start, but it’s way harder in Cognos DMR’s.

Basically, it’s because of:

One of the recent examples was with dimension roles.

In time dimension I’ve had a special field, storing the last work day of week key.

I’ve thought that I could use it to create work days subset (like all works of the current week up to day chosen in prompt). Writing something like filter([Sales].[Time].[Day];[Sales].[Time].[Day] between roleValue('start\_of\_work\_week';[Selected Week]) and roleValue('\_businessKey';[Selected Week])). And it turned out that roleValue results can not be used for > comparison (that’s what I think after day spent testing). Why isn’t that explicitly written out?

It’s really hard to see what’s going under the hood with dimensional functions, what is passed, how it’s evaluated, how -> operator works and all that. Can I use [Sales].[Time].[Day]->roleValue('start\_of\_work_week';[Selected Week]) to get the desired day? And it isn’t written anywhere that I can find. So it all up to changing display types and trying to guess.

Pros, or why should you use DMRs:

All other things work -)

  1. Analysis studio is available and that’s a great + in my opinion. It’s really a nice tool, though still worse than client-side powerplay in aspects of advanced subsets and etc.
  2. You can write dimensional formulas (like PeriodsToDay, TopSum and a lot of others) in reports. Eases report design a lot.
  3. You can add drill-down functionality to your reports with no additional development requied.
  4. Dimension Attributes. There are none in PowerCubes and it’s a nice analytical capability that I sure don’t want to miss. Like filtering stores by square footage > 100 m^2 in Analysis Studio.
comments powered by Disqus