Applied Dimensionality

Calculating trend lines in Cognos Report Studio and Oracle SQL

Posted at — Jan 25, 2008
Calculating trend lines in Cognos Report Studio and Oracle SQL

I’m more than just emotional right now since I’ve finally got that bugger of my neck.

To those who’re interested in getting trend calculated in oracle — press page-down once.

Trend line calculation in Cognos

It’s 2008 as my desktop calendar says and it’s “Targeted information to a broader range of users ” on top of the Cognos website. And there I am with simple requirement to calculate a sales trend line.

Just like in Wikipedia.

You know, there’s a Trend function in Excel. Since I don’t know when (it’s Office 2000 here and it’s already deep there). And, moreover, in PowerPlay I’d just clicked forecast and linear regression and voila.

But it’s single-scalable-mighty Cognos 8 platform you have to write in yourself.

With an example from Cognos Knowledge base article .

Just that in my case I had a DMR and it didn’t work “from the box”. And I even don’t know what is the “box”, since there is no such package as “Great Outdoors Company” in standard samples.

It turned out that I had to change aggregate properties of [value 1 for count] and therefore I doubt that this sample works on DMR’s as such.

Works now but it took me nearly 4 days to get it working right (mostly trying to understand how sets work with DMRs).

BI for masses? Easy analytics? Yeah…

Trend line calc in Oracle

After spending a day trying to do it in Cognos, I’ve decided to give it a try at a database level.

There’s a bunch of more than helpful analytical functions in Oracle:

See full list

And this wonderful article on using them to calculate m and b for whole table.

Then you can encapsulate the whole calculation in one statement, using sub queries. The problem is that if you have only one measurement then SXX=0 and you can’t divide by zero, that’s what case stands there for.

With Andy‘s masterly help the final variant turns into something like this:

with data_t as
(
select 1 REG,1 X, 10 Y from sys.dual
union all
select 1 REG,2 X, 20 Y from sys.dual
union all
select 1 REG,3 X, 15 Y from sys.dual
union all
select 1 REG,4 X, 40 Y from sys.dual
union all
select 1 REG,5 X, 10 Y from sys.dual
union all
select 1 REG,6 X, 30 Y from sys.dual
union all
select 1 REG,7 X, NULL Y from sys.dual
union all
select 2 REG,1 X, 10 Y from sys.dual
union all
select 2 REG,2 X, 20 Y from sys.dual
union all
select 2 REG,3 X, 15 Y from sys.dual
union all
select 2 REG,4 X, 50 Y from sys.dual
union all
select 2 REG,5 X, 10 Y from sys.dual
union all
select 2 REG,6 X, 30 Y from sys.dual
union all
select 3 REG,1 X, 300 Y from sys.dual
)
select REG,X,
SUM(
case
when ((select REGR_SXX(Y,X) from data_t t1 where T.REG = t1.REG) = 0) then (Y)
else X*(select REGR_SXY(Y,X)/REGR_SXX(Y,X) from data_t t1 where T.REG = t1.REG)+(select REGR_AVGY(Y,X) - REGR_SXY(Y,X)/REGR_SXX(Y,X)*REGR_AVGX(Y,X) from data_t t2 where T.REG=t2.REG )
end
)
as trend
from data_t T
group by reg,X
order by reg,X

Note the case — it slow as hell, you’d have to split the query into two + union to avoid this case. But it’s still rather slow, around 10 min on 1m row table.

comments powered by Disqus