Applied Dimensionality

Using Oracle Model clause in real-life )

Posted at — Apr 1, 2008
Using Oracle Model clause in real-life )

I’ve been a long-term fan of oracle model by clause, talking about here and there, doing some small&funny examples (like eight queens solution in one select) but lacked some real-life necessity to use it. It always turned out that partition by was enough, or some tricky analytical function was already built-in (like trend calculations I wrote about).

And at last, I’ve encountered a suitable problem.

Task is quite simple&straightforward: calculate number of periods in which current stock will be sold in a store. Like: today we have sales enough for 3 weeks. That’s a very common sales demand analysis parameter (Months of Sales (MoS), Weeks of Sales (WoS), DoS and etc).

Formula is quite simple: Current Stock / Average Sales over some period

But there were some difficulties in my case.

I’ve got only partial data, some periods were completely missing, and the requirement was to treat missing sales data as zeros. And I also had to get the last known stock as current stock, because stock data could be missing too.

Last stock is pretty easy to get using last_value oracle function, but getting average sales is hard. You cannot use fixed window width in partition by, since data can be missing. Moreover period varied other the years, so I’ve ended up adding up a field in time dimension called periods_ago, that contained time_key reference from which to sum sales.

So I’ve started with simple subqueries

select
        time_key,
        product,
        company,
        sales_qty,
        (select sum(sales_qty)
           from dwh.sales sales2
         where
            sales2.product = sales.product,
            sales2.company = sales.company,
            sales2.time_key between weeks.periods_ago and sales.time_key
        ) as running_sales_sum
from
        dwh.time_dim weeks, dwh.sales sales
        where
           weeks.key = sales.time_key

and that is terribly slow on 10^6+ rows. ~20 minutes.

But with model by it’ll be like:

select
        time,
        product,
        company,
        sales_qty,
        last_stock_qty
from
        dwh.time_dim weeks, dwh.sales sales
        where
           weeks.key = sales.time_key
      model
        IGNORE NAV
          partition by (product,company)
            dimension by (time_key)
            measures
            (
                periods_ago,
                sales_qty,
                0 eight_weeks_sales_sum_qty,
                nvl(LAST_VALUE(STOCK IGNORE NULLS)
                OVER (PARTITION BY COMPANY, PRODUCT, SEGMENT, DISTRIB_LEVEL, SYSTEM
                ORDER BY TIME ASC),0) AS last_stock_qty
            rules   UPDATE
            (
                running_sales_sum[any] = sum(sales_qty)[time between periods_ago[cv()] and cv(time_key)]
                end
			       )
            )

around 3 minutes ) Notice the last stock calculation.

Here’s a couple links to good model by examples:

A wonderful tutorial at SQLSnippets

Official documentation restriction section — read that carefully

comments powered by Disqus