next up previous
Next: Cube Optimizations Up: Implementation and Optimizations Previous: Partial cubes

Number of aggregates to materialize for mining 2-way associations

In a full data cube, the number of aggregates is exponential. In the cube lattice there are n levels and for each level i there are tex2html_wrap_inline736 aggregates at each level containing i dimensions. The total number of aggregates, thus is tex2html_wrap_inline692. For a large number of dimensions, this leads to a prohibitively large number of cubes to materialize.

Next, we need to determine the number of intermediate aggregates needed to calculate all the aggregates for all levels below a specified m, m < n, for m-way associations and for meta-rules with m predicates.

For m=2, the number of aggregates at level 2 is tex2html_wrap_inline750. For level 3 it is tex2html_wrap_inline752. At each level k, the first k-1 literals in the representation are fixed and the others are chosen from the remaining. The total number of aggregates is then tex2html_wrap_inline758. After simplification this becomes tex2html_wrap_inline760. For example, when n = 10, this number is 165, a great reduction from tex2html_wrap_inline764. The savings are even more significant for n = 20, where this number is 1330, much smaller than tex2html_wrap_inline768.

The base cube is n-dimensional. As intermediate aggregates are calculated by traversing the lattice (converted into a DAG by the left schedule) up from level n, the following happens: 1) Dimensionality decreases at each level, 2) number of aggregates increase and 3) sizes of each aggregate decrease.



Sanjay Goil
Fri Aug 7 14:58:04 CDT 1998