
Data Partitioning

Data is distributed across processors so that work is divided equitably. In addition, a partitioning scheme for multidimensional data has to be dimension-aware, and the distribution must have some regularity to support dimension-oriented operations. Either a single dimension or a combination of dimensions can be distributed. To achieve sufficient parallelism, the product of the cardinalities of the distributed dimensions should be much larger than the number of processors. For example, for 5-dimensional data (ABCDE), a 1D distribution partitions A and a 2D distribution partitions AB. We assume that, in both cases, dimensions are available whose cardinalities (or product of cardinalities) are much greater than the number of processors. That is, either $|D_i| \gg p$ for some $i$, or $|D_i| |D_j| \gg p$ for some $i, j$, with $1 \le i, j \le n$, where $|D_i|$ is the cardinality of dimension $i$, $p$ is the number of processors, and $n$ is the number of dimensions. Partitioning determines the communication requirements for data movement in the intermediate aggregate calculations in the data cube. Figure 2 illustrates 1D and 2D partitions of a 3-dimensional data set on 4 processors.

 
Figure 2: 1D and 2D partitions for 3 dimensions on 4 processors
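
The 1D and 2D distributions described above can be made concrete with a small sketch. It assumes a contiguous block distribution along each distributed dimension, which the text does not specify, and the names block_bounds and partition are hypothetical helpers introduced only for illustration. The 8 x 8 x 8 shape used in the usage lines is likewise an assumed example, not taken from the figure.

```python
from math import prod

def block_bounds(cardinality, parts, k):
    """Half-open range [lo, hi) of block k when `cardinality` values
    are split into `parts` near-equal contiguous blocks."""
    base, extra = divmod(cardinality, parts)
    lo = k * base + min(k, extra)
    hi = lo + base + (1 if k < extra else 0)
    return lo, hi

def partition(shape, grid):
    """Map each processor rank to the index ranges it owns.

    shape: cardinalities of all dimensions, e.g. (|A|, |B|, |C|).
    grid:  processors along each distributed leading dimension;
           (4,) is a 1D partition of A, (2, 2) a 2D partition of A and B.
    Block distribution is an assumption of this sketch.
    """
    if any(g > s for g, s in zip(grid, shape)):
        raise ValueError("a distributed dimension is smaller than its processor count")
    owned = {}
    for rank in range(prod(grid)):
        # Convert the row-major rank into coordinates on the processor grid.
        coords, r = [], rank
        for g in reversed(grid):
            r, c = divmod(r, g)
            coords.append(c)
        coords.reverse()
        ranges = [block_bounds(shape[d], grid[d], coords[d]) for d in range(len(grid))]
        # Dimensions that are not distributed are held in full by every processor.
        ranges += [(0, s) for s in shape[len(grid):]]
        owned[rank] = ranges
    return owned

one_d = partition((8, 8, 8), (4,))    # 1D: A split over 4 processors
two_d = partition((8, 8, 8), (2, 2))  # 2D: A and B split over a 2 x 2 grid
```

In both cases each of the 4 processors holds the same number of cells, but the 2D partition cuts along two dimensions, so aggregations along A or B involve different sets of processors than in the 1D case.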



Sanjay Goil
Fri Aug 7 14:58:04 CDT 1998