Iceberg cubes

Closed and Iceberg Cubes
Reduction necessity
• Data cube produces large outputs
– Input: 1,015,367 tuples (39 MB)
– Full cube: 210,343,580 tuples (8 GB), about 200 times larger
• Two methods to reduce outputs
– Iceberg cube
– Closed cube
Closed Iceberg cube
Cells and Measures
• Cell
– In an n-dimensional data cube, a cell c =
(a1,a2,…,an: m), where m is a measure, is
called a k-dimensional group-by cell if and
only if exactly k (k<=n) of the values
{a1,a2,…,an} are not * (where * stands for "all").
– Further, denote M(c) = m and V(c) =
(a1,a2,…,an).
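The cell definition can be sketched in a few lines of Python. This is only an illustration: the names `Cell`, `STAR`, and `group_by_dimensions` are assumptions made for this sketch and do not come from the slides.

```python
from typing import NamedTuple, Tuple

STAR = "*"  # assumed marker for the aggregated value "all"


class Cell(NamedTuple):
    values: Tuple[str, ...]  # V(c) = (a1, a2, ..., an)
    measure: int             # M(c) = m


def group_by_dimensions(cell: Cell) -> int:
    """Return k, the number of values in V(c) that are not '*'."""
    return sum(1 for a in cell.values if a != STAR)


# (a1, *, c1, * : 2) has two non-* values, so it is a
# 2-dimensional group-by cell in a 4-dimensional cube.
c = Cell(("a1", STAR, "c1", STAR), 2)
print(group_by_dimensions(c))
```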
Notion of Cover
• Cover
– Given two cells c = (a1,a2,…,an:m) and c’ =
(a1’,a2’,…,an’:m’), we denote V(c)<= V(c’) if
for each ai (i = 1,…,n) which is not *, ai’ = ai.
– A cell c is said to be covered by another cell c’
if, for every cell c’’ such that V(c)<=V(c’’)<=V(c’),
M(c’’) = M(c’).
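The ordering on V and the cover test can be sketched as follows. This is a hedged illustration, not an implementation from the slides: the helper names `leq`, `between`, and `is_covered_by` are hypothetical, and the sketch assumes the cube is a dictionary that already holds the measure of every cell lying between the two cells being compared.

```python
from itertools import combinations
from typing import Dict, Iterator, Tuple

STAR = "*"  # assumed marker for "all"
Values = Tuple[str, ...]


def leq(v: Values, w: Values) -> bool:
    """V(c) <= V(c'): every non-* value of v appears unchanged in w."""
    return all(a == STAR or a == b for a, b in zip(v, w))


def between(v: Values, w: Values) -> Iterator[Values]:
    """Yield every u with v <= u <= w (assumes v <= w already holds)."""
    # Positions where v is '*' but w has a concrete value may be filled or not.
    free = [i for i, (a, b) in enumerate(zip(v, w)) if a == STAR and b != STAR]
    for r in range(len(free) + 1):
        for chosen in combinations(free, r):
            u = list(v)
            for i in chosen:
                u[i] = w[i]
            yield tuple(u)


def is_covered_by(v: Values, w: Values, cube: Dict[Values, int]) -> bool:
    """True if the cell at v is covered by the different cell at w."""
    if v == w or not leq(v, w):
        return False
    # Every cell between v and w must carry the same measure as w.
    return all(cube[u] == cube[w] for u in between(v, w))


cube = {("a1", STAR, "c1", STAR): 2, ("a1", "b1", "c1", STAR): 2}
print(is_covered_by(("a1", STAR, "c1", STAR), ("a1", "b1", "c1", STAR), cube))
```

Enumerating the intermediate cells keeps the sketch faithful to the definition for any measure; for a measure such as count, comparing only the two end cells would give the same answer.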
Closed and iceberg cells
• Closed cell
– A cell is called a closed cell if it is not covered
by any other cells.
• Closed Iceberg cell
– A closed cell that satisfies the iceberg
constraint
Closed and iceberg cells (contd.)
• Let the measure be count, and the iceberg constraint be
count>=2.
• Cell1 = (a1,b1,c1,*: 2), and cell2 = (a1,*, *, * : 3) are
closed iceberg cells;
• Cell3 = (a1,*, c1,* : 2) and cell4 = (a1, b2, c2, d2 : 1) are
not: the former is covered by cell1, whereas
the latter does not satisfy the iceberg constraint.
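The example can be checked with a small brute-force sketch. The three-row base table below is hypothetical, chosen only so that the counts match the four cells named above; the function names are likewise assumptions. The sketch materialises the full count cube and filters it, which is not how BUC or Star-Cubing work and only suits tiny inputs.

```python
from collections import Counter
from itertools import product

STAR = "*"  # assumed marker for "all"

# Hypothetical base table, chosen so the counts match cell1 to cell4.
base_table = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d3"),
    ("a1", "b2", "c2", "d2"),
]


def full_count_cube(rows):
    """Count every group-by cell by starring out each subset of dimensions."""
    cube = Counter()
    for row in rows:
        for mask in product((False, True), repeat=len(row)):
            cube[tuple(STAR if hide else a for a, hide in zip(row, mask))] += 1
    return cube


def leq(v, w):
    """V(c) <= V(c'): every non-* value of v appears unchanged in w."""
    return all(a == STAR or a == b for a, b in zip(v, w))


def closed_iceberg_cells(cube, min_count):
    """Keep cells with count >= min_count that no other cell covers.

    For count, a cell is covered exactly when some strictly more specific
    cell has the same count, so intermediate cells need no separate check.
    """
    result = {}
    for v, m in cube.items():
        if m < min_count:
            continue  # fails the iceberg constraint
        covered = any(w != v and leq(v, w) and cube[w] == m for w in cube)
        if not covered:
            result[v] = m
    return result


cube = full_count_cube(base_table)
closed = closed_iceberg_cells(cube, min_count=2)
for values, count in closed.items():
    print(values, count)
```

On this table the only survivors are (a1,b1,c1,* : 2) and (a1,*,*,* : 3), that is, cell1 and cell2.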
Methods of computation
• Top-down method
• Bottom-up method
• Algorithms for computing these cubes,
such as BUC, MultiWay array aggregation, and
Star-Cubing, are explained in detail in
the next resource in this module.