Transcript sigmod02
Cube Explorer:
Online Exploration of Data Cubes
Jiawei Han, Jianyong Wang, Guozhu Dong,
Jian Pei, Ke Wang
Mining Guided Cube Explorer
Novel Algorithms and Methods:
Faster Creation of Iceberg Cube
Predictive Gradient Analysis
Multi-dimensional Gradient Mining
Association and Sequence Cube Analysis
Integrated with commercial software such as Microsoft
OLE DB for DM, OLAP, RDBMS, and DBMiner
Cube Explorer Benefits
For Users:
Superior performance and scalability
Saves analysis cost and time: reusable mining queries that work directly on OLAP and relational data
Easy to use: SQL-like mining, integrated with data sources
Leverages OLAP and data warehouse engines: versatile functionality and strong synergy
Iceberg Cube Exploration Demo
Novel methods:
H-Tree Iceberg Cube Creation
Cube Computation with Complex Measures
Dataset: large retail POS transaction data
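To make the term concrete: an iceberg cube keeps only the group-by cells whose aggregate passes a threshold. The sketch below is a naive illustration of that idea, not the H-Tree algorithm named on this slide. The function name `iceberg_cube`, the dimension names, and the sample rows are all hypothetical, and a simple count stands in for the measure.

```python
from collections import Counter
from itertools import combinations


def iceberg_cube(rows, dims, min_count):
    """Return {(dimension_subset, values): count} for cells with count >= min_count.

    Naive approach: enumerate every subset of dimensions (every cuboid),
    count the rows in each cell, and keep only the cells above the threshold.
    """
    cells = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            counts = Counter(tuple(row[d] for d in subset) for row in rows)
            for values, count in counts.items():
                if count >= min_count:
                    cells[(subset, values)] = count
    return cells


# Hypothetical point-of-sale rows, made up for illustration only.
rows = [
    {"store": "S1", "item": "TV", "month": "Jan"},
    {"store": "S1", "item": "TV", "month": "Feb"},
    {"store": "S1", "item": "DVD", "month": "Jan"},
    {"store": "S2", "item": "TV", "month": "Jan"},
]

cube = iceberg_cube(rows, ("store", "item", "month"), min_count=2)
```

With `min_count=2`, the cell for store S1 and item TV survives, while every fully specified (store, item, month) cell is dropped because each occurs only once. A real implementation would avoid enumerating all cuboids independently; this version only shows what the result contains.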
Iceberg Cube Exploration Results
3D Visualization (Scatter Plot)
Gradient Mining Issues
1: “What products sold with ‘TV’ will significantly change the profit of ‘TV’?”
Answer:
- TV profit is up 10% when sold with DVD
- TV profit is down 5% when sold with VCR
2: “How did housing prices in Big City change in 2001 compared with 2000?”
Answer:
- Downtown apartments went up 15%, while suburban houses went down 5%
How to Mine Meaningful Changes?
1. Naïve, manual method
Compute two sub-cubes:
Big City housing in 2000
Big City housing in 2001
Tremendous costs in space and time
2. Innovation
Only interesting changes are wanted: use a “gradient constraint” to capture and predict significant changes automatically
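As a rough illustration of what a gradient constraint selects, the sketch below compares matching cells for two years and keeps only those whose measure changed by at least a given fraction. It shows the filter only, not the pruning that avoids materializing both sub-cubes. The function name `gradient_cells`, the cell layout, and the price figures are assumptions made up for this example.

```python
def gradient_cells(avg_price, base_year, probe_year, min_change):
    """Return [((district, kind), ratio)] for cells whose average price
    changed by at least min_change (as a fraction) between the two years."""
    results = []
    for (district, kind, year), base in avg_price.items():
        if year != base_year:
            continue
        probe = avg_price.get((district, kind, probe_year))
        if probe is None or base == 0:
            continue  # no matching cell to compare against
        ratio = probe / base
        if abs(ratio - 1.0) >= min_change:  # the gradient constraint
            results.append(((district, kind), ratio))
    return results


# Hypothetical average prices per (district, housing type, year) cell.
avg_price = {
    ("downtown", "apartment", 2000): 200.0,
    ("downtown", "apartment", 2001): 230.0,
    ("suburb", "house", 2000): 400.0,
    ("suburb", "house", 2001): 380.0,
    ("suburb", "apartment", 2000): 150.0,
    ("suburb", "apartment", 2001): 151.5,
}

changes = gradient_cells(avg_price, 2000, 2001, min_change=0.04)
```

Here downtown apartments (ratio about 1.15) and suburban houses (ratio about 0.95) pass the constraint, while suburban apartments (about 1.01) are filtered out as uninteresting.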
Gradient Mining Prediction Demo
Which products, when sold with ‘Muffins’, change the sales of ‘Muffins’?
Select ‘Muffins’ as the promotion itemset and average sales as the measure.
Gradient Mining: Results 1
Most profitable patterns (Ratio > 1)
Rule #1: Cereal increases ‘Muffins’ avg. sales by 8%
Gradient Mining: Results 2
Least profitable patterns (Ratio < 1)
Rule #1: Ice Cream reduces ‘Muffins’ avg. sales by 4%
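The slides do not define how the ratio is computed. One plausible reading, sketched below, divides the average sales of the target item in baskets that also contain a given product by its average sales over all baskets containing the target. The function name `gradient_ratios`, the basket layout, and the sales figures are hypothetical and do not reproduce the demo data.

```python
from collections import defaultdict


def gradient_ratios(transactions, target):
    """Map each co-purchased item to
    (avg target sales when bought together) / (avg target sales overall)."""
    with_target = [t for t in transactions if target in t]
    if not with_target:
        return {}
    base_avg = sum(t[target] for t in with_target) / len(with_target)

    sales_by_item = defaultdict(list)
    for t in with_target:
        for item in t:
            if item != target:
                sales_by_item[item].append(t[target])

    return {
        item: (sum(values) / len(values)) / base_avg
        for item, values in sales_by_item.items()
    }


# Hypothetical baskets: item -> sales amount in that transaction.
transactions = [
    {"muffins": 5.0, "cereal": 3.0},
    {"muffins": 3.0, "ice cream": 4.0},
    {"muffins": 4.0},
    {"cereal": 3.0},
]

ratios = gradient_ratios(transactions, "muffins")
```

Under this reading, a ratio above 1 (cereal in the made-up data) corresponds to the "most profitable" patterns and a ratio below 1 (ice cream) to the "least profitable" ones.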
Gradient Mining: Visualization
Results plotted as a 3D bar graph