Data Mining - Lyle School of Engineering
CSE 5331/7331
Fall 2007
Dimensional Modeling
Margaret H. Dunham
Department of Computer Science and Engineering
Southern Methodist University
Some slides extracted from Data Mining, Introductory and Advanced Topics, Prentice Hall, 2002.
CSE 5331/7331 F'07
Dimensional Modeling
View data in a hierarchical manner, more the way
business executives might
Useful in decision support systems and mining
Dimension: collection of logically related
attributes; axis for modeling data.
Facts: the data stored (the values being measured)
Ex: Dimensions – products, locations, date
Facts – quantity, unit price
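As a minimal sketch of the dimension/fact split above: each stored fact can be thought of as a cell addressed by one value per dimension. The sample rows and the names `SaleKey`, `SaleFacts`, and `facts_at` are hypothetical, made up for illustration only.

```python
from typing import NamedTuple, Optional


class SaleKey(NamedTuple):
    """One value per dimension (the axes of the model)."""
    product: str
    location: str
    date: str


class SaleFacts(NamedTuple):
    """The facts stored for one combination of dimension values."""
    quantity: int
    unit_price: float


# Hypothetical sample data, not taken from the slides.
sales = {
    SaleKey("widget", "Dallas", "2007-09-01"): SaleFacts(10, 2.50),
    SaleKey("widget", "Houston", "2007-09-01"): SaleFacts(4, 2.50),
    SaleKey("gadget", "Dallas", "2007-09-02"): SaleFacts(1, 20.00),
}


def facts_at(product: str, location: str, date: str) -> Optional[SaleFacts]:
    """Look up the facts at one point in the dimension space, if any."""
    return sales.get(SaleKey(product, location, date))


cell = facts_at("widget", "Dallas", "2007-09-01")
missing = facts_at("gadget", "Houston", "2007-09-02")
```

Here `cell` holds the quantity and unit price for one product, location, and date, while `missing` is `None` because no fact was recorded for that combination.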
Multidimensional Model Example
Fig 2 [1]
Cube view of Data
Fig 4 [1]
Aggregation Hierarchies
Multidimensional Schemas
Star Schema shows facts and dimensions
– Center of the star has facts, shown in fact tables
– Around the facts, each dimension is shown
separately in a dimension table
– Access to the fact table from a dimension table is via a join
-- quantity and price of every fact recorded for Dallas
SELECT Quantity, Price
FROM Facts, Location
WHERE (Facts.LocationID = Location.LocationID) AND
(Location.City = 'Dallas')
– Viewed as relations; the problems are volume of data and
indexing
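The join on this slide can be tried end to end with a small in-memory database. The sketch below uses Python's standard `sqlite3` module; the table layouts beyond `LocationID`, `City`, `Quantity`, and `Price`, and all sample rows, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny star schema: one fact table in the center, two dimension tables.
# Column sets and rows are hypothetical.
conn.executescript("""
    CREATE TABLE Location (LocationID INTEGER PRIMARY KEY, City TEXT, State TEXT);
    CREATE TABLE Product  (ProductID  INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Facts (
        ProductID  INTEGER REFERENCES Product(ProductID),
        LocationID INTEGER REFERENCES Location(LocationID),
        Quantity   INTEGER,
        Price      REAL
    );
    INSERT INTO Location VALUES (1, 'Dallas', 'TX'), (2, 'Houston', 'TX');
    INSERT INTO Product  VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO Facts    VALUES (1, 1, 10, 2.5), (1, 2, 4, 2.5), (2, 1, 1, 20.0);
""")

# The slide's query: reach the fact table from the Location dimension via a join.
# ORDER BY is added only to make the result order deterministic.
dallas_rows = conn.execute("""
    SELECT Quantity, Price
    FROM Facts, Location
    WHERE Facts.LocationID = Location.LocationID
      AND Location.City = 'Dallas'
    ORDER BY Quantity
""").fetchall()
```

Only the two facts whose `LocationID` points at the Dallas row come back; the Houston fact is filtered out by the join condition plus the city predicate.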
Star Schema
Flattened Star
Normalized Star
Snowflake Schema
OLAP Introduction
OLAP by Example
http://perso.orange.fr/bernard.lupin/english/index.htm
What is OLAP?
http://www.olapreport.com/fasmi.htm
OLAP
Online Analytical Processing (OLAP): supports more
complex queries than OLTP.
Online Transaction Processing (OLTP): traditional
database/transaction processing.
Dimensional data; cube view
Supports ad hoc querying
Requires analysis of data
Can be thought of as an extension of some of the basic
aggregation functions available in SQL
OLAP tools may be used in decision support systems (DSS)
Multidimensional view is fundamental
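To illustrate the point about SQL aggregation functions, the sketch below runs plain `SUM` with `GROUP BY` over one dimension at a time, which is the kind of summary an OLAP tool builds on. It uses Python's standard `sqlite3` module; the `Sales` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened sales data: two dimensions and one fact.
conn.executescript("""
    CREATE TABLE Sales (Product TEXT, City TEXT, Quantity INTEGER);
    INSERT INTO Sales VALUES
        ('widget', 'Dallas', 10),
        ('widget', 'Houston', 4),
        ('gadget', 'Dallas', 1),
        ('gadget', 'Houston', 7);
""")

# Aggregate the fact along each dimension in turn.
by_city = dict(conn.execute(
    "SELECT City, SUM(Quantity) FROM Sales GROUP BY City"
).fetchall())
by_product = dict(conn.execute(
    "SELECT Product, SUM(Quantity) FROM Sales GROUP BY Product"
).fetchall())

# Aggregating over every dimension leaves a single grand total.
grand_total = conn.execute("SELECT SUM(Quantity) FROM Sales").fetchone()[0]
```

Each `GROUP BY` answers one fixed question; an OLAP front end lets the user move between such groupings ad hoc rather than writing each query by hand.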
OLAP Implementations
MOLAP (Multidimensional OLAP)
– Multidimensional Database (MDD)
– Specialized DBMS and software system capable of
supporting the multidimensional data directly
– Data stored as an n-dimensional array (cube)
– Indexes used to speed up processing
ROLAP (Relational OLAP)
– Data stored in a relational database
– ROLAP server (middleware) creates the
multidimensional view for the user
– Less complex; less efficient
HOLAP (Hybrid OLAP)
– Data not updated frequently is stored in the MDD
– Data updated frequently is stored in the RDB
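The storage difference between MOLAP and ROLAP can be sketched with the same four values held two ways: as a dense array addressed by position, and as relational rows that must be filtered. This is a toy illustration in plain Python, not how any particular OLAP product is implemented; all names and data are hypothetical.

```python
# Dimension members, in the order used to index the array.
products = ["widget", "gadget"]
cities = ["Dallas", "Houston"]

# MOLAP-style storage: a 2-dimensional array, cube[product][city] = quantity.
cube = [
    [10, 4],  # widget: Dallas, Houston
    [1, 7],   # gadget: Dallas, Houston
]
product_index = {name: i for i, name in enumerate(products)}
city_index = {name: i for i, name in enumerate(cities)}


def molap_lookup(product: str, city: str) -> int:
    """Address the cell directly through the dimension indexes."""
    return cube[product_index[product]][city_index[city]]


# ROLAP-style storage: the same data as rows of a relation.
rows = [
    ("widget", "Dallas", 10),
    ("widget", "Houston", 4),
    ("gadget", "Dallas", 1),
    ("gadget", "Houston", 7),
]


def rolap_lookup(product: str, city: str) -> int:
    """Build the multidimensional answer by filtering the relation."""
    return sum(q for p, c, q in rows if p == product and c == city)


molap_value = molap_lookup("gadget", "Houston")
rolap_value = rolap_lookup("gadget", "Houston")
```

Both lookups return the same value; the array version goes straight to the cell, while the row version scans and filters, which hints at the efficiency trade-off listed above.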
OLAP Operations
[Figure: cube views illustrating Roll Up, Drill Down, Single Cell, Multiple Cells, Slice, and Dice]
OLAP Operations
Simple query – single cell in the cube
Slice – Look at a subcube to get more
specific information
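Dice – Look at a subcube by selecting on two or more
dimensions (a slice, then rotate the cube to select on another dimension)
Roll Up – Dimension reduction; aggregation
Drill Down – Inverse of roll up; view more detailed data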
Visualization: These operations allow the
OLAP users to actually “see” results of an
operation.
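The operations listed on this slide can be sketched on a tiny cube held as a Python dictionary. Here slice is taken as a selection on one dimension and dice as a selection on two or more dimensions, which is an assumption about the intended definitions; the data and the helpers `select` and `roll_up` are hypothetical.

```python
from collections import defaultdict

DIMS = ("product", "city", "month")

# Hypothetical cube: (product, city, month) -> quantity sold.
cube = {
    ("widget", "Dallas", "Sep"): 10,
    ("widget", "Houston", "Sep"): 4,
    ("gadget", "Dallas", "Sep"): 1,
    ("widget", "Dallas", "Oct"): 6,
    ("gadget", "Houston", "Oct"): 7,
}


def select(cells, **allowed):
    """Keep the cells whose value on each named dimension is in the allowed set."""
    def keep(key):
        return all(key[DIMS.index(dim)] in values for dim, values in allowed.items())
    return {key: qty for key, qty in cells.items() if keep(key)}


def roll_up(cells, drop):
    """Aggregate away one dimension by summing over it (dimension reduction)."""
    i = DIMS.index(drop)
    totals = defaultdict(int)
    for key, qty in cells.items():
        totals[key[:i] + key[i + 1:]] += qty
    return dict(totals)


# Simple query: a single cell in the cube.
single_cell = cube[("widget", "Dallas", "Sep")]

# Slice: a subcube selected on one dimension.
sept_slice = select(cube, month={"Sep"})

# Dice: a subcube selected on two dimensions.
diced = select(cube, product={"widget"}, city={"Dallas"})

# Roll up: drop the month dimension, leaving totals per (product, city).
by_product_city = roll_up(cube, "month")

# Drill down: from one rolled-up total back to the detailed cells behind it.
detail = select(cube, product={"widget"}, city={"Dallas"})
```

The rolled-up total for widget in Dallas equals the sum of the monthly cells that drill down exposes, which is the sense in which the two operations are inverses. Note that `roll_up` returns shorter keys, so `select` as written applies only to the full three-dimensional cube.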
Relationship Between Topics
Decision Support Systems
Tools and computer systems that assist
management in decision making
“What if” types of questions
High level decisions
Data warehouse – data which supports
DSS
Starflake
Fig 2 [4]
Hierarchy of Data Cubes
Fig 4 [4]
Unified Dimensional Model
Microsoft Cube View
SQL Server 2005
http://msdn2.microsoft.com/en-us/library/ms345143.aspx
http://cwebbbi.spaces.live.com/Blog/cns!1pi7ETChsJ1un_2s41jm9Iyg!325.entry
MDX AS2005
http://msdn2.microsoft.com/en-us/library/aa216767(SQL.80).aspx
Bibliography
[1] Anne-Muriel Arigon, Anne Tchounikine, and Maryvonne Miquel, “Handling
Multiple Points of View in a Multimedia Data Warehouse,” ACM Transactions on
Multimedia Computing, Communications and Applications, 2(3), August
2006, pp. 199–218.
[2] S. Nicholson, “The Bibliomining Process: Data Warehousing and Data Mining
for Library Decision-Making,” Information Technology and Libraries, 22(4),
2003.
[3] S. Nicholson, “The Basis for Bibliomining: Frameworks for Bringing Together
Usage-Based Data Mining and Bibliometrics through Data Warehousing in
Digital Library Services,” Information Processing & Management, 42(3), May
2006, pp. 785–804.
[4] Jane You, Tharam Dillon, James Liu, and Edwige Pissaloux, “On Hierarchical
Multimedia Information Retrieval,” Proceedings of the 2001
International Conference on Image Processing, 7–10 Oct 2001, pp. 729–732.
[5] Torsten Priebe and Gunther Pernul, “Ontology-based Integration of OLAP and
Information Retrieval,” Proceedings of the 14th International Workshop on
Database and Expert Systems Applications, 2003.