Data Warehouse & Data Mining
Download
Report
Transcript Data Warehouse & Data Mining
12
Chapter 12
Data Warehouse & Data Mining
Database Systems:
Design, Implementation, and Management,
Sixth Edition, Rob and Coronel
1
12
In this chapter, you will learn:
• How operational data and decision support
data differ
• What a data warehouse is and how data for it
are prepared
• What star schemas are and how they are
constructed
• What steps are required to successfully
implement a data warehouse
• What data mining is and what role it plays in
decision support
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
2
12
The Need for Data Analysis
• Managers must be able to track daily
transactions to evaluate how the business is
performing
• By tapping into the operational database,
management can develop strategies to meet
organizational goals
• Data analysis can provide information about
short-term tactical evaluations and strategies
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
3
12
Solving Business Problems and Adding
Value with Data Warehouse-Based Solutions
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
4
12
Solving Business Problems and Adding Value
with Data Warehouse-Based Solutions (continued)
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
5
12
Decision Support Systems
• Methodology (or series of methodologies)
designed to extract information from data and to
use such information as a basis for decision
making
• Decision support system (DSS):
– Arrangement of computerized tools used to assist
managerial decision making within a business
– Usually requires extensive data “massaging” to
produce information
– Used at all levels within an organization
– Often tailored to focus on specific business areas
– Provides ad hoc query tools to retrieve data and
to display data in different formats
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
6
12
Decision Support Systems (continued)
• Composed of four main components:
– Data store component
• Basically a DSS database
– Data extraction and filtering component
• Used to extract and validate data taken from
operational database and external data sources
– End-user query tool
• Used to create queries that access database
– End-user presentation tool
• Used to organize and present data
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
7
12
Main Components of a
Decision Support System (DSS)
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
8
12
Transforming Operational Data Into
Decision Support Data
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
9
12
Contrasting Operational and DSS Data
Characteristics
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
10
12
Ten-Year Sales History for a Single
Department, in Millions of Dollars
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
11
12
Yearly Sales Summaries, Two Stores and Two
Departments per Store,
in Millions of Dollars
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
12
12
The Data Warehouse
• Integrated, subject-oriented, time-variant,
nonvolatile database that provides support for
decision making
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
13
12
A Comparison of Data Warehouse and
Operational Database Characteristics
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
14
12
Creating a Data Warehouse
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
15
12
Online Analytical Processing
• Advanced data analysis environment that
supports decision making, business
modeling, and operations research
• OLAP systems share four main
characteristics:
– Use multidimensional data analysis
techniques
– Provide advanced database support
– Provide easy-to-use end-user interfaces
– Support client/server architecture
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
16
12
Operational vs. Multidimensional
View of Sales
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
17
12
Integration of OLAP
with a Spreadsheet Program
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
18
12
OLAP Client/Server Architecture
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
19
12
OLAP Server Arrangement
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
20
12
OLAP Server with Multidimensional Data
Store Arrangement
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
21
12
OLAP Server With Local Mini Data Marts
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
22
12
Bitmap Representation of REGION
Values
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
23
12
Typical ROLAP Client/Server Architecture
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
24
12
MOLAP Client/Server Architecture
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
25
12
Relational vs. Multidimensional OLAP
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
26
12
Star Schemas
• Data modeling technique used to map
multidimensional decision support data into a
relational database
• Creates the near equivalent of a
multidimensional database schema from the
existing relational database
• Yield an easily implemented model for
multidimensional data analysis, while still
preserving the relational structures on which
the operational database is built
• Has four components: facts, dimensions,
attributes, and attribute hierarchies
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
27
12
Simple Star Schema
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
28
12
Possible Attributes for Sales Dimensions
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
29
12
Three-Dimensional View of Sales
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
30
12
Slice and Dice View of Sales
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
31
12
Location Attribute Hierarchy
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
32
12
Attribute Hierarchies In Multidimensional
Analysis
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
33
12
Star Schema for Sales
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
34
12
Orders Star Schema
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
35
12
Normalized Dimension tables
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
36
12
Multiple Fact Tables
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
37
12
Implementing a Data Warehouse
• Numerous constraints:
– Available funding
– Management’s view of the role played by an
IS department and of the extent and depth of
the information requirements
– Corporate culture
• No single formula can describe perfect data
warehouse development
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
38
12
Factors Common to Data Warehousing
• Data warehouse is not a static database
• Dynamic framework for decision support that
is always a work in progress
• Data warehouse data cross departmental
lines and geographical boundaries
• Must satisfy:
– Data integration and loading criteria
– Data analysis capabilities with acceptable
query performance
– End-user data analysis needs
• Apply database design procedures
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
39
12
Data Warehouse
Implementation Road Map
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
40
12
Data Mining
• Tools that:
– analyze data
– uncover problems or opportunities hidden in
data relationships,
– form computer models based on their findings,
and then
– use the models to predict business behavior
• Require minimal end-user intervention
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
41
12
Extraction of Knowledge From Data
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
42
12
Data-Mining Phases
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
43
12
A Sample of Current Data Warehousing
and Data-Mining Vendors
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
44
12
Summary
• Data analysis is used to derive and interpret
information from data
• Decision support is a methodology designed to
extract information from data and to use such
information as a basis for decision making
• Decision support system is an arrangement of
computerized tools used to assist managerial
decision making within a business
• Data warehouse is an integrated, subjectoriented, time-variant, nonvolatile database that
provides support for decision making
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
45
12
Summary (continued)
• Online analytical processing is an advanced
data analysis environment that supports
decision making, business modeling, and
operations research
• Star schema is a data-modeling technique used
to map multidimensional decision support data
into a relational database
• The implementation of any company-wide
information system is subject to conflicting
organizational and behavioral factors
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
46
12
Summary (continued)
• Data mining automates analysis of operational
data with the intention of finding previously
unknown data characteristics, relationships,
dependencies, and/or trends
• Data warehouse is storage location for decision
support data
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
47