Transcript Chapter 13

13
Chapter 13
The Data Warehouse
Database Systems:
Design, Implementation, and Management,
Seventh Edition, Rob and Coronel
1
13
In this chapter, you will learn:
• How operational data and decision support
data differ
• What a data warehouse is, how data for it are
prepared, and how it is implemented
• What star schemas are and how they are
constructed
• What data mining is and what role it plays in
decision support
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
2
13
In this chapter, you will learn (continued):
• What online analytical processing (OLAP) is
• How SQL extensions are used to support
OLAP-type data manipulations
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
3
13
The Need for Data Analysis
• Managers must be able to track daily
transactions to evaluate how the business is
performing
• By tapping into operational database,
management can develop strategies to meet
organizational goals
• Data analysis can provide information about
short-term tactical evaluations and strategies
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
4
13
The Need for Data Analysis (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
5
13
The Need for Data Analysis (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
6
13
Decision Support Systems
• Decision support is methodology (or series of
methodologies) designed to extract information from
data and to use such information as a basis for
decision making
• Decision support system (DSS)
– Arrangement of computerized tools used to assist
managerial decision making within business
– Usually requires extensive data “massaging” to
produce information
– Used at all levels within organization
– Often tailored to focus on specific business areas
– Provides ad hoc query tools to retrieve data and to
display data in different formats
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
7
13
Decision Support Systems (continued)
• Composed of following four main components:
– Data store component
• Basically a DSS database
– Data extraction and data filtering component
• Used to extract and validate data taken from operational
database and external data sources
– End-user query tool
• Used to create queries that access database
– End-user presentation tool
• Used to organize and present data
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
8
13
Decision Support Systems (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
9
13
Operational Data vs.
Decision Support Data
• Operational Data
– Mostly stored in relational database
– Optimized to support transactions representing daily
operations
• DSS Data
– Give tactical and strategic business meaning to
operational data
– Differs from operational data in following three main
areas:
• Timespan
• Granularity
• Dimensionality
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
10
13
Operational Data vs.
Decision Support Data (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
11
13
Operational Data vs. Decision Support
Data (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
12
13
DSS Database Requirements
• A specialized DBMS tailored to provide fast
answers to complex queries.
• Four main requirements:
–
–
–
–
Database schema
Data extraction and loading
End-user analytical interface
Database size
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
13
13
DSS Database Requirements (continued)
• Database schema
– Must support complex data representations
– Must contain aggregated and summarized
data
– Queries must be able to extract
multidimensional time slices
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
14
13
DSS Database Requirements (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
15
13
DSS Database Requirements (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
16
13
DSS Database Requirements (continued)
• Data extraction
– Should allow batch and scheduled data
extraction
– Should support different data sources
• Flat files
• Hierarchical, network, and relational databases
• Multiple vendors
• Data filtering
– Must allow checking for inconsistent data
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
17
13
DSS Database Requirements (continued)
• End-user analytical interface
– One of most critical DSS DBMS components
– Permits user to navigate through data to
simplify and accelerate decision-making
process
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
18
13
DSS Database Requirements (continued)
• Database size
– In 2005, Wal-Mart had 260 terabytes of data in
its data warehouses
– DBMS must support very large databases
(VLDBs)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
19
13
The Data Warehouse
• Integrated, subject-oriented, time-variant,
nonvolatile collection of data that provides
support for decision making
• Usually a read-only database optimized for
data analysis and query processing
• Requires time, money, and considerable
managerial effort to create
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
20
13
The Data Warehouse (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
21
13
The Data Warehouse (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
22
13
The Data Warehouse (continued)
• Data mart
– Small, single-subject data warehouse subset
– Each is more manageable data set than data
warehouse
– Provides decision support to small group of
people
– Typically lower cost and lower implementation
time than data warehouse
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
23
13
DSS Architectural Styles
• Several are available
• Provide advanced decision support features
• Some capable of providing access to
multidimensional data analysis
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
24
13
DSS Architectural Styles (continued)
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
25
13
Twelve Rules that Define
a Data Warehouse
• Data warehouse and operational environments are
separated
• Data warehouse data are integrated
• Data warehouse contains historical data over long
time horizon
• Data warehouse data are snapshot data captured at
given point in time
• Data warehouse data are subject oriented
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
26
13
Twelve Rules that Define
a Data Warehouse (continued)
• Data warehouse data are mainly read-only with
periodic batch updates from operational data
– No online updates allowed
• Data warehouse development life cycle differs from
classical systems development
• Data warehouse contains data with several levels of
detail: current detail data, old detail data, lightly
summarized data, and highly summarized data
• Data warehouse environment is characterized by
read-only transactions to very large data sets
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
27
13
Twelve Rules that Define
a Data Warehouse (continued)
• Data warehouse environment has system that traces
data sources, transformations, and storage
• Data warehouse’s metadata are critical component of
this environment
• Data warehouse contains chargeback mechanism for
resource usage that enforces optimal use of data by
end users
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
28
13
Online Analytical Processing
• Advanced data analysis environment that
supports decision making, business modeling,
and operations research
• OLAP systems share four main characteristics:
–
–
–
–
Use multidimensional data analysis techniques
Provide advanced database support
Provide easy-to-use end-user interfaces
Support client/server architecture
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
29
13
Multidimensional Data Analysis
Techniques
• Data are processed and viewed as part of a
multidimensional structure
• Particularly attractive to business decision
makers
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
30
13
Multidimensional Data Analysis
Techniques (continued)
• Augmented by following functions:
– Advanced data presentation functions
– Advanced data aggregation, consolidation and
classification functions
– Advanced computational functions
– Advanced data modeling functions
Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel
31