Module 1: Introduction to Data Warehousing
Download
Report
Transcript Module 1: Introduction to Data Warehousing
DW-1: Introduction to
Data Warehousing
Overview
What is Database
What Is Data Warehousing
Data Marts and Data Warehouses
The Data Warehousing Process
Data in a Data Warehouse
What Is Database
Before
Now
Program = Algorithm + Data Structure
Application (Weblication) = Visual I/F + SQL Query +
Database
Database is Integrated Data
from multiple file system data for OLTP
Data Base (From Air Base?), DB, 데이타베이스,
자료기지(북한)
Database and Data Model
Computer Representation of Data for efficient
understanding and processing
Data Model based on Relationship modeling
Relationship between record
one-to-one(1:1), one-to-many(1:N), many-to-many(N:M)
Hierarhical Model: Hierarchical Relationship, 1:N
Network model: Network like relationship, N:M
Relational Model: Use relation (table) for Relationship
Object-Oriented data model: Complex object modeling
SET type, Reference, List
What Is Data Warehousing
Defining Data Warehousing
Operational Systems: A Transactional Solution
Analytical Systems: A Data Warehousing Solution
Comparing Transactional and Data Warehousing
Solutions
Defining Data Warehousing
Business Intelligence
Database Marketing: Personalized Product
Especially S/W, Cocoon business etc.
Electronic Commerce
Data Warehouse: 자료 창고
for OLAP, Data Mining, DSS
Knowledge Management
Data Warehousing: Process to build Data Warehouse
Defining Data Warehousing
A Data Warehouse Is a Database That Contains:
Enterprise data
Integrated sets of historical data
Subject-oriented, consolidated, consistent data
Data structured for distribution and querying
A Data Warehousing Solution Is a Process That:
Retrieves and transforms data
Manages the database
Uses tools for building and managing the data warehouse
Operational Systems: A Transactional Solution
Track Individual Events
Used for Real-time Data Entry and Editing
Examples:
Order-tracking applications
Customer service applications
Point-of-sale applications
Service-based sales applications
Banking functions
Analytical Systems: A Data Warehousing Solution
Assist with Strategic Decision Support
Provide Different Levels of Analysis
Allow Users to Navigate to Different Levels of Data
Allow System Searches to Find New Relationships
Examples:
Spreadsheet-based applications
Sales forecasting applications
Comparing Transactional and Data Warehousing
Solutions
Transactional
solutions
Data warehousing
solutions
Update frequency Real-time
Periodically
Structured for
Data integrity
Ease in querying
Optimized for
Transaction performance
Query performance
Data Marts and Data Warehouses
What Is a Data Mart
Moving Data from a Data Warehouse to Data Marts
Moving Data from Data Marts to a Data Warehouse
What Is a Data Mart
What Is a Data Mart
A subset of a data warehouse
Used in an enterprise
Specific to a particular subject or business activity
Why Build Data Marts
Faster queries and fewer users
Faster deployment time
Integrated Data Marts
Ensure consistent data
Require advance planning
Moving Data From a Data Warehouse to Data Marts
Sales Mart
Source 1
Source 2
Data
Warehouse
Source 3
Advantages
Shared fields
Common source
Distributed processing
Disadvantages
Longer time to develop
Financial Mart
Customer
Service Mart
Moving Data from Data Marts to a Data Warehouse
Source 1
Sales Mart
Source 2
Financial Mart
Source 3
Advantages
Customer Service
Mart
Simpler and faster to implement
Department-specific data
Smaller hardware requirements
Disadvantages
Data duplication
Incompatible data marts
Data
Warehouse
The Data Warehousing Process
Basic Elements of the Process
Tools to Manage the Process
Basic Elements of the Process
Source OLTP
Systems
Data Marts
Clients
Data
Warehouse
1
Retrieve Data
2 Transform Data
3
Populate
Data Warehouse
4
Populate
Data Marts
5
Query
the Data
Tools to Manage the Process
SQL Server
Data Transformation Services
SQL Server OLAP Services
Microsoft Repository
Microsoft English Query
PivotTable Service
ETL process
Extraction, Transformation, Loading
Extraction: 추출
Transformation: 변환
Data modification, sorting, calculation etc
Loading: 적재
Data retrieval from existing data source such as File,
Table etc.
Bulk, incremental loading from operational DB
Time consuming process: may use special H/W
Data in a Data Warehouse
Data Characteristics
Example of Organizing Data
Data Characteristics
Data characteristic
Description
Consolidated
Enterprise-wide
Consistent
Within the data warehouse
Subject-oriented
Organized to user perspective
Historical
Snapshots over time
Read-only
Cannot update
Summarized
To appropriate level of detail
Example of Organizing Data
Monthly Southeast Regional Sales Report - May 1999
State
City
Units Sold
Sales $
FL
Miami
2,500
$12,850
FL
Tampa
2,750
$14,135
5,250
$26,985
FL Totals
GA
Atlanta
3,200
$16,800
GA
Savannah
1,725
$ 9,143
4,925
$25,943
1,900
$ 9,595
1,900
$ 9,595
12,075
$62,473
GA Totals
SC
Columbia
SC Totals
Southeast Region
Total
Data Warehouse Schema Example: Star schema
A Example of Cube Browsing
1 Fact with 4 Dimension Table
-- Sales_Fact, Product, Store, Time, Customer
Drilling Down
Drilling Down to products
Drilling Down
Drilling Down to the lowest level of Customer Dimension
Rolling up
Rolling up
Review
What Is Data Warehousing
Data Marts and Data Warehouses
The Data Warehousing Process
Data in a Data Warehouse
Data Warehouse will be more popular than DB?