Module 1: Introduction to Data Warehousing

Download Report

Transcript Module 1: Introduction to Data Warehousing

DW-1: Introduction to
Data Warehousing
Overview

What is Database

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse
 What Is Database

Before


Now


Program = Algorithm + Data Structure
Application (Weblication) = Visual I/F + SQL Query +
Database
Database is Integrated Data


from multiple file system data for OLTP
Data Base (From Air Base?), DB, 데이타베이스,
자료기지(북한)
Database and Data Model

Computer Representation of Data for efficient
understanding and processing

Data Model based on Relationship modeling

Relationship between record
one-to-one(1:1), one-to-many(1:N), many-to-many(N:M)

Hierarhical Model: Hierarchical Relationship, 1:N

Network model: Network like relationship, N:M

Relational Model: Use relation (table) for Relationship

Object-Oriented data model: Complex object modeling
SET type, Reference, List
 What Is Data Warehousing

Defining Data Warehousing

Operational Systems: A Transactional Solution

Analytical Systems: A Data Warehousing Solution

Comparing Transactional and Data Warehousing
Solutions
Defining Data Warehousing

Business Intelligence




Database Marketing: Personalized Product
Especially S/W, Cocoon business etc.
Electronic Commerce
Data Warehouse: 자료 창고

for OLAP, Data Mining, DSS

Knowledge Management
Data Warehousing: Process to build Data Warehouse
Defining Data Warehousing


A Data Warehouse Is a Database That Contains:

Enterprise data

Integrated sets of historical data

Subject-oriented, consolidated, consistent data

Data structured for distribution and querying
A Data Warehousing Solution Is a Process That:

Retrieves and transforms data

Manages the database

Uses tools for building and managing the data warehouse
Operational Systems: A Transactional Solution

Track Individual Events

Used for Real-time Data Entry and Editing

Examples:

Order-tracking applications

Customer service applications

Point-of-sale applications

Service-based sales applications

Banking functions
Analytical Systems: A Data Warehousing Solution

Assist with Strategic Decision Support

Provide Different Levels of Analysis

Allow Users to Navigate to Different Levels of Data

Allow System Searches to Find New Relationships

Examples:

Spreadsheet-based applications

Sales forecasting applications
Comparing Transactional and Data Warehousing
Solutions
Transactional
solutions
Data warehousing
solutions
Update frequency Real-time
Periodically
Structured for
Data integrity
Ease in querying
Optimized for
Transaction performance
Query performance
 Data Marts and Data Warehouses

What Is a Data Mart

Moving Data from a Data Warehouse to Data Marts

Moving Data from Data Marts to a Data Warehouse
What Is a Data Mart



What Is a Data Mart

A subset of a data warehouse

Used in an enterprise

Specific to a particular subject or business activity
Why Build Data Marts

Faster queries and fewer users

Faster deployment time
Integrated Data Marts

Ensure consistent data

Require advance planning
Moving Data From a Data Warehouse to Data Marts
Sales Mart
Source 1
Source 2
Data
Warehouse
Source 3


Advantages

Shared fields

Common source

Distributed processing
Disadvantages

Longer time to develop
Financial Mart
Customer
Service Mart
Moving Data from Data Marts to a Data Warehouse
Source 1
Sales Mart
Source 2
Financial Mart
Source 3


Advantages
Customer Service
Mart

Simpler and faster to implement

Department-specific data

Smaller hardware requirements
Disadvantages

Data duplication

Incompatible data marts
Data
Warehouse
The Data Warehousing Process

Basic Elements of the Process

Tools to Manage the Process
Basic Elements of the Process
Source OLTP
Systems
Data Marts
Clients
Data
Warehouse
1
Retrieve Data
2 Transform Data
3
Populate
Data Warehouse
4
Populate
Data Marts
5
Query
the Data
Tools to Manage the Process

SQL Server

Data Transformation Services

SQL Server OLAP Services

Microsoft Repository

Microsoft English Query

PivotTable Service
ETL process

Extraction, Transformation, Loading

Extraction: 추출


Transformation: 변환


Data modification, sorting, calculation etc
Loading: 적재


Data retrieval from existing data source such as File,
Table etc.
Bulk, incremental loading from operational DB
Time consuming process: may use special H/W
Data in a Data Warehouse

Data Characteristics

Example of Organizing Data
Data Characteristics
Data characteristic
Description
Consolidated
Enterprise-wide
Consistent
Within the data warehouse
Subject-oriented
Organized to user perspective
Historical
Snapshots over time
Read-only
Cannot update
Summarized
To appropriate level of detail
Example of Organizing Data
Monthly Southeast Regional Sales Report - May 1999
State
City
Units Sold
Sales $
FL
Miami
2,500
$12,850
FL
Tampa
2,750
$14,135
5,250
$26,985
FL Totals
GA
Atlanta
3,200
$16,800
GA
Savannah
1,725
$ 9,143
4,925
$25,943
1,900
$ 9,595
1,900
$ 9,595
12,075
$62,473
GA Totals
SC
Columbia
SC Totals
Southeast Region
Total
Data Warehouse Schema Example: Star schema
A Example of Cube Browsing

1 Fact with 4 Dimension Table
-- Sales_Fact, Product, Store, Time, Customer
Drilling Down

Drilling Down to products
Drilling Down

Drilling Down to the lowest level of Customer Dimension
Rolling up

Rolling up
Review

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse

Data Warehouse will be more popular than DB?