Data Warehousing - The Nargundkar Web Site

Data Warehousing
An Overview
Outline
• What is Data Warehousing? (Definition)
• Why does anyone need it? (Applications)
• How is the data organized? (Star Schema)
• Implementation Issues
Data Warehouse Definitions
• Dyché: Used for decision making; duplicates
existing data. A combination of hardware,
specialized software, and data extracted from
other corporate systems.
• Inmon: A subject-oriented, integrated, nonvolatile, and time-variant collection of data in
support of management decisions.
Why Warehouse?
• Provide single view of customers across
enterprise
• Improve turnaround time for common
reports
• Monitor customer behavior
• Predict future purchases
• Improve responsiveness to business issues
Coca-Cola & IBM
• IBM is helping Coca-Cola with its data warehouse.
• Helps Coca-Cola deal with global companies like
McDonald's: support for negotiating global
contracts.
Financial Services Example – Credit Life Cycle
[Diagram of the credit life cycle stages: Product Planning, Customer Acquisition, Customer Management, Collections]
Customer Acquisition
Support for Marketing
• Market Segmentation
Plus Forecasts with:
• Response Models
• Risk / Bankruptcy Models
• Profitability Models
Customer Management
Who gets a credit increase?
Which delinquent customers are likely to default?
What do you do (call, send a letter, do nothing)?
Decision Support:
Forecast Customer Behavior
(Behavior Models)
Collections/Recovery
What is the likelihood of recovering money from an
account sent to collections?
Decision Support:
Collections models
Other Questions
• How can we reduce attrition?
• How can we activate inactive accounts?
• How well are my current strategies
performing?
• How do we detect Fraud?
Where is the data?
• Transaction Systems
• Marketing Database
• Credit Reports
• Customer Service
How is it Organized?
• Separate from transactional data
• Contains historical data
• Generally aggregated to some extent
• Optimized for flexible querying of large
volumes of data
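To make "aggregated to some extent" concrete, here is a minimal sketch that rolls individual transactions up into one total per customer per month, the kind of summary row a warehouse might store in place of raw transaction detail. The function name, the tuple layout, and the sample values are hypothetical illustrations, not taken from the slides.

```python
from collections import defaultdict


def monthly_totals(transactions):
    """Sum transaction amounts per (customer_id, 'YYYY-MM') pair.

    `transactions` is an iterable of (customer_id, iso_date, amount)
    tuples, where iso_date is a 'YYYY-MM-DD' string. This layout is
    an assumption made for the example.
    """
    totals = defaultdict(float)
    for customer_id, iso_date, amount in transactions:
        month = iso_date[:7]  # keep only the 'YYYY-MM' part
        totals[(customer_id, month)] += amount
    return dict(totals)


# Hypothetical sample rows as they might arrive from a transaction system.
sample = [
    ("C1", "2001-01-05", 20.0),
    ("C1", "2001-01-20", 30.0),
    ("C1", "2001-02-02", 10.0),
]
summary = monthly_totals(sample)
```

The summary is smaller than the source data and no longer answers per-transaction questions, which is the trade-off that aggregation implies.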
Star Schema
• Fact table plus several dimension tables
• Denormalized
• Less flexible than normalized tables
• Faster retrieval than normalized tables for
large volumes of data
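The layout described above can be sketched with Python's built-in sqlite3 module: one fact table of sales surrounded by customer, product, and date dimension tables. All table names, column names, and rows are hypothetical and chosen only for illustration; a production warehouse would use a dedicated database engine and far larger tables.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema and load a few hypothetical rows."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_id INTEGER PRIMARY KEY, name TEXT, segment TEXT);
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        -- The fact table holds the measure (amount) plus one key
        -- pointing at each dimension table.
        CREATE TABLE fact_sales (
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            product_id  INTEGER REFERENCES dim_product(product_id),
            date_id     INTEGER REFERENCES dim_date(date_id),
            amount      REAL);
    """)
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                     [(1, "Alice", "Retail"), (2, "Bob", "Corporate")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Card A"), (2, "Card B")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2001, 1), (2, 2001, 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 100.0), (1, 2, 2, 50.0), (2, 1, 1, 50.0)])


def sales_by_segment(conn):
    """Total sales per customer segment: one join from fact to dimension."""
    return conn.execute("""
        SELECT c.segment, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON f.customer_id = c.customer_id
        GROUP BY c.segment
        ORDER BY c.segment
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
segment_totals = sales_by_segment(conn)
```

Each query joins the fact table directly to the dimensions it needs, so typical reports touch only a few tables rather than a long chain of normalized ones.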
Implementation
• Start with the Business Issues
• Project Planning / Human Resources
• Database design / data sources
• Application Development
Business Analysis
• What is the problem?
• Who owns the problem?
• Will data help solve it?
When can data be used to Predict?
[Figure: 2x2 grid of market types. One axis is labeled Randomness; both axes run from Low to High. Quadrants: Chaotic Markets (fashion driven), Real-Time Markets (stock market), Linear Markets (local authority, number of trash cans), Statistical Markets (retail).]
Source: www.butlergroup.com
Also read the article in Wired magazine on data mining and terrorism.