Data Warehousing Concept


Topics
Data Warehousing Concept
Data Access Technology
Enterprise Real-Time Knowledge
Architecture for Data Warehousing
Data Collection and Delivery
Benson & Parker’s “Square Wheel”
[Diagram, built up over four slides. Labels shown:]
Technology Environment
Business Environment
Business Planning
Technology Planning
Business Operations
Technology Operations
Impact
Organization
Opportunity
Alignment
Information Technology has to do more than just align itself with
the business; it has to help the business have the maximum impact
in the marketplace.
M. Anvari
Data Access and Delivery System
Technology Evolution
• New classes of computers
• New classes of communications
• New classes of technology (image, sound, video, multimedia)
• New classes of software
• Much more complex technical environment
  o Cooperative Processing/Client-Server
  o Distributed Data Bases
  o LANs, WANs, etc.
• Obsolescence Problem
  o Multiple Legacy Systems
IT Impact on Business
Enterprise Network Computing and Client/Server Technology are
changing the way organizations look at all of their information systems
HP
Compaq
DEC
IBM
The Existing Enterprise
• Support Existing Products
• Support Existing Customers
• Support Existing Organization
• Support Existing Workforce
• Support Existing Technology
Controlling the (Global) Real-time Organization
RTO = 24 x 7 x E
(where E means every major market)
Information and the Enterprise
Organizational needs for data
Organizational needs for information
Organizational needs for knowledge
Needs for Data
Data = Values (Measurements)
• Data to operate
• Data to control
• Data to plan
Needs for Information
Information = Content + Structure (Relationships)
• Structure of the Real-world
• Relating data to the business
  o Cross functional processes
• Relating data to the real world
  o External DB
  o External Data Feeds (D&B, Reuters, etc.)
  o Text, Image, Voice, Video, etc.
  o Statistical Studies
Needs for Knowledge
Knowledge = Goals + Actions + Learning
• Learning more about our business
• Learning more about our market
• Learning more about the business environment
Knowledge is the area in which Data Warehousing and Data Mining
are potentially critical technologies.
Data, Information and Knowledge
• Data Centers
• Data Bases
• Information Centers
• Information Bases
• Knowledge Centers
• Knowledge Bases
Old Data Never Dies
60s
70s
80s
90s
Batch
On-line
Minis
PCs
Networking
Enterprise Computing
(Peer to Peer, Network to Network)
Note that none of the early computing styles have ever gone away!!!
Operational vs. Informational Systems
Information Access Today
[Diagram, built up over several slides. Components shown: Operational
Systems (Mfg., Order Entry); Informational Systems (Estimating &
Analysis, Marketing Systems, Product Planning); Information Delivery
System; Data Warehouse; Data Marts; Data Garages; External Data;
External Users]
Data Warehousing is fundamentally an issue of Enterprise Data
Architecture.
End User Evolution
• Data Base Management Systems users
• Ad Hoc Reports users
• Today’s Customer Demands Automated Real-Time Response.
• End User Systems
  o Decision Support Systems
  o Executive Information Systems
  o Information Centers
Ways to Organize Data
• Tables: Flexible, Simple
• Hierarchies: Speed, Natural Reporting
• Networks: Multiple Directions, Complex Structure
• Lists: Updating Complex Structure
• Matrices / Arrays: Manipulate Multiple Dimensions
• Inverted Files: Unplanned queries, text retrieval
• Objects: Complex structures, hide structure
• Multidimensional Data Bases (Data Warehousing)
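As a minimal sketch of one of these organizations, the inverted file, the Python below maps each word to the set of records that contain it, which is what makes unplanned queries and text retrieval cheap. The sample records and the names build_inverted_index and search are illustrative assumptions, not part of the original slides.

```python
from collections import defaultdict

# Hypothetical sample records (record id -> free text); not from the slides.
RECORDS = {
    1: "late shipment of steel parts",
    2: "steel price increase from supplier",
    3: "customer complaint about late delivery",
}


def build_inverted_index(records):
    """Map each lowercase word to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return index


def search(index, *words):
    """Return the sorted ids of records containing every given word."""
    if not words:
        return []
    matches = set(index.get(words[0].lower(), set()))
    for word in words[1:]:
        matches &= index.get(word.lower(), set())
    return sorted(matches)


INDEX = build_inverted_index(RECORDS)
print(search(INDEX, "steel"))          # records mentioning "steel"
print(search(INDEX, "late", "steel"))  # records mentioning both words
```

Because the index is keyed by word rather than by record, a query nobody planned for needs no new access path: it only intersects the sets already stored.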
End User Computing Evolution
Tool or Technique               Strengths
File Access Systems             Physical access to data
Network DB                      Support for complex data interrelations
Hierarchical DB                 Support for hierarchical views of data
Inverted File DB                Support for unplanned inquiry, esp. text
Relational DB                   Flexibility and ease of updating
Report Generator                Support for simple ad hoc reporting
Query Language                  Support for simple ad hoc inquiries
4GL                             Ability to develop simple systems easily
Decision Support System         Ability to support financial and statistical data analysis
Executive Information System    Ability to present information to executives
Information Center              Support for end users trying to access enterprise information
Data Warehousing
A Data Warehouse can be thought of as an automated version of the
Information Center that was widely popular in the mid-1980s, or
even ultimately as the automation of Information Resource
Management. While technologies such as client-server have begun
to put enormous computing and graphics power in the hands of
individuals, these technologies have not, in general, provided
the link to the operational data that end users need to make
critical business decisions.
Data Warehouse Requirements
Support for Universal Access to Multi-platform Data Bases
Support for Multiple User Types
Separation of Operational and Informational Concerns
Support for Networked Data
Support for Directories, Repositories and Information Models
Support for Advanced End User Interfaces
Access to Heterogeneous Data
HP
Compaq
DEC
IBM
Multiple User Types (Knowledge workers)
• Top Executives
• Managers
• Analysts
• Planners
• Product Developers
• Consultants
• Lawyers
• etc.
Separation of Operational and Informational Concerns
• Operational Systems
  o Response Time
  o Reliability
  o Security
  o Recoverability
• Informational Systems
  o Flexibility, Performance, Ease of Navigation
  o Large numbers of different views
  o Manage Huge Amounts of Data (VLDBs)
  o Need to drill down/drill thru into data
  o Need to draw on data from many sources
Support for Networked Data
All the data required to support informational needs is often
not in the same operational data base. The data needed for Labor
Negotiations, for example, may come from a variety of operational
data bases, such as Manufacturing, Personnel, and Accounting.
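The Labor Negotiations example can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: three attached in-memory databases stand in for the separate Manufacturing, Personnel, and Accounting systems, and every table name, column name, and value is hypothetical.

```python
import sqlite3

# One connection with three attached in-memory databases stands in for
# three separate operational systems. All names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS manufacturing")
conn.execute("ATTACH DATABASE ':memory:' AS personnel")
conn.execute("ATTACH DATABASE ':memory:' AS accounting")

conn.execute("CREATE TABLE manufacturing.output (plant TEXT, units INTEGER)")
conn.execute("CREATE TABLE personnel.staff (plant TEXT, headcount INTEGER)")
conn.execute("CREATE TABLE accounting.payroll (plant TEXT, cost REAL)")

conn.executemany("INSERT INTO manufacturing.output VALUES (?, ?)",
                 [("Akron", 1200), ("Dayton", 900)])
conn.executemany("INSERT INTO personnel.staff VALUES (?, ?)",
                 [("Akron", 40), ("Dayton", 30)])
conn.executemany("INSERT INTO accounting.payroll VALUES (?, ?)",
                 [("Akron", 200000.0), ("Dayton", 120000.0)])

# A single informational query that draws on all three sources.
QUERY = """
    SELECT o.plant, o.units, s.headcount, p.cost,
           p.cost / s.headcount AS cost_per_employee
    FROM manufacturing.output AS o
    JOIN personnel.staff AS s ON s.plant = o.plant
    JOIN accounting.payroll AS p ON p.plant = o.plant
    ORDER BY o.plant
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In a real environment the three sources would sit on different platforms, so the join would happen in a data access or staging layer rather than inside one database engine.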
Distributed Systems
Support for Advanced End User Interfaces
Dimensions of Data Warehousing
Performance
Security
Connection to the Operational Data
Ease of Use
Flexibility
Distributed Data
Quality
Scalability
Enterprise Knowledge Architecture for Data Warehousing
Enterprise Network Computer Architecture
[Architecture diagram. Components shown: Operational DBs, External
DBs, Data Access, Data Staging, Data Warehouse, Data Mart,
Information Access, Data Directory (Repository), Application
Messaging, Data Directory Functions, Process Management]
Freeing the “Data in Jail”
The Information Access Layer
The Legacy Data Layer
The External Data Layer
The Data Access Layer
[Diagram builds add, in turn: Data Access Filter; SQL Queries; SQL Answers]
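The SQL Queries / SQL Answers exchange shown for the Data Access Layer can be sketched as a thin function that sends a query to a database and hands back the answer in a uniform shape. This is a minimal illustration using Python's built-in sqlite3 module; the function name run_query and the orders table are assumptions made for the example, not anything defined in the slides.

```python
import sqlite3


def run_query(conn, sql, params=()):
    """Send one SQL query to a database and return the answer as a list of dicts."""
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Hypothetical stand-in for an operational DB.
operational_db = sqlite3.connect(":memory:")
operational_db.execute(
    "CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)"
)
operational_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 250.0), (3, "East", 50.0)],
)

answer = run_query(
    operational_db,
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY region",
)
print(answer)
```

Returning plain dictionaries keeps the information access tools independent of whichever database engine actually answered the query.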
Application Messaging
The Meta-Data Repository Layer
The Process Management Layer
The Core Data Warehouse
Data Staging and Quality
Data Mart (Post-process/Indexing)
Goals of Warehouse
1. Performance (Canned queries, MD Analysis, Ad hoc, Impact on Operational System)
2. Flexibility (MD Flex, Ad hoc, Change data structure)
3. Scalability (No. of Users, Volume of Data)
4. Ease of Use (Location, Formulation, Navigation, Manipulation)
5. Data Quality (Consistent, Correct, Timely, Integrated)
6. Connection to the Detail Business Transactions
Virtual Warehouse
A Virtual Data Warehouse approach is often chosen when there are
infrequent demands for data and management wants to determine
if/how users will use operational data.
One of the weaknesses of a Virtual Data Warehouse approach is that
user queries are made against operational DBs.
One way to minimize this problem is to build a “Query Monitor” to
check the performance characteristics of a query before executing it.
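A “Query Monitor” of the kind described for the Virtual Warehouse could take many forms. As one hedged sketch, the Python below asks SQLite for a query's plan with EXPLAIN QUERY PLAN and refuses to run queries whose plan contains a scan step. The rejection rule, the function names, and the orders table are assumptions for illustration; a production monitor would more likely use cost estimates from the operational DBMS itself.

```python
import sqlite3


def is_query_acceptable(conn, sql, params=()):
    """Return False if SQLite's plan for the query includes a scan step."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is a human-readable detail string;
    # steps that walk a whole table or index begin with "SCAN".
    return not any(row[-1].startswith("SCAN") for row in plan)


def monitored_query(conn, sql, params=()):
    """Run the query only if the monitor accepts its plan."""
    if not is_query_acceptable(conn, sql, params):
        raise RuntimeError("query rejected by monitor: plan contains a scan")
    return conn.execute(sql, params).fetchall()


# Hypothetical operational table with an index on customer_id only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
db.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 99.0), (2, 11, 15.0), (3, 10, 42.0)],
)

indexed = "SELECT amount FROM orders WHERE customer_id = ?"
unindexed = "SELECT order_id FROM orders WHERE amount > ?"
print(is_query_acceptable(db, indexed, (10,)))      # uses the index
print(is_query_acceptable(db, unindexed, (50.0,)))  # must scan the table
```

The check costs one planning step per query, which is small next to the cost of letting an unindexed query run against a busy operational database.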
Distributed Data Warehouse
A Distributed Data Warehouse is similar in most respects to a
Central Data Warehouse, except that the data is distributed to
separate mini-Data Warehouses (Data Marts) on local or specialized
servers.
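Distributing warehouse data to Data Marts can be pictured as partitioning the central rows by a subject key. The sketch below is an assumption-laden illustration: the rows are made up, the function name build_data_marts is hypothetical, and each mart is just an in-memory list rather than a database on its own server.

```python
from collections import defaultdict

# Hypothetical central warehouse rows: (region, product, sales amount).
WAREHOUSE = [
    ("East", "widgets", 100.0),
    ("East", "gears", 40.0),
    ("West", "widgets", 75.0),
]


def build_data_marts(rows, key_index=0):
    """Split warehouse rows into one list per distinct value of the chosen column."""
    marts = defaultdict(list)
    for row in rows:
        marts[row[key_index]].append(row)
    return dict(marts)


MARTS = build_data_marts(WAREHOUSE)
for name, mart_rows in MARTS.items():
    print(name, mart_rows)
```

Passing key_index=1 would build the marts by product instead of by region, which is the kind of choice a specialized server would reflect.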
Information Access Tools
• Desktop DBs
• Spreadsheets
• 4GL/Desktop Query Tools
• Decision Support Systems (DSS)
• Multi-dimensional DBs (MDDs)
• OLAP (On-line Analytical Processing)
• Executive Information Systems (EIS)
• Data Visualization Tools
• Data Mining Tools
• Business Modeling and Simulation Tools
Data Warehousing Tools and Technology
Desktop Data Bases:
• Structured for Database Manipulation
• Provides facility for selecting and loading of Desktop DBs from Informational DBs
• Provides ability to Create Highly “Personalized” Informational Systems
Examples
• Access
• Paradox
• dBase/FoxPro/Clipper
Enterprise Network Computer Architecture
Spreadsheets:
• Structured to get any subset of Information
• Ability to Interface with standard Spreadsheet tools (
Examples
• Excel
• 1-2-3
• Quattro Pro
Enterprise Network Computer Architecture
Ad Hoc Query Systems:
• Tailored for Flexible Reporting
• Ability to do Sophisticated Analysis Functions
• Aimed at a variety of users, from casual to the power user
Examples
• Focus for Windows (IBI)
• SAS
• Business Objects
• GQL (Anadyne)
• Esperant (Software AG)
• Forest & Trees (Platinum)
• Visualizer (IBM)
• Impromptu (Cognos)
• Beacon (Prodea)
Enterprise Network Computer Architecture
Multi-dimensional Databases (MDDB)
OLAP (On-line analytical processing):
• Highly Structured Data
• Tailored for Financial Modeling
• Tailored for “Power Users”
• Ability to do Sophisticated Financial “What-if” Analysis
• Ability to “drill-down” from high-level to Detail Data
Examples
• Acumate (Kenan Tech.)
• Beacon (Prodea)
• CrossTarget (Dimensional Insight)
• eSSbase (Arbor)
• Oracle Express (Oracle)
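The “drill-down” from high-level to detail data that these tools offer can be sketched in a few lines of Python. This is a toy illustration, not how any of the listed products work internally: the fact rows, the dimension names, and the functions roll_up and slice_facts are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, revenue).
FACTS = [
    (1996, "Q1", "East", 100.0),
    (1996, "Q1", "West", 80.0),
    (1996, "Q2", "East", 120.0),
    (1997, "Q1", "East", 90.0),
]
DIMENSIONS = ("year", "quarter", "region")


def roll_up(facts, by):
    """Sum revenue grouped by the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[i] for i in positions)] += row[-1]
    return dict(totals)


def slice_facts(facts, dimension, value):
    """Keep only the rows where one dimension has the given value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in facts if row[position] == value]


# High-level view: revenue per year.
by_year = roll_up(FACTS, ["year"])
# Drill down into 1996: revenue per quarter and region.
detail_1996 = roll_up(slice_facts(FACTS, "year", 1996), ["quarter", "region"])
print(by_year)
print(detail_1996)
```

A multi-dimensional database typically precomputes many of these totals so that each drill-down step is a lookup rather than a fresh pass over the detail rows.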
Enterprise Network Computer Architecture
Executive Information Systems (EIS):
• Highly Structured Data
• Tailored for Non-technical Users
• Ability to “slice and dice” data
• Ability to “drill-down”
Examples
• Commander OLAP Server
• Pilot (Lightship)
• VB
• Powerbuilder
Enterprise Network Computer Architecture
Data Visualization:
• Automatic Categorization
• Visualization of Multi-dimensional data
• Automatic Analysis and/or Indexing
Examples
• WinViz (IBI)
• dbExpress (Computer Concepts)
• Data Explorer (IBM)
• ARC Info/ARC View
• Strategic Mapping
Enterprise Network Computer Architecture
Data Mining:
• High Speed Analysis of Detail Data
• Constructs Business Patterns
• Provides Statistical Support
Examples
• IBM beta-test
• Information Harvester
• IDIS
• d.b.Express
• DataMind
Enterprise Network Computer Architecture
Business Modeling and Simulation:
• Business Feedback Model
• Direct Manipulation
• Business Gaming
• Management/Operations Training
Examples
• SimRefinery
• SimTelephone
• iThink
• Microworlds
3. Meta-data Repository Layer
Data Dictionary/Repository
• Meta-data Modeling
• Meta-data Updating
• Meta-data
Examples
o Platinum
o Rochade
o MSP
o Data Atlas (IBM)
o MS/TI
3. Process (Systems) Management
Process Management
• Scheduling
• Execution
• Subscription
Examples
o Data Harvester
o Data Hub
o Detect and Alert (Comshare)
3. Post-processing/Indexing Layer
Post-processing/Indexing
Examples
• Sybase IQ Accelerator
• OMNIdex
• Oracle 7.3
• eSSbase
• IRI Express