Presentation on What Data Warehousing Is, and the
Download
Report
Transcript Presentation on What Data Warehousing Is, and the
Data Warehousing: An Overview
Katherine Watier
December 3, 2001
MGMT 257-01- Katherine Watier
1
What is a Data warehouse?
It allows business to ask complex questions
about data that is stored in separate systems.
The business user can create reports and
graphs to display this data
The user can look at data summaries across
time or other detailed attributes
It supports data mining which recognized
patterns in the data and alerts managers to
future trends.
MGMT 257-01- Katherine Watier
2
Benefits of Data Warehousing
Benefit # 1:
Users can ask questions like:
“Why were sales up in the Pacific Region
but not in the Mountain Region?”
or
“Why are we over budget”
MGMT 257-01- Katherine Watier
3
Benefits of Data Warehousing
Benefit #2:
Using Data Mining, the system can
recognize patterns within data
and
predict future behaviors based on current
characteristics
MGMT 257-01- Katherine Watier
4
Benefits of Data Warehousing
Benefit # 3
Increased organizational knowledgeenhancing competitive advantage
“The common factor which triggers
investment in data warehousing
appears to be competitive intensity.”
Kelly . Data Warehousing: The Route to Mass Customization. 1994: 14
MGMT 257-01- Katherine Watier
5
Data Warehouses Vs. Transactional Databases
Transactional System
System that tracks real-time data
Large queries bog down performance of system
Separate from other data systems
Fast (often near instant) queries
Hundreds of users access the system at one time
Time stamped with time of event
Event oriented
Volatile
Limited size usually within one hardware system
Usually requires custom training to use
Difficult to scale & add features
Current (Real-Time) Data
Data often created from one source
Highly normalized Data Structure
Due to stable transaction definitions, database design is relatively
stable
MGMT 257-01- Katherine Watier
6
Data Warehouses Vs. Transactional Databases
Data Warehousing System
System that monitors trends and other patterns
Available for large queries without bogging down daily functions
Integrated
Queries can take hours to days to complete
Limited number of users
Time-stamped with period of time
Subject oriented & Non-volatile
Systems often large and require parallel hardware structures
Easily accessible to users with limited training
Easy to scale & add features
Data is usually 5-10 years old
Data input from multiple sources in a variety of formats.
Often de-normalized
Constantly changing business requirements; design needs to be adaptive
MGMT 257-01- Katherine Watier
7
What Does a Data Warehouse look like?
End User
Data Access Tools
Data Warehouse
Sources
Integrated Data
Operational Data
Data Warehouse
Highly Summarized
Lightly Summarized
Detail Level
External Data
- Data mining
- DSS
- OLAP
Metadata
An architected environment optimized for
analytical and informational processing.
MGMT 257-01- Katherine Watier
8
Data warehouse Architecture
The pieces:
Online Transactional Processing OLTP/legacy
systems which provide the data
Transformation/propagation tools
Data warehouse database server
Metadata repository (data about the data)
Data marts for highly specified querying (usually
for a department)
Analysis/end-user tools (OLAP)
MGMT 257-01- Katherine Watier
9
The Data Flow
1. The Source Data
Data originates from a variety of sources:
Transaction system (registration application,
accounting application, etc.)
Legacy systems
Websites
External Market Research Data
MGMT 257-01- Katherine Watier
10
The Data Flow
2. Data Integration &Transformation
Data needs to be transformed into a
universal format before being imported.
The I&T layer focuses on:
–
–
–
–
–
Resolving key differences
Resequencing of the data
merging of the data
Summarizing the data
Creating data about the logic of the transformation that is stored
in the metadata repository.
MGMT 257-01- Katherine Watier
11
The Data Flow
3. Metadata Repository
Data about the data needs to be stored
to:
Help users analyze the data
Maintain a history about how data was
handled from its source to its storage
MGMT 257-01- Katherine Watier
12
“Star Schema” Database Design
Customer
Facts
Time dimension
Customer dimension
-------------------------------------------------------
Region dimension
Product dimension
Congress dimension
Five database tables linked to provide information about Customers.
MGMT 257-01- Katherine Watier
13
MGMT 257-01- Katherine Watier
14
Rules for Success in Building a Data
Warehouse
Define the business problem before you
start to build anything
Think architecture not databases
Begin small, grow incrementally
Develop one subject area first, deploy,
maintain and expand for the enterprise
Make sure there is close
collaboration between IT and the
end-users
MGMT 257-01- Katherine Watier
15
References:
IBM online. Retrieved from the World Wide Web October 23, 2001. http://www1.ibm.com/servers/eserver/iseries/db2/dataware.htm#header_4
Barquin, R. & Edelstein, H. 1997. Planning and Designing the Data Warehouse.
Prentice Hall PTR, Upper Saddle River, NJ.
Berson, A. & Smith, S. 1997. Data Warehousing, Data Minding, & OLAP. McGrawHill, New York, NY.
Hammergren, T. 1998. Data Warehousing on the Internet: Accessing the Corporate
Knowledge Base. International Thompson Computer Press: Boston, MA.
Inmon, W.H., Imhoff, C., & Sousa, R. 2001. Corporate Information Factory. Wiley
Computer Publishing: New York, NY.
Kelly, Sean. 1994. Data Warehousing: The Route to Mass Customization. John Wiley
& Sons: New York, NY.
Kelly, S. 1997. Data Warehousing in Action. John Wiley & Sons, Ltd. New York: NY.
Moeller, R.A. 2001. Distributed Data Warehousing Using Web Technology: How to
Build a More Cost-Effective and Flexible Warehouse. AMACOM: New York, NY.
Poe, Vidette. 1996. Building a Data Warehouse for Decision Support. Prentice Hall
PTR: Upper Saddle River, NJ.
Westerman, Paul. 2001. Data Warehousing: Using the Walmart Model. Morgan
Kaufmann Publishers: San Francisco, CA.
MGMT 257-01- Katherine Watier
16
Barquin International
Corporate Intranet
Project
MGMT 257-01- Katherine Watier
17
A little background…
Barquin International was established in 1994 by Dr. Ramon C.
Barquin.
Dr. Barquin was the founder, and the first president of The
Data Warehousing Institute.
BARQUIN’s mission:
to assist enterprises in extracting business meaning from their data and
developing competitive advantage through the use of data warehousing,
data mining, knowledge management and other business intelligence
disciplines.
We offer the following services:
Consulting
Audits
IT Design
IT Development
Education and
Training
Implementation
Strategies & Research
Benchmarking
MGMT 257-01- Katherine Watier
18
Business Case for the Project
Issue #1:
Barquin bids on a variety of projects and
depending on the project’s focus we often
need to hire consultants.
Over the past 4 years the company has
accumulated over 250 resumes which need to
be stored in such a way that we can search
them by applicant location and skill set.
MGMT 257-01- Katherine Watier
19
Business Case for the Project
Issue #2:
Company contact information which includes:
Consultant, corporate partner, and vendor
addresses and phone numbers are stored in
separate Outlook Contact lists or in the
managers’ Palm Pilots.
Every time someone needs a phone number,
they have to ask the other staff in the office if
they have that phone number, which results
in a wasting of staff time.
MGMT 257-01- Katherine Watier
20
Business Case for the Project
Issue #3:
Proposal Writing:
When creating proposals for technical
solutions, often we need to provide
summaries of the functionality of various data
warehousing products.
Currently, that information is re-created each
time or stored in a staff’s personal file
structure.
MGMT 257-01- Katherine Watier
21
The Solution
The intranet developed will allow for the
storage and querying of a variety of
information including:
People that are related to Barquin
– This includes: employees, applicants, consultants,
etc.
Companies that interact with Barquin
Products that we use
Resumes from prospective employees/consultants
MGMT 257-01- Katherine Watier
22
Status of the Project
Currently, most information can be added
inserted, deleted and viewed for all 4
categories.
MGMT 257-01- Katherine Watier
23
What Still Needs to be
Developed:
-Cosmetics
Various visual editing needs to be done.
-Password issues
The password needs to be hidden from view and
every page needs to be checked for the password
-Resume commenter
The resume input form needs to be pre-populating
the form with administrator's names
MGMT 257-01- Katherine Watier
24
What Didn’t Make it Past the
Cutting Room Floor
Originally intended to track:
1.
The data warehousing classes we provide
-who taught them, the materials they used, who took the classes.
2.
The projects we have finished
-what they were, who they were completed for, who was involved, the price of
the contract, summary information
3. A pop-up survey for our Internet site asking the
visitor knowledge management questions
4. Simple registration vehicle for our Internet site before
allowing users to download our white papers.
MGMT 257-01- Katherine Watier
25
Questions?
MGMT 257-01- Katherine Watier
26