Data Warehouses

Presented by Patrick Seto
CS157A Section 3
Topics

What Is A Data Warehouse?
History
Characteristics
Operational Database vs. Data Warehouse
Architecture
What Is A Data Warehouse?

The term "data warehouse" refers to a
special type of database that acts as
the central repository for company data.
It can be thought of as a database
archive that is segregated from the
operational databases, and used
primarily for reporting and data mining
purposes.
History

Data warehouses were first developed
in the 1980s in response to the growing
demand for management information
analysis, which operational databases
could not perform without drastically
affecting response time.
History

Offline Operational Database
Data warehouses in this initial stage are exact copies of
operational databases, placed on offline servers where the
processing load of reporting does not affect the operational
system's performance.

Offline Data Warehouse
Data warehouses in this stage of evolution are updated
regularly at a specified time interval (e.g., daily, weekly,
monthly, etc.) from the operational systems, and the data is
stored in an integrated reporting-oriented data structure.
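
The periodic refresh described above can be sketched roughly as follows. This is only an illustration, not a method taken from the slides: the `orders` and `daily_sales` tables, the `refresh` function, and the sample rows are all hypothetical, and two in-memory SQLite databases stand in for the operational system and the warehouse.

```python
import sqlite3

# Hypothetical operational system: one table of raw orders.
ops = sqlite3.connect(":memory:")
ops.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, placed_on TEXT)"
)
ops.executemany(
    "INSERT INTO orders (customer, amount, placed_on) VALUES (?, ?, ?)",
    [
        ("acme", 120.0, "2024-01-01"),
        ("acme", 80.0, "2024-01-01"),
        ("zenith", 50.0, "2024-01-02"),
    ],
)

# Hypothetical warehouse: a reporting-oriented daily summary table.
wh = sqlite3.connect(":memory:")
wh.execute(
    "CREATE TABLE daily_sales ("
    "day TEXT, customer TEXT, order_count INTEGER, total REAL, "
    "PRIMARY KEY (day, customer))"
)


def refresh(day):
    """Summarize one day's orders and load the result into the warehouse."""
    rows = ops.execute(
        "SELECT placed_on, customer, COUNT(*), SUM(amount) "
        "FROM orders WHERE placed_on = ? "
        "GROUP BY placed_on, customer",
        (day,),
    ).fetchall()
    wh.executemany("INSERT INTO daily_sales VALUES (?, ?, ?, ?)", rows)
    wh.commit()
    return len(rows)


# One refresh per interval (here, one per day).
loaded = [refresh("2024-01-01"), refresh("2024-01-02")]
summary = wh.execute(
    "SELECT day, customer, order_count, total "
    "FROM daily_sales ORDER BY day, customer"
).fetchall()
print(summary)
```

Reports then query `daily_sales` only, so the reporting load never touches the `orders` table in the operational system.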
History

Real-Time Data Warehouse
Data warehouses at this stage are updated each time an
operational system performs a transaction.

Integrated Data Warehouse
Data warehouses at this stage are used to generate activity or
transactions that are passed back into the operational systems
for use in the daily activity of the organization.
Data Warehouse Characteristics

Subject-oriented
The data in the database is organized so that all the data
elements relating to the same real-world event or object are
linked together.

Time-variant
The changes to the data in the database are tracked and
recorded so that reports can be produced showing changes over
time.
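
One common way to make data time-variant is to append a dated row for every change instead of updating in place. The sketch below assumes that approach; the `customer_history` table, the `record` and `city_as_of` helpers, and the sample data are hypothetical names chosen for illustration.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute(
    "CREATE TABLE customer_history ("
    "customer_id INTEGER, city TEXT, recorded_on TEXT)"
)


def record(customer_id, city, recorded_on):
    """Append a new dated version rather than updating the existing row."""
    wh.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, recorded_on),
    )


def city_as_of(customer_id, day):
    """Return the most recent city recorded on or before `day`, if any."""
    row = wh.execute(
        "SELECT city FROM customer_history "
        "WHERE customer_id = ? AND recorded_on <= ? "
        "ORDER BY recorded_on DESC LIMIT 1",
        (customer_id, day),
    ).fetchone()
    return row[0] if row else None


record(1, "San Jose", "2023-01-15")
record(1, "Oakland", "2023-09-01")

# The full change history remains available for reporting.
history = wh.execute(
    "SELECT city, recorded_on FROM customer_history "
    "WHERE customer_id = 1 ORDER BY recorded_on"
).fetchall()
print(history)
print(city_as_of(1, "2023-06-30"))
```

Because dates are stored as ISO 8601 strings, plain string comparison orders them correctly, which keeps the "as of" query simple.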
Data Warehouse Characteristics

Non-volatile
Data in the database is never overwritten or deleted. Once
committed, the data is static and read-only, and is retained
for future reporting.

Integrated
The database contains data from most or all of an organization's
operational applications, and this data is made consistent.
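
As a rough illustration of "made consistent", the sketch below merges extracts from two operational systems that encode the same facts differently. The `billing` and `crm` sources, their field names, and the conversion functions are all assumptions invented for this example.

```python
# Hypothetical extracts from two operational systems. Each uses its own
# field names, name formatting, and currency unit.
billing_rows = [{"cust": "ACME", "amount_cents": 12000}]
crm_rows = [{"customer_name": "acme ", "amount_dollars": 80.0}]


def from_billing(row):
    """Convert a billing row to the shared warehouse format."""
    return {
        "customer": row["cust"].strip().lower(),
        "amount": row["amount_cents"] / 100,  # cents to dollars
        "source": "billing",
    }


def from_crm(row):
    """Convert a CRM row to the shared warehouse format."""
    return {
        "customer": row["customer_name"].strip().lower(),
        "amount": float(row["amount_dollars"]),
        "source": "crm",
    }


warehouse_rows = [from_billing(r) for r in billing_rows]
warehouse_rows += [from_crm(r) for r in crm_rows]

# With consistent keys and units, company-wide totals become a simple sum.
totals = {}
for r in warehouse_rows:
    totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
print(totals)
```

Keeping a `source` field on each row is a design choice that lets a report trace any figure back to the system it came from.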
Operational Database vs. Data Warehouse


The processing load of reporting
reduced the response time of the
operational systems.
The database designs of operational
systems were not optimized for
information analysis and reporting.
Operational Database vs. Data Warehouse


Most organizations had more than one
operational system, so company-wide
reporting could not be supported from a
single system.
Developing reports in operational
systems often required writing specific
computer programs, which was slow and
expensive.
Operational Database vs. Data Warehouse



Consolidation of data from a wide
variety of data sources.
Ability to analyze data beyond the level
of standard monitoring reports.
Operational response time unaffected.
Architecture
References

http://en.wikipedia.org/wiki/Data_warehouse

http://www.informationbuilders.com/data-warehousing.html

http://www.stsc.hill.af.mil/crosstalk/1996/10/xt96d10b.asp

http://www.dwreview.com/DW_Overview.html

http://www.deakin.edu.au/ddw/what-is.php