SOM485CH5CLASSSLIDES
Download
Report
Transcript SOM485CH5CLASSSLIDES
Chapter 5
Study Sections 5.2, 5.3, 5.5, Pages:231-233 & Snowflake schema
DATA
WAREHOUSING
Business Intelligence
• Companies collect a large amount of data from their business
operations.
• To keep track of that information, a business uses disparate
software applications , such as Excel, Access, etc.
• Using multiple software makes it difficult to retrieve information in a
timely manner and to perform analysis of the data.
• Business Intelligence (BI) represents the tools and systems that
play a key role in integrating and analyzing all corporate data.
•
• Generally illustrates intelligence in the areas of customer profiling,
market research, product profitability (by product, region, year),
etc.
BI Architecture
Consists of 3 system components
– Data warehouse
– Business analytics
– Performance management (BPM)
Data warehouse
– A repository of cleaned and integrated historical
/stable data for the entire business
– Extracted from independent databases (internal &
external)
– Transformed (ie. cleaned and reformatted)
- A subset of a warehouse limited to a business
function is called a Data Mart (eg. Sales).
Business analytics
These are tools that help analyze the data towards
finding solutions:
– Reporting and queries
• Multi-dimensional reports, eg. Pivot tables [see Exercise 8];
• SQL Queries [Exercise 9]
• Cube analysis [Chapter 6]
– Data, text and Web mining and other sophisticated
mathematical and statistical tools for searching
relationships [Chapter 7]
Business Performance Management (BPM)
BPM supports monitoring, measuring, and comparing of sales,
profit, cost, profitability, and other performance indicators
• Dashboard reports
• Production reports
Data Source
Views
Data Source
= DW
Analytic Tools
SQL, Cubes
Transaction
Data Systems
Extraction, Transformation,
and Load (ETL) Process
A data warehousing process consists of :
• Extraction (i.e., reading data from a database),
• Transformation (i.e., converting the extracted data from its
previous form into the form in which it needs to be so that it
can be placed into a data warehouse), and
•
Load (i.e., storing the data into the data warehouse)
Data Integration and the
Extraction, Transformation,
and Load (ETL) Process
ETL
Newly integrated schema
for the Data Warehouse
Data from multiple Sources
DW Schema Structures: Star
Note that data is un-normalized
DW Schema Structures: Snowflake
Note that data is normalized
Designing Fact Tables: Normalization
• Normalization is the process of gathering attributes
into tables to eliminate redundant data (the redundancy
here is EquipID EquipType)
Normalization Exercise
•
FIRST (Supplier#, City, CityCode, Part#, Qty)
• Split the table into 3 different tables:
– (Supplier#, City)
– (City, CityCode)
– (Supplier#, Part#,Qty)
Although normalized databases have less data
redundancies, they are less efficient in quickly processing
the data.
Hence, many DWs use Star schema.
Data Marts provide ‘views’ of the data in the Data Warehouse
(we will be working with this in our SQL exercises)
Summary