Adamson & Venerable Chapter 2

Download Report

Transcript Adamson & Venerable Chapter 2

Adamson & Venerable
Chapter 2 &
working out a Homework 5/6
Solution
Transforming
Relational Databases into Dimensional
Diagrams
Spring 2012
Dimensional modeling in Sales
• In a DW designed to analyze SALES data,
important component of a dimensional
model is the Product Dimension.
• Product dimension includes important
characteristics used to differentiate the
product in the marketplace, called
Discriminators.
Key Business Term: Discriminators
• Discriminators: Descriptive characteristics of a
product that further describe it and are relevant
to purchasing decisions. Tracking discriminators
allows the business analyst to monitor
performance of various product styles,
influencing production and marketing plans.
• Discriminators for a men’s suit: Cloth, color,
style/cut, weight, size
• Discriminators for vehicles: Model name, model
styling package, line, category, exterior color,
model year, interior color
Other Dimensions in Sales
• Time dimension: Time key, month, day, date, year, day
of week, quarter.
• Customer_Demographic dimension: This does not
require a row for each customer, but groups customers
by different combinations of age, gender, income, and
geography. The degree of demographic segmentation
varies by industry.
• Dealer dimension: Data on dealer performance are
needed, to support decisions on which dealers should be
eased out of business.
• Method_Of_Payment dimension (lease, financing
options, etc.)
Fact Table: Storing derived facts
• A commonly used derived fact should be
stored, and not calculated in reports and
queries. Cutting such “redundant” key
measures from the fact table results in the
following:
1. Development of reports gets more complex
2. Increased potential for errors in reports
3. Increased documentation requirements
4. A Hundred dollars’ worth of disk space is
saved (40 MB of space savings for a 10-million
row fact table)
Transformation Stages
(Key for Homework #5/6)
• De-normalization Process
– Start with Normalized Tables
– Determine Dimensions and Fact Tables
– Delete Relationships
• Rebuild Tables
• Rebuild Relationship Diagram as Star
Diagram, a.k.a., Dimension Table
• Provides information needed to complete
Homework 6, too!
Premiere Products ERD
Redrawn to form most
likely Star Diagram
Delete Relationships
Rebuild Tables
• Using copied operations database
• Be sure all ops. data is saved and backed up …
multiple times.
• Data staging & cleansing
– Denormalize extra relationships
•
•
•
•
–
–
–
–
–
Order:OrderLine;
Customer:SalesRep;
Order:OrderLine  OrderDetail
Customer:SalesRep  OrderDetail
Transform data for new tables in Access: Make Table
Export data files, if needed, to rebuild elsewhere (Excel)
Additional Transformations as needed in Excel
Create the Time dimension
Re-Import data files to new tables
Order_OrderLine Query
Restructured Data -- Stage 1
Preparing Rep:Customer
Join to Order_OrderLine
OrderDetail Query
Order of Columns
Not critical
Sort order not critical but
Good time to revise
Make-Table OrderDetail Query
OrderDetail Table
Set Primary Keys
Data Cleansing
Fix dates
Add Indexes
Build Star Diagram
Fact Table
What’s Missing?
Time Dimension
Use SQL to eliminate
Redundant dates
Built-in Functions
Built-in Date/Time Functions
Excel Time Table
Create Time table using
Excel Formulas:
1. Fill
2. Copy/paste
3. Compute Quarter
4. Conversion Formulas
5. Etc.
Import Time Table to Access
Time Table w/Indexes
Premier Products Star Diagram
PP -- Relation List
Fact Table
• OrderDetail[OrderNum, PartNum, OrderDate, CustNo,
RepNo, NumOrdered, Price]
Dimension Tables
• Customer[CustNum, CustName, Street, City, State, Zip,
Balance, CreditLimit]
• Rep[RepNum, LastName, FirstName, Street, City, State,
Zip, Commission, Rate]
• Part[PartNum, Desc, OnHand, Class, Warehouse, Price]
• Time[TimeKey, Day_of_Week, Month, Year, JulianDate,
Quarter, etc.]
Replacing OrderDate with
Time_key
Tools/Analyze/Tables
Tools/Analyze/Tables
Tools/Analyze/Tables
Tools/Analyze/Documenter
Tools/Analyze/Documenter
CUSTOMER table
http://www.webopedia.com
GUID Last modified: Thursday, June 20, 2002
• Short for Globally Unique Identifier, a unique 128-bit number that is
produced by the Windows OS or by some Windows applications to
identify a particular component, application, file, database entry,
and/or user. For instance, a Web site may generate a GUID and
assign it to a user's browser to record and track the session. A GUID
is also used in a Windows registry to identify COM DLLs. Knowing
where to look in the registry and having the correct GUID yields a lot
information about a COM object (i.e., information in the type library,
its physical location, etc.). Windows also identifies user accounts by a
username (computer/domain and username) and assigns it a GUID.
Some database administrators even will use GUIDs as primary key
values in databases.
• GUIDs can be created in a number of ways, but usually they are a
combination of a few unique settings based on specific point in time
(e.g., an IP address, network MAC address, clock date/time, etc.).
Tools/Analyze/Documenter
OrderDetail table
Henry Books ERD (before)
BookInventory
Henry Books ERD (before)
Update Query for Inventory Table
[Physical Inventory Date]
Inventory Table: Date Updated
BookInventory Fact Table
Note: 2 Time Keys [Pick one]
BookInventory Fact Table
Note: 2 Time Keys [Pick one]
BookInventory Fact Table
Note: 2 Time Keys [Pick one]
BookInventory Fact Table
Note: 2 Time Keys [Pick one]
Henry Books Star Diagram
HB -- Relation List
Fact Table
• BookTrans[BookCode, AuthorNum, BranchNum,
Sequence, PubCode, OnHand, InventoryDate]
Dimension Tables
• Book[BookCode, Title, Type, Price, Paperback]
• Publisher[PubCode, PubName, City, …State, Zip]
• Author[AuthorNum, LastName, FirstName, …]
• Branch[BranchNum, BranchName, Location, Employees]
• Time[TimeKey, Day_of_Week, Month, Year, JulianDate,
Quarter, etc.]
Summary
• Complete Transformations
• How normal are the resulting tables?
– 1NF, 2NF, 3NF?
• Document Transformation maps
• Prepare for Appending Tables with new
data
• Tools/Analyze/Documenter (next time)