INFS 6225 – Object-Oriented Systems Analysis & Design

Download Report

Transcript INFS 6225 – Object-Oriented Systems Analysis & Design

INFS 6220
Systems Analysis & Design
Transactional DBs vs.
Data Warehouses
Relational Databases (RDBMS)

Collection of linked tables

Tables linked by Primary Key / Foreign Key relationships
(Referential Integrity)

Primary Key – column whose values make each record
unique

Foreign Key – value in column that links to Primary Key in
another table

SQL – Structured Query Language (language to access data
in relational tables)
Relational DB Example
CUSTOMER TABLE
Cust #
100
101
102
Primary Key
Cust Name
Bob
Sue
Juan
ORDER TABLE
Order #
1
2
3
Prod#
QR22
QR22
SB56
Qty
1
25
3
Cust#
100
100
102
Foreign Key
Database Structure & Design

2 Approaches:
1. Optimize for
Data Capture
i.e., Capturing
Transactions
Conflict
2. Optimize for
Data Access
i.e., Queries
& Reporting
Approach #1: Optimize for Data
Capture

To optimize for data storage, you must:
•
•
•

Eliminate redundancy of data (or else wasted space &
processing occurs)
Ensure data integrity (or else data anomalies)
Ensure that changes in data (modifications, deletions, etc.
only have to happen in one place)
Normalization – process in which a DBMS is
optimized for data storage
•
•
All data “redundancy” is removed from Database
Has multiple forms (0, 1st, 2nd, 3rd, et al.)
Moving from 0NF to 1NF
Rule: Make a separate table for each set of related attributes, and give
each table a primary key of unique values.
CUSTOMER TABLE
ONF
Cust #
100, 100, 101
CustName
Bob, Sue, Juan
CUSTOMER TABLE
1NF
Cust #
100
101
102
Cust Name
Bob
Sue
Juan
Primary Key Created with Unique values
Moving from 1NF to 2NF
Rule: Eliminate any repeating values caused by a dependency on a
“keyed” column (i.e., either Primary or Foreign)
TABLE X
1NF
Cust #
100
100
101
Cust Name
Order#
Bob Dependency
1 on
Bob Primary Key
2
Sue
3
CUSTOMER TABLE
2NF
Cust # Cust Name
100
Bob
101
Sue
ORDER TABLE
Order #
1
2
3
Cust#
100
100
101
Moving from 2NF to 3NF
Rule: Eliminate any repeating values caused by a dependency on a
“non-keyed” column (i.e., dependency on ANY column)
TABLE X
2NF
Cust #
100
101
102
City
PGH
PGH
LA
Order#
1
2
3
ShipTime
2 days
2 days
5 days
SHIP TIME TABLE
3NF
City #
10
20
City
PGH
LA
ShipTime
2 days
5 days
Dependency b/t 2
non-key columns
CUSTOMER TABLE
Cust #
100
101
102
City#
10
10
20
Normalized DB Example
9
Approach #2: Optimize for Data
Access
(in a separate, read-only Data Warehouse)

To optimize for data access, you must:
• Allow data redundancy
• Reduce the number of table joins (links among
tables)

Denormalizing – Adding redundancy &
reducing joins in a DBMS
Denormalizing – Most Common Approach

Star Schema (Clustering)
•
•
Fact (core or transaction) Tables in middle of star
Dimensional (structural or “lookup”) Tables around “points” of star
REP TABLE
CUSTOMER TABLE
Cust #
100
101
102
CustName
Bob
Sue
Juan
DATE/TIME
Date
06/29/XX
06/30/XX
07/01/XX
Quarter
2
2
3
Rep #
1000
2000
3000
ORDER TABLE
Order #
1
2
3
Date
06/15/XX
07/19/XX
08/30/XX
Bob
Sue
Juan
Cust#
100
100
101
Prod#
QR22
QR22
SR56
Rep#
1000
1000
2000
RepName
Lee
James
Natasha
PRODUCT TABLE
Prod #
QR22
SR56
TW43
ProdName
Rake
Spade
Mulch
Denormalizing (continued)
• Stars are linked via common (i.e., Conformed) Dimensions to form Data
Warehouse
REP TABLE
CUSTOMER TABLE
Cust #
100
101
102
CustName
Bob
Sue
Juan
CUSTOMER TABLE
TIME
DATE/TIME
Date
06/29/XX
06/30/XX
07/01/XX
Quarter
2
2
3
Rep #
1000
2000
3000
ORDER
TABLE
ORDER TABLE
Order #
1
2
3
Date
06/15/XX
07/19/XX
08/30/XX
Cust#
100
100
101
Prod#
QR22
QR22
SR56
PRODUCT TABLE
Prod #
QR22
SR56
TW43
Bob
Sue
Juan
INVENTORY TABLE
Prod#
QR22
TW43
SR56
Rep#
1000
1000
2000
ProdName
Rake
Mulch
Spade
Stock Date Units
03/23/XX
150
04/15/XX
1452
05/01/XX
997
RepName
Lee
James
Natasha
ProdName
Rake
Spade
Mulch