Normalization

Download Report

Transcript Normalization

IT420: Database Management and
Organization
Normalization
31 January 2006
Adina Crăiniceanu
www.cs.usna.edu/~adina
Previously on IT420
 Database design
 Entity-Relationship (ER) Model
 Relational Model
 Transform ER to relational model
 Database Implementation
 SQL
Kroenke, Database Processing
Goal
 Refresh ER to Relational
 Normalization
Kroenke, Database Processing
ER to Relational
 ER:
 Entities
 identifiers
 Relationships
 cardinality
 Relational model
 Tables
 Constraints
Kroenke, Database Processing
ER to Relational
 Transform entities in tables
 Transform relationships using foreign keys
 Specify logic for enforcing minimum
cardinalities
Kroenke, Database Processing
Class Exercise: Transform ER
model into Relational Model
CLUB
FAALicense
AIRCRAFT
FAA number
Address
Phone
Min Cardinality:1
ModelNumber
Color
MEMBER
Name
Phone
Rating
TotalHours
FLIGHT
FlightID
RentalDate
ReturnDate
TimeFlown
Kroenke, Database Processing
OWNER
Name
Phone
Min Cardinality:1 Address
Outline
 ER to Relational
 Normalization
Kroenke, Database Processing
Premise
 We have received one or more tables with
data
 The data is to be stored in a new database
 QUESTION: Should the data be stored as
received, or should it be transformed for
storage?
Kroenke, Database Processing
Data Redundancy
Number LastName FirstName
Email
Rating
Wage
190
Smith
John
[email protected]
4
25
673
Doe
Jane
[email protected]
7
35
312
Doe
Bob
[email protected]
8
40
152
Johnson
Matt
[email protected]
7
35
Application constraint:
All employees with same rating have the same wage (Rating Wage)
Problems due to data redundancy?
Kroenke, Database Processing
Modification Anomalies
 Deletion Anomaly: What if we delete all
employees with rating 8?
 Insertion Anomaly: What if we have a new
employee with new rating 12?
 Update Anomaly: What if we change the
wage for rating 7 to be 27?
Kroenke, Database Processing
Update Anomalies
 The EMPLOYEE table before and after an incorrect
update operation on Wage for Rating = 7
Number LastName FirstName
Email
Rating
Wage
190
Smith
John
[email protected]
4
25
673
Doe
Jane
[email protected]
7
35
312
Doe
Bob
[email protected]
8
40
152
Johnson
Matt
[email protected]
7
35
FirstName
Email
Rating
Wage
Number LastName
190
Smith
John
[email protected]
4
25
673
Doe
Jane
[email protected]
7
37
312
Doe
Bob
[email protected]
8
40
152
Johnson
Matt
[email protected]
7
35
Kroenke, Database Processing
Table decomposition
Number LastName
FirstName
Email
Rating
Wage
190
Smith
John
[email protected]
4
25
673
Doe
Jane
[email protected]
7
35
312
Doe
Bob
[email protected]
8
40
152
Johnson
Matt
[email protected]
7
35
Number LastName FirstName
Email
Rating
Rating
Wage
4
25
190
Smith
John
[email protected]
4
673
Doe
Jane
[email protected]
7
7
35
312
Doe
Bob
[email protected]
8
8
40
152
Johnson
Matt
[email protected]
7
Problem?
Kroenke, Database Processing
Decisions
 Do we have to decompose / merge?
 How do we identify problems caused by
redundancy?
 Functional dependencies
Kroenke, Database Processing
Functional Dependency (FD)
 A functional dependency: the value of one (a set of)
attribute(s) determines the value of a second (set of)
attribute(s):
Alpha  MIDNName
Alpha  (DormName, DormRoom)
 The attribute on the left side of the functional
dependency is called the determinant
 Functional dependencies may be based on equations:
Charge = NbHours X HourlyPrice
(NbHours, HourlyPrice)Charge
 Function dependencies are not equations!
Kroenke, Database Processing
Functional Dependencies Are Not Equations
ObjectColor  Weight
ObjectColor  Shape
ObjectColor  (Weight, Shape)
Kroenke, Database Processing
Composite Determinants
 Composite determinant: A determinant
of a functional dependency that consists of
more than one attribute
 (StudentName, ClassName)  Grade
 (NbHours, HourlyPrice)  Charge
Kroenke, Database Processing
Functional Dependency (FD) Rules
 If A  (B, C), then A  B and A C
 If (A,B)  C, then neither A nor B
determines C by itself
Kroenke, Database Processing
FD Facts
 A functional dependency is a statement about all
allowable instances of a table
 You cannot find the functional dependencies
simply by looking at some data:
 Data set limitations
 Must be logically a determinant
 Given some data in a table R, we can check if it
violates some FD, but we cannot tell if the FD
holds over R!
Kroenke, Database Processing
Functional Dependencies in the
SKU_DATA Table
Assuming data is representative, determine the FD
Kroenke, Database Processing
Functional Dependencies in the
SKU_DATA Table
SKU  (SKU_Description, Department, Buyer)
SKU_Description  (SKU, Department, Buyer)
Buyer  Department
Kroenke, Database Processing
Class Exercise
Assuming all FD that are verified by the data are indeed FD, determine the FD
in the ORDER_ITEM Table
Kroenke, Database Processing
What Makes Determinant Values Unique?
 A determinant is unique in a relation if, and
only if, it determines every other column in
the relation
 Unique determinants = superkey
Kroenke, Database Processing
Key
 A set of columns is a key for a relation if :
1. No two distinct rows can have same
values in all key columns, and
2. This is not true for any subset of the key
 Part 2 false? A superkey
 Primary key
 Alternate key
Kroenke, Database Processing
Normal Forms
 Relations are categorized as a normal form
based on which modification anomalies or other
problems that they are subject to:
Kroenke, Database Processing
Normal Forms
 Boyce-Codd Normal Form (BCNF) – A
relation is in BCNF if every determinant is
a candidate key
“I swear to construct my tables so that
all nonkey columns are dependent on
the key, the whole key and nothing but
the key, so help me Codd.”
Kroenke, Database Processing
Eliminating Modification Anomalies from
Functional Dependencies in Relations
 Put all relations into Boyce-Codd Normal Form
(BCNF):
Kroenke, Database Processing
Putting a Relation into BCNF:
EQUIPMENT_REPAIR
Kroenke, Database Processing
Putting a Relation into BCNF:
EQUIPMENT_REPAIR
EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost,
RepairNumber, RepairDate, RepairAmount)
ItemNumber  (Type, AcquisitionCost)
RepairNumber  (ItemNumber, Type, AcquisitionCost,
RepairDate, RepairAmount)
ITEM
(ItemNumber, Type, AcquisitionCost)
REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)
Where REPAIR.ItemNumber must exist in
ITEM.ItemNumber
Kroenke, Database Processing
Putting a Relation into BCNF:
New Relations
Kroenke, Database Processing
Class Exercise
Put the following relation into BCNF
Kroenke, Database Processing
Multivalued Dependencies
 A multivalued dependency occurs when
a determinant determines a particular set
of values:
Employee  Degree
Employee  Sibling
PartKit  Part
 The determinant of a multivalued
dependency can never be a primary key
Kroenke, Database Processing
Multivalued Dependencies
Kroenke, Database Processing
Eliminating Anomolies from
Multivaled Dependencies
 Multivalued dependencies are not a
problem if they are in a separate relation,
so:
 Always put multivalued dependencies into
their own relation
 This is known as Fourth Normal Form (4NF)
Kroenke, Database Processing