Database Normalization

Download Report

Transcript Database Normalization

Database Normalization
TJ Racoosin
2 Dec 1998
CPCUG Access SIG
Racoosin Solutions
[email protected] 703 849-1997
Overview
•
•
•
•
•
•
•
Introductions
The Normal Forms
Primary Key
Relationships and Referential Integrity
When NOT to Normalize
Real World Exercise
Resources
Racoosin Solutions
Introductions
• TJ Racoosin
• You
– Are you familiar with normalization?
– Used the relationship window ? Enforce
referential integrity? Cascade Delete?
– Any issues with normalizing data?
Racoosin Solutions
Why Normalize?
• Flexibility
– Structure supports many ways to look at the data
• Data Integrity
– “Modification Anomalies”
• Deletion
• Insertion
• Update
• Efficiency
– Eliminate redundant data and save space
Racoosin Solutions
Normalization Defined
• “ In relational database design, the process of
organizing data to minimize duplication.
• Normalization usually involves dividing a
database into two or more tables and defining
relationships between the tables.
• The objective is to isolate data so that additions,
deletions, and modifications of a field can be made
in just one table and then propagated through the
rest of the database via the defined relationships.”
- Webopedia, http://webopedia.internet.com/TERM/n/normalization.html
Racoosin Solutions
Another Definition
• "Normalization" refers to the process of
creating an efficient, reliable, flexible,
and appropriate "relational" structure for
storing information. Normalized data
must be in a "relational" data structure.
- Reid Software Development, http://www.accessdatabase.com/normalize.html
Racoosin Solutions
The Normal Forms
• A series of logical steps to take to
normalize data tables
• First Normal Form
• Second
• Third
• Boyce Codd
• There’s more, but beyond scope of this
Racoosin Solutions
First Normal Form (1NF)
• All columns (fields) must be atomic
– Means : no repeating items in columns
OrderDate
11/30/1998
OrderDate
11/30/1998
Customer
Joe Smith
Customer
Joe Smith
Items
Hammer, Saw, Nails
Item1
Hammer
Item2
Saw
Item3
Nails
Solution: make a separate table for each set of
attributes with a primary key (parser, append query)
Customers
CustomerID
Name
Orders
OrderID
Item
CustomerID
OrderDate
Racoosin Solutions
Second Normal Form (2NF)
• In 1NF and every non-key column is fully
dependent on the (entire) primary key
– Means : Do(es) the key field(s) imply the rest of the fields? Do we
need to know both OrderID and Item to know the Customer and
Date? Clue: repeating fields
OrderID
1
1
1
Item
Hammer
Saw
Nails
CustomerID
1
1
1
OrderDate
11/30/1998
11/30/1998
11/30/1998
Solution: Remove to a separate table (Make Table)
Orders
OrderID
CustomerID
OrderDate
OrderDetails
OrderID
Item
Racoosin Solutions
Third Normal Form (3NF)
• In 2NF and every non-key column is mutually
independent
– means : Calculations
Item
Hammer
Saw
Nails
Quantity
2
5
8
Price
$10
$40
$1
Total
$20
$200
$8
•Solution: Put calculations in queries and forms
OrderDetails
OrderID
Item
Quantity
Price
Put expression in text control or in query:
=Quantity * Price
Racoosin Solutions
Kumar Madurai: http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt
Boyce-Codd Form (3NF) - Examples
• A more restricted version of 3NF (known as
Boyce-Codd Normal Form) requires that the
determinant of every functional dependency in
a relation be a key - for every FD: X => Y, X is
a key
• Consider the following relation:
STU-MAJ-ADV (Student-Id, Major, Advisor)
Advisor => Major, but Advisor is not a key
• Boyce-Codd Normal Form for above:
STU-ADV (Student-Id, Advisor)
ADV-MAJ (Advisor, Major)
2/16/98
MGS 404
10
Primary Key
• Unique Identifier for every row in the
table
– Integers vice Text to save memory, increase
speed
– Can be “composite”
– Surrogate is best bet!
• Meaningless, numeric column acting as
primary key in lieu of something like SSN or
phone number - (both can be reissued!)
Racoosin Solutions
Relationships
• One to many to enforce “Referential Integrity”
Two “foreign”
keys make a
composite primary
key and “relate”
many to many
tables
A look up table - it
doesn’t reference
any others
Racoosin Solutions
Table Prefixes Aid Development
– First, we’ll get replace text PK with number
– The Items table is a “look up” with tlkp prefix
– tlkp “lookup” table (no “foreign keys”)
– OrderDetails is renamed “trelOrderItem” a
“relational” table
• trel “relational” (or junction or linking)
– two foreign keys make a primary
OrderDetails
OrderID
Item
tlkpItems
ItemID
ItemName
trelOrderItem
OrderID
ItemID
tblOrders
OrderID
CustomerID
OrderDate
Racoosin Solutions
Referential Integrity
• Every piece of “foreign” key data has a
primary key on the one site of the relationship
– No “orphan” records. Every child has a parent
– Can’t delete records from primary table if in related table
• Benefits - Data Integrity and Propagation
– If update fields in main table, reflected in all queries
– Can’t add a record in related table without adding it to main
– Cascade Delete: If delete record from primary table, all
children deleted - use with care! Better idea to “archive”
– Cascade Update: If change the primary key field, will change
foreign key
Racoosin Solutions
When Not to Normalize
• Want to keep tables simple so user can make
their own queries
– Avoid processing multiple tables
• Archiving Records
– If No need to perform complex queries or “resurrect”
– Flatten and store in one or more tables
• Testing shows Normalization has poorer
performance
– “Sounds Like” field example
– Can also try temp tables produced from Make Table queries
Racoosin Solutions
Real World - School Data
Student
Last
Smith
Mills
Jones
Student
First
Renee
Lucy
Brendan
Street Address
5551 Private Hill
4902 Acme Ct
5304 Gains Street
Parent 1
Ann Jones
Barbara Mills
Jennifer Jones
Parent 2
Theodore Smith
Steve Mills
Stephen Jones
Previous Current
Teacher Teacher
Hamil
Burke
Hamil
Burke
Hamil
Burke
….
City
State
Annandale
Annandale
Fairfax
Postal Code
Virginia 22003Virginia 22003Virginia 22032-
Home Phone
(703) 323-0893
(703) 764-5829
(703) 978-1083
First Year Last Year
Age
Program Enrolled Attended Birthday inSept
PF
/0
0
6/25/93 5
PF
96/97
0
8/14/93 5
PH
96/97
0
6/13/94 4
Map Coord
22 A-3
21 F-3
21 A-4
Notes
Racoosin Solutions
….
One Possible Design
Racoosin Solutions
Books
• Access97 Developers Handbook Litwin,Getz & Gilbert
– Chapter 4
• Access and SQL Server Developers Handbook
Viescas, Gunderloy and Chipman
– Chapter 2
• Access97 Expert Solutions Lezynski
– Chapter 10
Racoosin Solutions
Internet
• Papers
• http://www.mtjeff.com/~calvin/devhbook/databasedesign.html
• http://www.swin.edu.au/infotech/subjects/bt220/bt220s1.html
• http://www.bus.okstate.edu/lhammer/AISweb/Normaliz.htm
http://www.inetspace.com/database.html
• Slides
• http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt
• http://www.state.sd.us/people/colink/datanorm.htm
• http://www.cba.nau.edu/morgan-j/class/subtop2_3/sld001.htm
Racoosin Solutions