Normalization

Download Report

Transcript Normalization

Process of Normalization
Produce a set of relations with
– minimal number of attributes to support organization’s data requirements
– attributes with a close logical relation described as functional dependency
are in same relation
– minimal redundancy with each attribute occurring only once with the
exception of foreign keys needed to join related relations
Database Design
A major aim of database design is to group attributes into
relations to minimize data redundancy which has the benefits
• updates to the database are achieved with a minimal
number of operations, reducing the potential for data
inconsistencies
• reduction in file storage space
Overview of Normalization Process
1. 1st Normal Form
•
No non-atomic values or repeating groups
2. 2nd Normal Form
•
No partial dependencies
3. 3rd Normal Form
•
No transitive dependencies
Functional Dependencies
Functional Dependency
 If A and B are two sets of attributes of a relation R, then A
functionally determines B if
 each value of A is associated with exactly one value of B
 Example – Relation R is address, A = {Zip}, B = {City,State}
Zip
City
State
38501
Cookeville
TN
39302
Nashville
TN
42020
Louisville
KY
• A  B imposes a integrity constraint on the database
• The left side is called the “determinant” of the dependency.
Branch
Staff
staffNo sName
Position
Salary Branch
Branch
Address
SL21
White
Mgr
30000
B005
B005
London
SG37
Beech
Assist
12000
B003
B007
Aberdeen
SG5
Brand
Mgr
24000
B003
b003
Glasgow
Partial Dependencies : a non-key attribute is functionally dependent on
part but not all of the primary key.
{staffNo, sName} {Branch} is a partial dependency since
determinant is not minimal; {staffNo}  {Branch} is a functional
(full) dependency
StaffBranch
staffNo
sName
Posiion
Salary
Branch
Address
SL21
White
Mgr
30000
B005
London
SG37
Beech
Assist
12000
B003
Aberdeen
SG5
Brand
Mgr
24000
B003
Glasgow
{staffNo}  {Branch} and {Branch}  {Address} is a transitive dependency
Identify all functional dependencies in the StaffBranch relation.
StaffBranch
staffNo
sName
Position
Salary
Branch
Address
SL21
White
Mgr
30000
B005
London
SG37
Beech
Assist
12000
B003
Aberdeen
SG5
Brand
Mgr
24000
B003
Glasgow
1.
{staffNo}  {sName, Position, Salary,Branch, Address}
2.
{Branch}  {Address}
3. {Address}  {Branch}
4. {Branch, Position}  {Salary}
5.
{Address, Position}  {Salary}
Functional Dependencies – Purchase Order Example
already in 1st normal form
PurchaseInfo = (poID,supID,supName,street,city,state,zipcode,movieID,
title,quantity, orderDate)
Partial Dependencies : a non-key attribute is functionally dependent on part but not all of the
primary key.
2nd Normal Form : A relation is in 2nd normal form if it is in 1st normal form and contains no
partial dependencies
Functional Dependencies :
{poID,movieID}  rest-of-attributes – primary key
{poID}  {suppID, suppName, street, city, state, zip,date} – partial dependency
{movieID}  {title,quantity} - partial dependency
poID
suppID
supName
street
city
state
zip
movieID
title
quantity
date
112
32
ABCM
Wash
Nash
TN
34212
1912
xyz
2
2/19/08
112
32
ABCM
Wash
Nash
TN
34212
3233
bam
1
2/19/08
112
32
ABCM
Wash
Nash
TN
34212
2312
nuts
3
2/19/08
1st to 2nd normal form
Conversion Technique
 Create new relation for each primary key attribute
or combination that is a determinant in a partial
dependency.
 The primary key attribute or combination is the
primary key in the new relation.
 Move the nonkey attributes which depend on the
primary key attribute from the old to the new
relation.
All relations in 2nd normal form
Movie Info
Purchase Order
movieID
title
quantity
poID
movieID
date
1912
xyz
2
112
32
2/19/08
3233
bam
1
2312
nuts
3
Supplier Info
poID
suppID
supName
street
city
state
zip
112
32
ABCM
Wash
Nash
TN
34212
2nd to 3rd Normal Form
3rd Normal Form : A relation is in 3rd normal form if it is in 2nd
normal form and has no transitive dependencies
A transitive dependency is a functional dependency between
two or more non-key attributes
The relation Supplier_Info(poID, supID,supName,street,city,state,zipcode)
has a transitive dependency.
{zipcode}  {city,state} is a transitive dependency since the
determinant {zipcode} is non-key as are {city,state}.
Note : {poID}  {zipcode} and {zipcode}  {city,state}
2nd to 3rd normal form
Conversion Technique
 For each nonkey attribute (or set of attributes)
that is a determinant of a transitive dependency in
the relation, create a new relation. That attribute
becomes the primary key of the relation.
 Move all the attributes that are functionally
dependent on the determinant to the new relation.
 Leave the determinant attribute (or set of
attributes) in the old relation as a foreign key to the
new relation.
Movie Info
All relations in 3nd normal form
Purchase Order
poID
movieID
112
32
movieID
title
quantity
1912
xyz
2
3233
bam
1
2312
nuts
3
date
2/19/08
SupplierAddress
Supplier Info
poID
suppID
supName
street
zip
city
state
zip
112
32
ABCM
Wash
34212
Nash
TN
34212
SQL examples
DDL : Create Table <TableName>(<AttributeList>)
Create Table PurchaseOrder(poID integer, movieID integer, odate date)
Design View
Data View
SQL
examples
continued
SQL examples continued
SELECT suppInfo.supName, movieInfo.title
FROM (suppAdress INNER JOIN (PurchaseOrder INNER JOIN
suppInfo ON PurchaseOrder.poID = suppInfo.poID) ON
suppAdress.Zip = suppInfo.zip) INNER JOIN movieInfo ON
PurchaseOrder.movieID = movieInfo.movieID
WHERE (((PurchaseOrder.poID)=112) AND
((movieInfo.movieID)=32));
Identifying Functional Dependencies
Using sample data that is representative of all
possible data values that relation / table might hold.
A
B
C
C
E
a
b
z
w
q
e
b
r
w
p
a
d
z
w
f
e
d
f
w
q
a
f
z
s
f
e
f
f
s
t
Functional Dependencies : fd1 : A  C, fd1 : C  A,
fd3 : B  D, fd4 : {A,B}  E
Reasons for Identifying Functional Dependencies
• Main reason is to identify integrity constraints that must hold on a
relation in a database.
• Identifying candidate keys :
• Example – StaffBranch relation
• staffNo  sName, position, salary, branch, address
• branch  address
• address  branch
• branch, position  salary
• address, position  salary
• It is clear that staffNo is the only candidate key and so would be
chosen as the primary key if this relation were to be part of the
schema of the database.
Process of Normalization
Example : Tenant rents property only once and not more than one at a time.
TenantRental (clientNo, cName, propNo, address, rStart, rStop, rent, ownNo, ownName)
If the same tenant has rented several units, there will be multiple values (repeating groups)
in rows corresponding to that tenant.
clientNo
cName
propNo
address
rStart
rStop
rent
ownNo ownName
T051
Brown
P32
P81
Oak
Maple
1/1/07
9/1/08
6/1/07
550
600
O03
O07
Carter
Biggs
T067
Dodd
P81
P96
Maple
Cherry
2/1/07
12/1/07
11/31/07
600
525
O07
O04
Biggs
Stevens
Conversion to 1st normal form
Conversion Process
Method 1 : Enter appropriate data in empty columns for repeating data
1st Normal Form (atomic values in each column)
clientNo
cName
propNo
address
rStart
rStop
rent
ownNo ownName
T051
T051
Brown
Brown
P32
P81
Oak
Maple
1/1/07
9/1/08
6/1/07
550
600
O03
O07
Carter
Biggs
T067
T067
Dodd
Dodd
P81
P96
Maple
Cherry
2/1/07
12/1/07
11/31/07
600
525
O07
O04
Biggs
Stevens
Method 2 : Separate repeating data with a copy of key into separate relation or table.
1st Normal Form (atomic values in each column) Client and PropertyRentalOwner relations:
clientNo
cName
clientNo
propNo
address
rStart
rStop
rent
ownNo
ownName
T051
Brown
T051
T051
P32
P81
Oak
Maple
1/1/07
9/1/08
6/1/07
null
550
600
O03
O07
Carter
Biggs
T067
Dodd
T067
T067
P81
P96
Maple
Cherry
2/1/07
12/1/07
11/31/07
null
600
525
O07
O04
Biggs
Stevens
Functional Dependencies in 1st Normal Form
The PropertyRentalOwner relation is (clientNo, propNo, address, rStart, rStop, rent,
ownNo, ownName) and has the following functional dependencies :
1.
clientNo, propNo  address, rStart, rStop rent, ownNo, ownName (primary key)
2.
propNo  address, rent, ownNo, ownName
2nd Normal Form : every non-primary key value is fully functionally dependent on
the primary key – no non-key attribute is partially dependent on primary key
Functional dependency #2 shows that {address, rent, ownNo, ownName} is partially
dependent on the primary key – e.g., clientNo can be removed from determinant
Remove {address, rent, ownNo, ownName} along with the determinant propNo and
create a new relation
PropertyOwner(propNo, address, rent, ownNo, ownName) which, with the
client relation Client(clientNo,cName) and Rental(clientNo,propNo,rStart,rStop)
relations are in 2nd normal form.
2nd to 3rd Normal Form
• PropertyOwner(propNo, address, rent, ownNo, ownName)
• Client(clientNo,cName)
• Rental(clientNo,propNo,rStart,rStop)
• are in 2nd normal form
propNo  ownNo and ownNo  ownName
form a transitive dependency.
3rd Normal Form : 2nd normal form and no transitive
dependencies.
• Owner(ownNo,ownName)
•PropertyOwner(propNo, address, rent, ownNo*)
• Client(clientNo,cName)
• Rental(clientNo,propNo,rStart,rStop)
• are in 3rd normal form