Slides from Lecture 6 - Courses - University of California, Berkeley
Database Design: Normalization
and Access DB Creation
University of California, Berkeley
School of Information Management
and Systems
SIMS 257: Database Management
IS 257 – Spring 2004
2004.02.05 - SLIDE 1
Lecture Outline
• Review
– Database Design -- Object-Oriented
Modeling
• Logical Design for the Diveshop
database
• Normalization
• Access Database Creation
2004.02.05 - SLIDE 2
Database Design Process
[Diagram: Applications 1-4 each contribute conceptual requirements to a single Conceptual Model, which leads to the Logical Model and then the Internal Model; each application also has its own External Model.]
2004.02.05 - SLIDE 4
Object-Oriented Modeling
• Becoming increasingly important as
– Object-Oriented and Object-Relational DBMS
continue to proliferate
– Databases become more complex and have
more complex relationships than are easily
captured in ER or EER diagrams
• (Most UML examples based on McFadden, “Modern
Database Management”, 5th edition)
2004.02.05 - SLIDE 5
Class Diagrams
• A class diagram is a diagram that shows a
set of classes, interfaces, and/or
collaborations and the relationships
among these elements.
2004.02.05 - SLIDE 6
UML Class Diagram
[Diagram: one class box with three compartments]
• Class Name: DIVEORDS
• List of Attributes: Order No, Customer No, Sale Date, Shipvia, PaymentMethod, CCNumber, No of People, Depart Date, Return Date, Destination, Vacation Cost
• List of operations: CalcTotalInvoice(), CalcEquipment()
2004.02.05 - SLIDE 7
Associations: Unary relationships
[Diagram: two unary (recursive) associations]
• Person -- Is-married-to -- Person (0..1 at each end)
• Employee -- manages -- Employee (0..1 manager at one end, * at the other)
2004.02.05 - SLIDE 8
Associations: Binary Relationship
• One-to-one: Employee (0..1) -- Is-assigned -- (0..1) Parking Place
• One-to-many: Product Line (1) -- contains -- (*) Product
• Many-to-many: Student (*) -- Registers-for -- (*) Course
2004.02.05 - SLIDE 9
Associations: Ternary Relationships
[Diagram: ternary association Supplies among Part (*), Vendor (*), and Warehouse (*)]
2004.02.05 - SLIDE 10
Association Classes
[Diagram]
• Student (*) -- Registers-for -- (*) Course
• Association class on Registers-for: Registration
– attributes: Term, Grade
– operation: CheckEligibility()
• Registration -- issues -- Computer Account (multiplicities * and 0..1)
– Computer Account attributes: acctID, Password, ServerSpace
2004.02.05 - SLIDE 11
Derived Attributes, Associations, and Roles
[Diagram]
• Student: name, ssn, dateOfBirth, /age (derived attribute)
– {age = currentDate - dateOfBirth}
• Course Offering: term, section, time, location
• Course: crseCode, crseTitle, creditHrs
• Student (*) -- Registers-for -- (*) Course Offering
• Course Offering (*) -- Scheduled-for -- (1) Course
• Student (*) -- /Takes (derived association) -- (*) Course, with /participant as a derived role
2004.02.05 - SLIDE 12
Generalization
[Diagram: Employee is the superclass of Hourly Employee, Salaried Employee, and Consultant]
• Employee: attributes empName, empNumber, address, dateHired; operation printLabel()
• Hourly Employee: attribute HourlyRate; operation computeWages()
• Salaried Employee: attributes Annual Sal, stockoption; operation Contributepension()
• Consultant: attributes contractNumber, billingRate; operation computeFees()
2004.02.05 - SLIDE 13
Other Diagramming methods
• SOM (Semantic Object Model)
• Object Definition Language (ODL)
– Not really diagramming
• Access relationships display
• Hybrids
2004.02.05 - SLIDE 14
Lecture Outline
• Review
– Database Design -- Object-Oriented
Modeling
• Logical Design for the Diveshop
database
• Normalization
• Access Database Creation
2004.02.05 - SLIDE 15
Database Design Process
[Diagram: Applications 1-4 each contribute conceptual requirements to a single Conceptual Model, which leads to the Logical Model and then the Internal Model; each application also has its own External Model.]
2004.02.05 - SLIDE 16
DiveShop ER Diagram
[ER diagram: entities, the key attribute that links each pair, and the cardinality at each end]
• DiveCust (1) -- Customer No -- (n) DiveOrds
• ShipVia (1) -- ShipVia -- (n) DiveOrds
• Dest (1) -- Destination Name -- (n) DiveOrds
• DiveOrds (1) -- Order No -- (n) DiveItem
• DiveStok (1) -- Item No -- (n) DiveItem
• Dest (1) -- Destination no -- (n) Sites
• Sites (1) -- Site No -- (n) BioSite
• BioLife (1) -- Species No -- (n) BioSite
• Sites (1) -- Site No -- (1/n) ShipWrck
2004.02.05 - SLIDE 17
Logical Design: Mapping to a Relational
Model
• Each entity in the ER Diagram becomes a
relation.
• A properly normalized ER diagram will indicate
where intersection relations for many-to-many
mappings are needed.
• Relationships are indicated by common columns
(or domains) in tables that are related.
• We will examine the tables for the Diveshop
derived from the ER diagram
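The "common columns" idea can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the Access database used in class: the column names are simplified (no spaces), which is an assumption, and only two customers and two orders from the Diveshop tables are loaded.

```python
import sqlite3

# In-memory database holding two relations from the Diveshop example.
# Column names without spaces are an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DIVECUST (CustomerNo INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE DIVEORDS (OrderNo INTEGER PRIMARY KEY,
                           CustomerNo INTEGER REFERENCES DIVECUST(CustomerNo),
                           Destination TEXT);
""")
conn.executemany("INSERT INTO DIVECUST VALUES (?, ?)",
                 [(1480, "Louis Jazdzewski"), (1481, "Barbara Wright")])
conn.executemany("INSERT INTO DIVEORDS VALUES (?, ?, ?)",
                 [(307, 1480, "Fiji"), (310, 1481, "Santa Barbara")])

# The relationship is expressed only through the shared CustomerNo column.
orders_with_names = conn.execute("""
    SELECT o.OrderNo, c.Name, o.Destination
    FROM DIVEORDS AS o JOIN DIVECUST AS c ON o.CustomerNo = c.CustomerNo
    ORDER BY o.OrderNo
""").fetchall()
print(orders_with_names)
```

Nothing in DIVEORDS stores the customer's name; it is recovered at query time by matching CustomerNo values in the two tables.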
2004.02.05 - SLIDE 18
Customer = DIVECUST
Customer No | Name | Street | City | State/Prov | Zip/Postal Code | Country | Phone | First Contact
1480 | Louis Jazdzewski | 2501 O'Connor | New Orleans | LA | 60332 | U.S.A. | (902) 555-8888 | 1/29/95
1481 | Barbara Wright | 6344 W. Freeway | San Francisco | CA | 95031 | U.S.A. | (415) 555-4321 | 2/2/93
1909 | Stephen Bredenburg | 559 N.E. 167 Place | Indianapolis | IN | 46241 | U.S.A. | (317) 555-3644 | 1/5/93
1913 | Phillip Davoust | 123 First Street | Berkeley | CA | 94704 | U.S.A. | (415) 555-9184 | 3/9/98
1969 | David Burgett | 320 Montgomery Street | Seattle | WA | 98105 | U.S.A. | (206) 555-7580 | 3/12/99
2001 | Mary Rioux | 1701 Gateway Blvd. #385 | Pueblo | CO | 81002 | U.S.A. | (719) 555-2010 | 3/15/97
2306 | Kim Lopez | 14134 Nottingham Lane | Honolulu | HI | 96826 | U.S.A. | (808) 555-5050 | 1/29/99
2589 | Hiram Marley | 7233 Mill Run Drive | San Francisco | CA | 94123 | U.S.A. | (415) 555-6430 | 2/18/99
3154 | Tanya Kulesa | 505 S. Flower, Mail Stop 48943 | New York | NY | 10032 | U.S.A. | (212) 555-6750 | 1/30/99
3333 | Charles Sekaron | 110 East Park Avenue, Box 8 | Miller | SD | 57362 | U.S.A. | (613) 555-4333 | 3/16/98
3684 | Lowell Lutz | 915 E. Fesler | Dallas | TX | 75043 | U.S.A. | (214) 555-2722 | 2/15/99
4158 | Keith Lucas | 56 South Euclid | Chicago | IL | 60542 | U.S.A. | (312) 555-4310 | 3/17/98
4175 | Karen Ng | 2134 Elmhill Pike | Klamath Falls | OR | 97603 | U.S.A. | (503) 555-4700 | 3/20/99
5510 | Ken Soule | 58 Sansome Street | Aurora | CO | 89022 | U.S.A. | (303) 555-6695 | 2/5/99
2004.02.05 - SLIDE 19
Dive Order = DIVEORDS
Order No | Customer No | Sale Date | Ship Via | PaymentMethod | CcNumber | CcExpDate | No Of People | Depart Date | Return Date | Destination | VacationCost
307 | 1480 | 9/1/99 | UPS | Visa | 12345 678 90 | 1/1/01 | 2 | 11/8/00 | 11/15/00 | Fiji | 10000
310 | 1481 | 9/1/99 | FedEx | Check | | | 1 | 4/4/00 | 4/18/00 | Santa Barbara | 6000
313 | 1909 | 9/1/99 | Walk In | Visa | 456456456 | 9/11/00 | 4 | 6/27/00 | 7/11/00 | Cozumel | 8000
314 | 1913 | 9/1/99 | FedEx | Check | | | 3 | 2/7/00 | 2/14/00 | Monterey | 6000
317 | 1969 | 9/1/99 | FedEx | AmEx | 432432432 | 12/31/02 | 4 | 5/9/00 | 5/16/00 | Fiji | 20000
320 | 2001 | 9/1/99 | Walk In | Cash | | | 1 | 10/10/00 | 10/17/00 | Santa Barbara | 3000
321 | 2306 | 9/1/99 | Emery | Master Card | 1112223334 | 8/12/00 | 1 | 3/15/00 | 4/12/00 | New Jersey | 8000
325 | 2589 | 9/1/99 | Emery | AmEx | 332332332 | 12/10/99 | 1 | 3/15/00 | 4/12/00 | New Jersey | 8000
326 | 3333 | 9/1/99 | FedEx | Money Order | | | 2 | 2/10/00 | 2/17/00 | Monterey | 4000
327 | 3684 | 9/1/99 | DHL | Master Card | 122122321 | 11/9/99 | 4 | 3/10/00 | 3/23/00 | Florida | 24000
329 | 4158 | 9/1/99 | Walk In | Cash | | | 1 | 5/4/00 | 5/15/00 | Cozumel | 1571
330 | 4175 | 9/1/99 | FedEx | Check | | | 2 | 7/3/00 | 7/10/00 | Florida | 6000
331 | 5510 | 9/1/99 | FedEx | Money Order | | | 6 | 6/20/00 | 6/30/00 | Santa Barbara | 36000
333 | 5926 | 9/1/99 | DHL | Discover | 123123123 | 12/21/00 | 2 | 6/10/00 | 6/17/00 | Fiji | 10000
336 | 5719 | 9/1/99 | FedEx | Cash | | | 10 | 4/2/00 | 4/24/00 | Great Barrier Reef | 200000
2004.02.05 - SLIDE 20
Line item = DIVEITEM
Order No | Item No | Rental/Sale | Qty
307 | 90010 | Rental | 4
307 | 90020 | Rental | 1
307 | 90021 | Rental | 1
307 | 90030 | Rental | 2
307 | 90051 | Rental | 2
310 | 90011 | Rental | 1
310 | 90045 | Rental | 1
310 | 90059 | Rental | 1
310 | 90074 | Rental | 1
310 | 90078 | Rental | 1
313 | 90127 | Sale | 1
314 | 90072 | Rental | 3
314 | 90094 | Rental | 3
314 | 90100 | Rental | 3
317 | 90012 | Sale | 2
Line Note values shown on the slide (the rows they belong to cannot be recovered from this transcript):
• This is our most popular mask.
• These are our best selling fins.
• A good weight belt for beginners
• Holds 10 cubic feet of cargo.
2004.02.05 - SLIDE 21
Shipping information = SHIPVIA
Ship Via | Ship Cost
DHL | 8
Emery | 11
FedEx | 12
UPS | 10
US Mail | 6
2004.02.05 - SLIDE 22
Dive Equipment Stock = DIVESTOK
Item No | Description | Equipment Class | On Hand | Reorder Point | Cost | Sale Price | Rental Price
90010 | Shotgun 2 Snorkel - Clear | | 12 | 2 | $18.00 | $30.00 | $2.00
90011 | Shotgun 2 Snorkel - Red | | 12 | 2 | $18.00 | $30.00 | $2.00
90012 | Shotgun 2 Snorkel - Teal | | 11 | 2 | $18.00 | $30.00 | $2.00
90020 | Tri-Vent Mask - Clear | Mask | 14 | 2 | $62.50 | $100.00 | $5.00
90021 | Tri-Vent Mask - Red | Mask | 10 | 2 | $62.50 | $100.00 | $5.00
90022 | Tri-Vent Mask - Teal | Mask | 14 | 2 | $62.50 | $100.00 | $7.00
90023 | Quad Vision Mask - Clear | Mask | 11 | 2 | $48.25 | $80.00 | $7.00
90024 | Quad Vision Mask - Red | Mask | 13 | 2 | $48.25 | $80.00 | $7.00
90025 | Quad Vision Mask - Teal | Mask | 10 | 2 | $48.25 | $80.00 | $10.00
90030 | Sea Wing Fins - Clear | Fins | 12 | 2 | $60.00 | $100.00 | $12.00
90031 | Sea Wing Fins - Red | Fins | 11 | 2 | $60.00 | $100.00 | $12.00
90032 | Sea Wing Fins - Teal | Fins | 12 | 2 | $60.00 | $100.00 | $12.00
90033 | Jet Fin - Black | Fins | 14 | 2 | $30.00 | $60.00 | $10.00
90040 | D350 Second Stage | Regulator | 11 | 1 | $162.50 | $270.00 | $20.00
90041 | G250 Second Stage | Regulator | 13 | 1 | $144.50 | $240.00 | $20.00
90042 | G200 Second Stage | Regulator | 12 | 1 | $105.25 | $175.00 | $20.00
(The Equipment Class for the three snorkel rows cannot be separated from the Description in this transcript.)
2004.02.05 - SLIDE 23
Dive Locations = DEST
Destination No | Destination Name | Avg Temp (F) | Avg Temp (C) | Spring Temp (F) | Spring Temp (C) | Summer Temp (F) | Summer Temp (C) | Fall Temp (F) | Fall Temp (C) | Winter Temp (F) | Winter Temp (C) | Accomodations | Night Life | Body of Water | Travel Cost
1 | Cozumel | 78 | 25.556 | 76 | 24.444 | 84 | 28.889 | 78 | 25.556 | 74 | 23.333 | Cheap | Sleepy | Caribbean | 1000
2 | Great Barrier Reef | 80 | 26.667 | 76 | 24.444 | 84 | 28.889 | 78 | 25.556 | 76 | 24.444 | Moderate | Pleasant | Coral Sea | 5000
3 | Monterey | 60 | 15.556 | 62 | 16.667 | 64 | 17.778 | 64 | 17.778 | 58 | 14.444 | Expensive | Wild | Pacific | 2000
4 | Santa Barbara | 75 | 23.889 | 73 | 22.777 | 78 | 25.556 | 72 | 22.222 | 70 | 21.111 | Expensive | Wild | Pacific | 3000
5 | Florida | 77 | 25 | 75 | 23.889 | 85 | 29.444 | 78 | 25.556 | 70 | 21.111 | Moderate | Pleasant | Caribbean | 3000
6 | Fiji | 75 | 23.889 | 76 | 24.444 | 80 | 26.667 | 74 | 23.333 | 70 | 21.111 | Expensive | Sleepy | South Pacific | 5000
7 | New Jersey | 57 | 13.889 | 57 | 13.89 | 60 | 15.556 | 58 | 14.444 | 53 | 11.667 | Expensive | Pleasant | Atlantic | 2000
2004.02.05 - SLIDE 24
Dive Sites = SITE
Site No | Destination No | Site Name | Site Highlight | Distance from Town (mi) | Distance from Town (km) | Depth (ft) | Depth (m) | Visibility (ft) | Visibility (m) | Current | Skill Level
1001 | 1 | Palancar Reef | Reef | 10 | 16.09 | 100 | 30.48 | 150 | 45.72 | Strong | Intermediate
1002 | 1 | Santa Rosa Reef | Reef | 8 | 12.87 | 80 | 24.384 | 150 | 45.72 | Strong | Intermediate
1003 | 1 | Chancanab Reef | Reef | 4 | 6.437 | 60 | 18.288 | 100 | 30.48 | Mild | Beginning
1004 | 1 | Punta Sur | Reef | 13 | 20.92 | 120 | 36.576 | 175 | 53.34 | Strong | Advanced
1005 | 1 | Yocab Reef | Reef | 6 | 9.656 | 50 | 15.24 | 100 | 30.48 | Mild | Beginning
2001 | 2 | Heron Island | Reef | 50 | 80.47 | 90 | 27.432 | 150 | 45.72 | Mild | Intermediate
2002 | 2 | Cod Hole | Fish | 45 | 72.42 | 50 | 15.24 | 150 | 45.72 | Mild | Beginning
2003 | 2 | Butterfly Bay | Caves | 20 | 32.19 | 70 | 21.336 | 70 | 21.336 | None | Advanced
2004 | 2 | Wheeler Reef | Marine Life | 30 | 48.28 | 50 | 15.24 | 125 | 38.1 | Mild | Beginning
2005 | 2 | Watanabe | Marine Life | 130 | 209.2 | 150 | 45.72 | 200 | 60.96 | None | Intermediate
3001 | 3 | Point Lobos | Marine Life | 3 | 4.828 | 60 | 18.288 | 75 | 22.86 | None | Beginning
3002 | 3 | Macabee Beach | Marine Life | 0.1 | 0.161 | 40 | 12.192 | 40 | 12.192 | None | Beginning
3003 | 3 | Pinnacles | Pinnacle | 1 | 1.609 | 60 | 18.288 | 50 | 15.24 | Mild | Beginning
3004 | 3 | Monastery Beach | Marine Life | 3 | 4.828 | 50 | 15.24 | 40 | 12.192 | Surge | Beginning
(The table also has a Site Notes column; no values for it appear in this transcript.)
2004.02.05 - SLIDE 25
Sea Life = BIOLIFE
Species No | Category | Common Name | Species Name | Length (cm) | Length (in)
90020 | Triggerfish | Clown Triggerfish | Ballistoides conspicillum | 50 | 19.685
90030 | Snapper | Red Emperor | Lutjanus sebae | 60 | 23.622
90050 | Wrasse | Giant Maori Wrasse | Cheilinus undulatus | 229 | 90.157
90070 | Angelfish | Blue Angelfish | Pomacanthus nauarchus | 30 | 11.811
90080 | Cod | Lunartail Rockcod | Variola louti | 80 | 31.496
90090 | Scorpionfish | Firefish | Pterois volitans | 38 | 14.961
90100 | Butterflyfish | Ornate Butterflyfish | Chaetodon Ornatissimus | 19 | 7.4803
90110 | Shark | Swell Shark | Cephaloscyllium ventriosum | 102 | 40.157
90120 | Ray | Bat Ray | Myliobatis californica | 56 | 22.047
90130 | Eel | California Moray | Gymnothorax mordax | 150 | 59.055
90140 | Cod | Lingcod | Ophiodon elongatus | 150 | 59.055
(The table also has Notes and Graphic columns; no values for them appear in this transcript.)
2004.02.05 - SLIDE 26
BIOSITE -- linking relation
Species No | Site No
90010 | 2001
90010 | 2002
90010 | 2003
90010 | 2004
90010 | 2005
90010 | 6001
90010 | 6003
90010 | 6004
90010 | 6005
90020 | 2001
90020 | 2002
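A linking relation such as BIOSITE resolves a many-to-many relationship into two one-to-many relationships. The sketch below uses Python's sqlite3 module with a few of the rows shown above; the column names without spaces are an assumption, and only the SITE side is loaded because species 90010 does not appear in the BIOLIFE excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SITE (SiteNo INTEGER PRIMARY KEY, SiteName TEXT);
    -- Each BIOSITE row pairs one species with one site; the pair is the key.
    CREATE TABLE BIOSITE (SpeciesNo INTEGER,
                          SiteNo INTEGER REFERENCES SITE(SiteNo),
                          PRIMARY KEY (SpeciesNo, SiteNo));
""")
conn.executemany("INSERT INTO SITE VALUES (?, ?)",
                 [(2001, "Heron Island"), (2002, "Cod Hole")])
conn.executemany("INSERT INTO BIOSITE VALUES (?, ?)",
                 [(90010, 2001), (90010, 2002), (90020, 2001), (90020, 2002)])

# One species is found at many sites ...
sites_for_species = [row[0] for row in conn.execute(
    "SELECT s.SiteName FROM BIOSITE AS b JOIN SITE AS s ON b.SiteNo = s.SiteNo "
    "WHERE b.SpeciesNo = ? ORDER BY s.SiteNo", (90020,))]

# ... and one site hosts many species.
species_at_site = [row[0] for row in conn.execute(
    "SELECT SpeciesNo FROM BIOSITE WHERE SiteNo = ? ORDER BY SpeciesNo", (2001,))]

print(sites_for_species, species_at_site)
```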
2004.02.05 - SLIDE 27
Shipwrecks = SHIPWRK
Ship Name | Site No | Category | Type | Interest | Tonnage | Length (ft) | Length (m) | Beam (ft) | Beam (m)
Delaware | 7007 | Commercial | Steam Freighter | Treasure | 1646 | 252 | 76.8096 | 37 | 11.2776
F.S.Loop | 4004 | Commercial | Steam Schooner | Machinery | 794 | 193 | 58.8264 | 39 | 11.8872
Gosford | 4001 | Commercial | Barque Rigged Sail | Fixture | 2250 | 280 | 85.344 | 42 | 12.8016
Great Isaac | 7002 | Commercial | Seagoing Tug | Fixture | 1117 | 185 | 56.388 | 37 | 11.2776
Lizzie D | 7001 | Commercial | Tug/Rumrunner | Treasure | 122 | 84 | 25.6032 | 21 | 6.4008
Mohawk | 7004 | Passenger | Ocean Liner | Treasure | 8140 | 402 | 122.5296 | 54 | 16.4592
R.P. Resor | 7006 | Commercial | Oil Tanker | Treasure | 7450 | 435 | 132.588 | 66.8 | 20.36064
Star of Scotland | 4002 | Passenger | British Q-Boat | Treasure | 1250 | 263 | 80.1624 | 35 | 10.668
Tolten | 7008 | Commercial | Freighter | Fixture | 1858 | 280 | 85.344 | 43 | 13.1064
USS Moody | 4006 | Military | WWI Destroyer | Treasure | 1308 | 314 | 95.7072 | 31 | 9.4488
Valiant | 4003 | Passenger | Luxury Motor Yacht | Treasure | 444 | 162.4 | 49.49952 | 26 | 7.9248

SHIPWRK (remaining columns, same row order)
Ship Name | Cause | Date Sunk | Passengers/Crew | Survivors | Condition
Delaware | Fire | | 66 | 66 | Broken
F.S.Loop | Deliberate | 1/1/47 | 0 | | Scattered
Gosford | Fire | | | | Intact
Great Isaac | Collision | 4/16/47 | 27 | 27 | Intact
Lizzie D | Unknown | 10/19/22 | 8 | 0 | Intact
Mohawk | Collision | 1/25/35 | 163 | 118 | Scattered
R.P. Resor | Military | 2/28/42 | 50 | 2 | Broken
Star of Scotland | Weather | 1/22/42 | 5 | 4 | Broken
Tolten | Military | 3/13/42 | 28 | 1 | Intact
USS Moody | Deliberate | 1/1/33 | 0 | | Intact
Valiant | Fire | 12/17/30 | 25 | 25 | Intact
(The table also has Comments and Graph columns; no values for them appear in this transcript.)
2004.02.05 - SLIDE 28
Mapping to Other Models
• Hierarchical
– Need to make decisions about access paths
• Network
– Need to pre-specify all of the links and sets
• Object-Oriented
– What are the objects, datatypes, their
methods and the access points for them
• Object-Relational
– Same as relational, but what new datatypes
might be needed or useful (more on OR later)
2004.02.05 - SLIDE 29
Lecture Outline
• Review
– Database Design cont. Object-Oriented
Modeling
• Logical Design for the Diveshop
database
• Normalization
• Access Database Creation
2004.02.05 - SLIDE 30
Normalization
• Normalization theory is based on the
observation that relations with certain
properties are more effective in inserting,
updating and deleting data than other sets
of relations containing the same data
• Normalization is a multi-step process
beginning with an “unnormalized” relation
– Hospital example from Atre, S. Data Base:
Structured Techniques for Design,
Performance, and Management.
2004.02.05 - SLIDE 31
Normal Forms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
2004.02.05 - SLIDE 32
Normalization
[Diagram: successive normal forms and the condition each one adds]
• First Normal Form: functional dependency of nonkey attributes on the primary key; atomic values only
• Second Normal Form: full functional dependency of nonkey attributes on the primary key
• Third Normal Form: no transitive dependency between nonkey attributes
• Boyce-Codd and Higher: all determinants are candidate keys; single multivalued dependency
2004.02.05 - SLIDE 33
Unnormalized Relations
• First step in normalization is to convert the
data into a two-dimensional table
• In unnormalized relations data can repeat
within a column
2004.02.05 - SLIDE 34
Unnormalized Relation
(Multiple values within one cell are separated by semicolons.)
Patient # | Surgeon # | Surg. date | Patient Name | Patient Addr | Surgeon | Surgery | Postop drug | Drug side effects
1111 | 145; 311 | Jan 1, 1995; June 12, 1995 | John White | 15 New St. New York, NY | Beth Little; Michael Diamond | Gallstones removal; Kidney stones removal | Penicillin; none | rash; none
1234 | 243; 467 | Apr 5, 1994; May 10, 1995 | Mary Jones | 10 Main St. Rye, NY | Charles Field; Patricia Gold | Eye Cataract removal; Thrombosis removal | Tetracycline; none | Fever; none
2345 | 189 | Jan 8, 1996 | Charles Brown | Dogwood Lane Harrison, NY | David Rosen | Open Heart Surgery | Cephalosporin | none
4876 | 145 | Nov 5, 1995 | Hal Kane | 55 Boston Post Road, Chester, CN | Beth Little | Cholecystectomy | Demicillin | none
5123 | 145 | May 10, 1995 | Paul Kosher | Blind Brook Mamaroneck, NY | Beth Little | Gallstones Removal | none | none
6845 | 243 | Apr 5, 1994; Dec 15, 1984 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye Cornea Replacement; Eye cataract removal | Tetracycline | Fever
2004.02.05 - SLIDE 35
First Normal Form
• To move to First Normal Form a relation
must contain only atomic values at each
row and column.
– No repeating groups
– A column or set of columns is called a
Candidate Key when its values can uniquely
identify the row in the relation.
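Removing repeating groups can be sketched in a few lines of Python. The data below is a two-patient excerpt of the hospital example; representing multi-valued cells as lists, and the function name to_first_normal_form, are assumptions made for this illustration.

```python
# Two rows of the unnormalized hospital relation; multi-valued cells are lists.
unnormalized = [
    {"patient": 1111, "name": "John White",
     "surgeons": [145, 311], "dates": ["01-Jan-95", "12-Jun-95"]},
    {"patient": 2345, "name": "Charles Brown",
     "surgeons": [189], "dates": ["08-Jan-96"]},
]

def to_first_normal_form(rows):
    """Emit one row per (patient, surgeon, date) so that every cell is atomic."""
    flat = []
    for row in rows:
        for surgeon, date in zip(row["surgeons"], row["dates"]):
            flat.append({"patient": row["patient"], "name": row["name"],
                         "surgeon": surgeon, "date": date})
    return flat

first_nf = to_first_normal_form(unnormalized)
for r in first_nf:
    print(r)
```

The patient's name is now repeated on every surgery row, which is the redundancy that the later normal forms remove.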
2004.02.05 - SLIDE 36
First Normal Form
Patient # | Surgeon # | Surgery Date | Patient Name | Patient Addr | Surgeon Name | Surgery | Drug admin | Side Effects
1111 | 145 | 01-Jan-95 | John White | 15 New St. New York, NY | Beth Little | Gallstones removal | Penicillin | rash
1111 | 311 | 12-Jun-95 | John White | 15 New St. New York, NY | Michael Diamond | Kidney stones removal | none | none
1234 | 243 | 05-Apr-94 | Mary Jones | 10 Main St. Rye, NY | Charles Field | Eye Cataract removal | Tetracycline | Fever
1234 | 467 | 10-May-95 | Mary Jones | 10 Main St. Rye, NY | Patricia Gold | Thrombosis removal | none | none
2345 | 189 | 08-Jan-96 | Charles Brown | Dogwood Lane Harrison, NY | David Rosen | Open Heart Surgery | Cephalosporin | none
4876 | 145 | 05-Nov-95 | Hal Kane | 55 Boston Post Road, Chester, CN | Beth Little | Cholecystectomy | Demicillin | none
5123 | 145 | 10-May-95 | Paul Kosher | Blind Brook Mamaroneck, NY | Beth Little | Gallstones Removal | none | none
6845 | 243 | 05-Apr-94 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye Cornea Replacement | Tetracycline | Fever
6845 | 243 | 15-Dec-84 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye cataract removal | none | none
2004.02.05 - SLIDE 37
1NF Storage Anomalies
• Insertion: A new patient has not yet undergone
surgery -- hence no surgeon # -- Since surgeon
# is part of the key we can’t insert.
• Insertion: If a surgeon is newly hired and hasn’t
operated yet -- there will be no way to include
that person in the database.
• Update: If a patient comes in for a new
procedure, and has moved, we need to change
multiple address entries.
• Deletion (type 1): Deleting a patient record may
also delete all info about a surgeon.
• Deletion (type 2): When there are functional
dependencies (like side effects and drug)
changing one item eliminates other information.
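Two of these anomalies can be reproduced with Python's sqlite3 module. This is a sketch under stated assumptions: the key is taken to be (Patient #, Surgeon #, Surgery Date), the table name SURGERY_1NF and the column names are invented for the example, and patient 7000 and the new address are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite does not imply it for the columns
# of an ordinary composite primary key.
conn.execute("""
    CREATE TABLE SURGERY_1NF (
        PatientNo   INTEGER NOT NULL,
        SurgeonNo   INTEGER NOT NULL,
        SurgeryDate TEXT    NOT NULL,
        PatientName TEXT,
        PatientAddr TEXT,
        PRIMARY KEY (PatientNo, SurgeonNo, SurgeryDate)
    )
""")
conn.executemany("INSERT INTO SURGERY_1NF VALUES (?, ?, ?, ?, ?)", [
    (1111, 145, "01-Jan-95", "John White", "15 New St. New York, NY"),
    (1111, 311, "12-Jun-95", "John White", "15 New St. New York, NY"),
])

# Insertion anomaly: a patient without surgery has no surgeon # for the key.
try:
    conn.execute("INSERT INTO SURGERY_1NF VALUES (?, NULL, NULL, ?, ?)",
                 (7000, "New Patient", "Unknown"))  # hypothetical patient
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Update anomaly: one change of address must touch every surgery row.
rows_changed = conn.execute(
    "UPDATE SURGERY_1NF SET PatientAddr = ? WHERE PatientNo = ?",
    ("1 Hypothetical Ave.", 1111)).rowcount

print(insert_rejected, rows_changed)
```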
2004.02.05 - SLIDE 38
Second Normal Form
• A relation is said to be in Second Normal
Form when every nonkey attribute is fully
functionally dependent on the primary key.
– That is, every nonkey attribute needs the full
primary key for unique identification
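A partial dependency can be spotted by testing functional dependencies against sample rows. The helper below (determines is a name chosen for this sketch) only checks that the data is consistent with a dependency; it cannot prove that the dependency holds in general. The rows are an excerpt of the 1NF hospital table, with the key assumed to be (patient, surgeon, date).

```python
def determines(rows, lhs, rhs):
    """Return True if equal values on lhs always go with equal values on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen, then returns the stored value.
        if seen.setdefault(key, val) != val:
            return False
    return True

rows = [
    {"patient": 1111, "surgeon": 145, "date": "01-Jan-95",
     "patient_name": "John White", "surgery": "Gallstones removal"},
    {"patient": 1111, "surgeon": 311, "date": "12-Jun-95",
     "patient_name": "John White", "surgery": "Kidney stones removal"},
    {"patient": 1234, "surgeon": 243, "date": "05-Apr-94",
     "patient_name": "Mary Jones", "surgery": "Eye Cataract removal"},
]

# Patient name depends on only part of the key, so the 1NF table is not in 2NF.
partial = determines(rows, ["patient"], ["patient_name"])
# The surgery performed is not fixed by the patient alone.
needs_more_than_patient = not determines(rows, ["patient"], ["surgery"])

print(partial, needs_more_than_patient)
```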
2004.02.05 - SLIDE 39
Second Normal Form
Patient # | Patient Name | Patient Address
1111 | John White | 15 New St. New York, NY
1234 | Mary Jones | 10 Main St. Rye, NY
2345 | Charles Brown | Dogwood Lane Harrison, NY
4876 | Hal Kane | 55 Boston Post Road, Chester,
5123 | Paul Kosher | Blind Brook Mamaroneck, NY
6845 | Ann Hood | Hilton Road Larchmont, NY
2004.02.05 - SLIDE 40
Second Normal Form
Surgeon # | Surgeon Name
145 | Beth Little
189 | David Rosen
243 | Charles Field
311 | Michael Diamond
467 | Patricia Gold
2004.02.05 - SLIDE 41
Second Normal Form
Patient # | Surgeon # | Surgery Date | Surgery | Drug Admin | Side Effects
1111 | 145 | 01-Jan-95 | Gallstones removal | Penicillin | rash
1111 | 311 | 12-Jun-95 | Kidney stones removal | none | none
1234 | 243 | 05-Apr-94 | Eye Cataract removal | Tetracycline | Fever
1234 | 467 | 10-May-95 | Thrombosis removal | none | none
2345 | 189 | 08-Jan-96 | Open Heart Surgery | Cephalosporin | none
4876 | 145 | 05-Nov-95 | Cholecystectomy | Demicillin | none
5123 | 145 | 10-May-95 | Gallstones Removal | none | none
6845 | 243 | 15-Dec-84 | Eye cataract removal | none | none
6845 | 243 | 05-Apr-94 | Eye Cornea Replacement | Tetracycline | Fever
2004.02.05 - SLIDE 42
1NF Storage Anomalies Removed
• Insertion: Can now enter new patients without
surgery.
• Insertion: Can now enter Surgeons who haven’t
operated.
• Deletion (type 1): If Charles Brown dies the
corresponding tuples from Patient and Surgery
tables can be deleted without losing information
on David Rosen.
• Update: If John White comes in for third time,
and has moved, we only need to change the
Patient table
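The move from the single 1NF table to the Patient, Surgeon, and Surgery tables is a set of relational projections. A minimal sketch in Python follows; the helper name project and the attribute names are assumptions, and only three rows of the hospital example are used.

```python
def project(rows, attrs):
    """Relational projection: keep attrs, drop duplicate rows, preserve order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

first_nf = [
    {"patient": 1111, "surgeon": 145, "date": "01-Jan-95",
     "patient_name": "John White", "surgeon_name": "Beth Little"},
    {"patient": 1111, "surgeon": 311, "date": "12-Jun-95",
     "patient_name": "John White", "surgeon_name": "Michael Diamond"},
    {"patient": 4876, "surgeon": 145, "date": "05-Nov-95",
     "patient_name": "Hal Kane", "surgeon_name": "Beth Little"},
]

patients = project(first_nf, ["patient", "patient_name"])    # one row per patient
surgeons = project(first_nf, ["surgeon", "surgeon_name"])    # one row per surgeon
surgeries = project(first_nf, ["patient", "surgeon", "date"])  # one row per surgery

print(len(patients), len(surgeons), len(surgeries))
```

John White and Beth Little each appear once in their own table, so a change to either is made in exactly one place.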
2004.02.05 - SLIDE 43
2NF Storage Anomalies
• Insertion: Cannot enter the fact that a particular
drug has a particular side effect unless it is given
to a patient.
• Deletion: If John White receives some other drug
because of the penicillin rash, and a new drug
and side effect are entered, we lose the
information that penicillin can cause a rash
• Update: If drug side effects change (a new
formula) we have to update multiple occurrences
of side effects.
2004.02.05 - SLIDE 44
Third Normal Form
• A relation is said to be in Third Normal Form if
there is no transitive functional dependency
between nonkey attributes
– When one nonkey attribute can be determined with
one or more nonkey attributes there is said to be a
transitive functional dependency.
• The side effect column in the Surgery table is
determined by the drug administered
– Side effect is transitively functionally dependent on
drug so Surgery is not 3NF
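Removing the transitive dependency amounts to moving the drug and side effect pairs into their own relation. The sketch below does this with Python's sqlite3 module; the table names SURGERY_2NF, SURGERY_3NF, and DRUG and the column names are assumptions, and only four surgery rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SURGERY_2NF "
             "(PatientNo, SurgeonNo, SurgeryDate, Drug, SideEffect)")
conn.executemany("INSERT INTO SURGERY_2NF VALUES (?, ?, ?, ?, ?)", [
    (1111, 145, "01-Jan-95", "Penicillin", "rash"),
    (1234, 243, "05-Apr-94", "Tetracycline", "Fever"),
    (6845, 243, "05-Apr-94", "Tetracycline", "Fever"),
    (2345, 189, "08-Jan-96", "Cephalosporin", "none"),
])

conn.executescript("""
    -- Side effect depends on the drug, so each pair is stored once.
    CREATE TABLE DRUG AS
        SELECT DISTINCT Drug, SideEffect FROM SURGERY_2NF;
    -- The surgery relation keeps only the drug as a reference.
    CREATE TABLE SURGERY_3NF AS
        SELECT PatientNo, SurgeonNo, SurgeryDate, Drug FROM SURGERY_2NF;
""")

drug_rows = conn.execute(
    "SELECT Drug, SideEffect FROM DRUG ORDER BY Drug").fetchall()
print(drug_rows)
```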
2004.02.05 - SLIDE 45
Third Normal Form
Patient # | Surgeon # | Surgery Date | Surgery | Drug Admin
1111 | 145 | 01-Jan-95 | Gallstones removal | Penicillin
1111 | 311 | 12-Jun-95 | Kidney stones removal | none
1234 | 243 | 05-Apr-94 | Eye Cataract removal | Tetracycline
1234 | 467 | 10-May-95 | Thrombosis removal | none
2345 | 189 | 08-Jan-96 | Open Heart Surgery | Cephalosporin
4876 | 145 | 05-Nov-95 | Cholecystectomy | Demicillin
5123 | 145 | 10-May-95 | Gallstones Removal | none
6845 | 243 | 15-Dec-84 | Eye cataract removal | none
6845 | 243 | 05-Apr-94 | Eye Cornea Replacement | Tetracycline
2004.02.05 - SLIDE 46
Third Normal Form
Drug Admin | Side Effects
Cephalosporin | none
Demicillin | none
none | none
Penicillin | rash
Tetracycline | Fever
2004.02.05 - SLIDE 47
2NF Storage Anomalies Removed
• Insertion: We can now enter the fact that a
particular drug has a particular side effect
in the Drug relation.
• Deletion: If John White receives some
other drug as a result of the rash from
penicillin, the information on penicillin
and rash is still maintained.
• Update: The side effects for each drug
appear only once.
2004.02.05 - SLIDE 48
Boyce-Codd Normal Form
• Most 3NF relations are also BCNF
relations.
• A 3NF relation can fail to be in BCNF only if all of the following hold:
– Candidate keys in the relation are composite
keys (they are not single attributes)
– There is more than one candidate key in the
relation, and
– The keys are not disjoint, that is, some
attributes in the keys are common
2004.02.05 - SLIDE 49
Most 3NF Relations are also BCNF – Is this one?
Patient # | Patient Name | Patient Address
1111 | John White | 15 New St. New York, NY
1234 | Mary Jones | 10 Main St. Rye, NY
2345 | Charles Brown | Dogwood Lane Harrison, NY
4876 | Hal Kane | 55 Boston Post Road, Chester,
5123 | Paul Kosher | Blind Brook Mamaroneck, NY
6845 | Ann Hood | Hilton Road Larchmont, NY
2004.02.05 - SLIDE 50
BCNF Relations
Patient # | Patient Name
1111 | John White
1234 | Mary Jones
2345 | Charles Brown
4876 | Hal Kane
5123 | Paul Kosher
6845 | Ann Hood

Patient # | Patient Address
1111 | 15 New St. New York, NY
1234 | 10 Main St. Rye, NY
2345 | Dogwood Lane Harrison, NY
4876 | 55 Boston Post Road, Chester,
5123 | Blind Brook Mamaroneck, NY
6845 | Hilton Road Larchmont, NY
2004.02.05 - SLIDE 51
Fourth Normal Form
• Any relation is in Fourth Normal Form if it
is BCNF and any multivalued
dependencies are trivial
• Eliminate non-trivial multivalued
dependencies by projecting into simpler
tables
2004.02.05 - SLIDE 52
Fifth Normal Form
• A relation is in 5NF if every join
dependency in the relation is implied by
the keys of the relation
• Implies that relations that have been
decomposed in previous NF can be
recombined via natural joins to recreate
the original relation.
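The recombination property can be illustrated with a small natural join written in Python. The function natural_join and the attribute names are assumptions for this sketch, and the data is a two-row excerpt of the Surgery and Drug relations from the hospital example.

```python
def natural_join(left, right):
    """Join two lists of dicts on every attribute name they have in common."""
    out = []
    for lrow in left:
        for rrow in right:
            common = set(lrow) & set(rrow)
            if all(lrow[a] == rrow[a] for a in common):
                out.append({**lrow, **rrow})
    return out

# The two projections produced by the 3NF decomposition.
surgery = [{"patient": 1111, "surgeon": 145, "drug": "Penicillin"},
           {"patient": 1234, "surgeon": 243, "drug": "Tetracycline"}]
drug = [{"drug": "Penicillin", "side_effect": "rash"},
        {"drug": "Tetracycline", "side_effect": "Fever"}]

# The relation before decomposition.
original = [
    {"patient": 1111, "surgeon": 145, "drug": "Penicillin", "side_effect": "rash"},
    {"patient": 1234, "surgeon": 243, "drug": "Tetracycline", "side_effect": "Fever"},
]

rejoined = natural_join(surgery, drug)
print(rejoined == original)
```

The join yields exactly the original rows, with none lost and no spurious rows added, which is what a lossless decomposition requires.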
2004.02.05 - SLIDE 53
Effectiveness and Efficiency Issues for
DBMS
• Focus on the relational model
• Any column in a relational database can
be searched for values.
• To improve efficiency indexes using
storage structures such as BTrees and
Hashing are used
• But many useful functions are not
indexable and require complete scans of
the database
2004.02.05 - SLIDE 54
Example: Text Fields
• In a conventional RDBMS, when a text field
is indexed, the index supports only exact
matching of the text field contents (or
greater-than and less-than comparisons).
– Can search for individual words using pattern
matching, but a full scan is required.
• Text searching is still done best (and
fastest) by specialized text search
programs (Search Engines) that we will
look at more later.
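The difference between an indexed exact match and a pattern match can be observed in SQLite through Python's sqlite3 module. This is a sketch, not a statement about Access: the index name idx_note is invented, the order numbers attached to the two line notes are illustrative, and the wording of the plan text varies between SQLite versions, so only its first word is relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DIVEITEM (OrderNo INTEGER, LineNote TEXT)")
conn.execute("CREATE INDEX idx_note ON DIVEITEM (LineNote)")
conn.executemany("INSERT INTO DIVEITEM VALUES (?, ?)", [
    (307, "This is our most popular mask."),    # order numbers are illustrative
    (307, "These are our best selling fins."),
])

def plan(sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

# Exact match on the indexed text field: the index can be searched.
exact = plan("SELECT OrderNo FROM DIVEITEM WHERE LineNote = ?",
             ("This is our most popular mask.",))

# Word inside the text: the pattern starts with a wildcard, so every row is scanned.
word = plan("SELECT OrderNo FROM DIVEITEM WHERE LineNote LIKE ?", ("%mask%",))

print(exact)
print(word)
```

The first plan begins with SEARCH and names idx_note; the second begins with SCAN, which is the full scan described above.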
2004.02.05 - SLIDE 55
Normalizing to death
• Normalization splits database information
across multiple tables.
• To retrieve complete information from a
normalized database, the JOIN operation
must be used.
• JOIN tends to be expensive in terms of
processing time, and very large joins are
very expensive.
2004.02.05 - SLIDE 56
Advantages of RDBMS
• Possible to design complex data storage
and retrieval systems with ease (and
without conventional programming).
• Support for ACID transactions
– Atomic
– Consistent
– Independent
– Durable
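The atomic property can be demonstrated with a transaction that fails halfway. The sketch uses Python's sqlite3 module; the CHECK constraint on OnHand and the quantities subtracted are assumptions made for the example, while the two item numbers and their stock levels come from DIVESTOK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DIVESTOK "
             "(ItemNo INTEGER PRIMARY KEY, OnHand INTEGER CHECK (OnHand >= 0))")
conn.execute("INSERT INTO DIVESTOK VALUES (90010, 12), (90011, 12)")
conn.commit()

try:
    with conn:  # commits if the block succeeds, rolls back if an exception escapes
        conn.execute("UPDATE DIVESTOK SET OnHand = OnHand - 4 WHERE ItemNo = 90010")
        # This update would leave a negative stock level and violates the CHECK.
        conn.execute("UPDATE DIVESTOK SET OnHand = OnHand - 20 WHERE ItemNo = 90011")
except sqlite3.IntegrityError:
    pass

# Both updates were undone together, including the one that had succeeded.
on_hand = dict(conn.execute("SELECT ItemNo, OnHand FROM DIVESTOK"))
print(on_hand)
```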
2004.02.05 - SLIDE 57
Advantages of RDBMS
• Support for very large databases
• Automatic optimization of searching (when
possible)
• RDBMS have a simple view of the
database that conforms to much of the
data used in businesses.
• Standard query language (SQL)
2004.02.05 - SLIDE 58
Disadvantages of RDBMS
• Until recently, no real support for complex
objects such as documents, video,
images, spatial or time-series data.
(ORDBMS add support for these).
• Often poor support for storage of complex
objects from OOP languages
(Disassembling the car to park it in the
garage)
• Still no efficient and effective integrated
support for things like text searching within
fields.
2004.02.05 - SLIDE 59
Lecture Outline
• Review
– Database Design -- Object-Oriented
Modeling
• Logical Design for the Diveshop
database
• Normalization
• Access Database Creation
2004.02.05 - SLIDE 60
Database Creation in Access
• Simplest to use a design view
– wizards are available, but less flexible
• Need to watch the default values
• Helps to know what the primary key is, or
if one is to be created automatically
– Automatic creation is more complex in other
RDBMS and ORDBMS
• Need to make decision about the physical
storage of the data
2004.02.05 - SLIDE 61
Database Creation in Access
• Some Simple Examples
2004.02.05 - SLIDE 62
Next Week
• More Database Design
• Expanding and redesigning DiveShop
2004.02.05 - SLIDE 63