Normal Form Decompositions
Download
Report
Transcript Normal Form Decompositions
Multi-valued Dependencies
and Fourth Normal Form
COSC 6340
Topics Covered
1.
2.
3.
Definition of Multivalued Dependencies
Reasoning about MVDs
Fourth Normal Form
Ch. Eick: 4thNF and MVD's
2
Motivation
There are schemas that are in BCNF that do
not seem to be sufficiently normalized
Stars
name
street
city
title
year
C. Fisher 123 Maple Str. Hollywood
Star Wars
1977
C. Fisher 5 Locust Ln.
Star Wars
1977
Malibu
C. Fisher 123 Maple Str. Hollywood Empire Strikes Back
1980
C. Fisher 5 Locust Ln.
Empire Strikes Back
1980
C. Fisher 123 Maple Str. Hollywood
Return of the Jedi
1983
C. Fisher 5 Locust Ln.
Return of the Jedi
1983
Ch. Eick: 4thNF and MVD's
Malibu
Malibu
3
Attribute Independence
No reason to associate address with one
movie and not another
When we repeat address and movie facts in
all combinations, there is obvious redundancy
However, NO BCNF violation in Stars relation
There are no non-trivial FD’s at all, all five
attributes form the only superkey
Why?
Ch. Eick: 4thNF and MVD's
4
Multi-valued Dependency
Definition: Multivalued dependency (MVD):
A1A2…An B1B2…Bm holds for relation R if:
For all tuples t, u in R
If t[A1A2...An] = u[A1A2...An], then there exists a v in R
such that:
(1) v[A1A2...An] = t[A1A2...An] = u[A1A2...An]
(2) v[B1B2…Bm] = t[B1B2…Bm]
(3) v[C1C2…Ck] = u[C1C2…Ck], where C1C2…Ck is all
attributes in R except (A1A2...An B1B2…Bm)
Ch. Eick: 4thNF and MVD's
5
Pictorially Speaking...
A’s
B’s
Others
t
v
u
w
An MVD guarantees v exists
The existence of a fourth tuple w is implied by
interchanging t and u
Ch. Eick: 4thNF and MVD's
6
Example: name street city
Stars
name
t
street
city
title
year
C. Fisher 123 Maple Str. Hollywood
Star Wars
1977
C. Fisher 5 Locust Ln.
Star Wars
1977
Malibu
v
C. Fisher 123 Maple Str. Hollywood Empire Strikes Back
1980
u
C. Fisher 5 Locust Ln.
Empire Strikes Back
1980
C. Fisher 123 Maple Str. Hollywood
Return of the Jedi
1983
C. Fisher 5 Locust Ln.
Return of the Jedi
1983
Ch. Eick: 4thNF and MVD's
Malibu
Malibu
7
Example: name street city
Stars
name
street
city
title
year
u
C. Fisher 123 Maple Str. Hollywood
Star Wars
1977
w
v
C. Fisher 5 Locust Ln.
Star Wars
1977
C. Fisher 123 Maple Str. Hollywood Empire Strikes Back
1980
t
C. Fisher 5 Locust Ln.
Empire Strikes Back
1980
C. Fisher 123 Maple Str. Hollywood
Return of the Jedi
1983
C. Fisher 5 Locust Ln.
Return of the Jedi
1983
Ch. Eick: 4thNF and MVD's
Malibu
Malibu
Malibu
8
More on MVDs
Intuitively, A1A2…An B1B2…Bm says that the
relationship between A1A2…An and B1B2…Bm is
independent of the relationship between A1A2…An and R
-{B1B2…Bm}
Functional dependencies rule out certain tuples from
being in a relation
MVD's uncover situations where independent facts related to a
certain object are being squished together in one relation
How?
Multivalued dependencies require that other tuples of a
certain form be present in the relation
a.k.a. tuple-generating dependencies
Ch. Eick: 4thNF and MVD's
9
Let’s Illustrate
In Stars, we must repeat the movie (title, year) once
for each address (street, city) a movie star has
Alternatively, we must repeat the address for each movie a star
has made
Example: Stars with name street city
name
street
city
C. Fisher 123 Maple Str. Hollywood
C. Fisher 5 Locust Ln.
Malibu
C. Fisher 123 Maple Str. Hollywood
title
year
Star Wars
1977
Empire Strikes Back
1980
Return of the Jedi
1983
Is an incomplete extent of Stars
Infer the existence of a fourth tuple under the given MVD
Ch. Eick: 4thNF and MVD's
10
Trivial MVDs
Trivial MVD
A1A2…An B1B2…Bm where B1B2…Bm
is a subset of A1A2…An or (A1A2…An
B1B2…Bm ) contains all attributes of R
Ch. Eick: 4thNF and MVD's
11
Reasoning About MVDs
FD-IS-AN-MVD Rule (Replication)
If A1A2…An B1B2…Bm then
A1A2…An B1B2…Bm holds
Ch. Eick: 4thNF and MVD's
12
Reasoning About MVDs
COMPLEMENTATION Rule
If A1A2…An B1B2…Bm then A1A2…An C1C2…Ck
where C1C2…Ck is all attributes in R except (A1A2…An
B1B2…Bm )
AUGMENTATION Rule
If XY and WZ then WX YZ
TRANSITIVITY Rule
If XY and YZ then X (Z-Y)
Ch. Eick: 4thNF and MVD's
13
Coalescence Rule for MVD
X Y
If:
W:W Z
Then: X Z
Remark: Y and W have to be disjoint and Z has to be a subset of or
equal to Y
Ch. Eick: 4thNF and MVD's
14
Definition 4NF
Given: relation R and set of MVD's for R
Definition: R is in 4NF with respect to its
MVD's if for every non-trivial MVD
A1A2…AnB1B2…Bm , A1A2…An is a
superkey
Note: Since every FD is also an MVD, 4NF
implies BCNF
Example: Stars is not in 4NF
Ch. Eick: 4thNF and MVD's
15
Decomposition Algorithm
(1) apply closure to the user-specified FD's and MVD's**:
(2) repeat until no more 4NF violations:
if R with AA ->> BB violates 4NF then:
(2a) decompose R into R1(AA,BB) and
R2(AA,CC), where CC is all
attributes in R except (AA BB)
(2b) assign FD's and MVD's to the new relations**
** MVD's: hard problem!
No simple test analogous to computing the attribute closure for
FD’s exists for MVD’s. You are stuck to have to use the 5
inference rules for MVD’s when computing the closure!
Ch. Eick: 4thNF and MVD's
16
Exercise
Decompose Stars into a set of relations
that are in 4NF.
namestreet city is a 4NF
violation
Apply decomposition:
R(name, street, city)
S(name, title, year)
What about namestreet city in R
and nametitle year in S?
Ch. Eick: 4thNF and MVD's
17
Exercise
For the relation R(A,B,C,D) with only
MVD’s AB and AC find all 4NF
violations and decompose R into a
collection of relation schemas in 4NF.
Ch. Eick: 4thNF and MVD's
18
Solution
Since there are no functional dependencies, the
only key is all four attributes, ABCD.
Thus, each of the nontrivial multivalued
dependencies A->->B and A->->C violate 4NF.
Separate out the attributes of these
dependencies, first decomposing into AB and
ACD
Then decompose the latter into AC and AD
because A->->C is still a 4NF violation for ACD.
The final set of relations are AB, AC, and AD.
Ch. Eick: 4thNF and MVD's
19
Exercise
Suppose we have relation R(A,B,C) with
MVD AB. If we know that the
tuples (a,b1,c1), (a,b2,c2), and (a,b3,c3)
are in the current instance of R, what
other tuples do we know must also be
in R?
Ch. Eick: 4thNF and MVD's
20
Solution
Since A->->B, and all the tuples have the same
value for attribute A, we can pair the B-value
from any tuple with the value of the remaining
attribute C from any other tuple.
Thus, we know that R must have at least the
nine tuples of the form (a,b,c), where b is any
of b1, b2, or b3, and c is any of c1, c2, or c3.
That is, we can derive, using the definition of a
multivalued dependency, that each of the tuples
(a,b1,c2), (a,b1,c3), (a,b2,c1), (a,b2,c3),
(a,b3,c1), and (a,b3,c2) are also in R.
Ch. Eick: 4thNF and MVD's
21
Another View of 4NF
True MVD’s
MVD’s that are also
FD’s
Trivial
‘s
MVD’s
XY and XY
XY and not XY
Ch. Eick: 4thNF and MVD's
XY
True MVD XY:= non-trivial &
XY does not hold
Remark: If XY is a
true MVD then X cannot
be a superkey (because
XY does not hold);
Therefore, true MVD’s
always violate 4NF (“true
MVD’s are always bad)
4NF:= Relation is in
BCNF and there are
no true MVD’s (yellow
part is empty)
22
H1-2005-Problem8
8) Normalization [6] graded
R(A,B,C,D,E,F) is given with: (1) ABCD (2)CDAB (3)ABF (4) FE
a)
What are the candidate keys of relation R? [1]
b)
b) Transform R into a relational schema that is in BCNF and does not
have any lost functional dependencies! [5]
Correct Solution:
a)
Candidate keys: AB and CD
b)
Decompose R into R1(A,B,C,D,F) with local FD’s (1), (2), (3) and
R2(E,F) with local FD’s (4) Due to the fact that all four dependencies are
still present no functional dependency has been lost. Moreover, all
functional depencies are good
A non-optimal (“too many relations”) solution I also saw was: Decompose R
into R1(A,B,C,D) with local functional dependencies ABCD and
CDAB, R2(A,B,F) with local functional dependencies ABF and
R3(F,E) with local functional dependencies FA..
Ch. Eick: 4thNF and MVD's
23
Problem 1; H1-2004
a.
b.
c.
Candidate keys are: {a,b}, {a,d}, {a,e}
14 superkeys total
All but the first functional dependency
are bad
Ch. Eick: 4thNF and MVD's
24
Problem 2; H2-04
a.
b.
c.
d.
e.
f.
No; EBC is a “true” multi-valued dependency and E is not a
candidate key (as a matter of fact {E}+={A,D,E,F} see below)
No (but just mentioning neither E ABC nor E CF holds is not
sufficient (e.g. if EABC holds then the decomposition is
lossless!) ) --- a counter examples should be given to show that
the statement is false!
Yes C is candidate key; therefore CBDEF; therefore CBDEF
Yes E BC and BC BCD implies ED due to MVDtransitivity (CCDBCBCD BCBCD)
Yes EBC; therefore EADF; moreover, CADF and using
the Coalescence Rule we obtain EADF; therefore, EA holds
No R is not in BCNF because EADF holds and E is not a candidate
key.
Ch. Eick: 4thNF and MVD's
25
Problem 3a-2004
From AB and AC we can infer:
ABC??
(1) AC AAC
(2) AB ACABC
(3) ACABC ACDE
(4) AAC, ACDE ADE Wrong!!
(5) ADE ABC
Using: 1. Augmentation, 2.Augmentation, 3.Complementation, 4.Transitivity,
5. Complementation
Remark: This problem will be revised in Homework3-2005; it is too complicated to
worry about it for the midterm exam!
Ch. Eick: 4thNF and MVD's
26
MDV’s and FD’s --- Ungraded
Homework
Assume we have a relation R(A,B,C,D,E) with the following dependencies:
(1) AB CDE
(2) CD ABE
(3) E DB
Answer the following questions giving reasons for your answers:
a) Is R in BCNF? ????? (answer after Spring break) Warning: The
presence of the MVD might imply other functional dependencies (see
textbook page 637)
b)
c)
d)
Does ABE D hold for R? yes
Does CD B hold for R? yes
Does E D always hold for R (either show that this dependency
can be inferred from the given 3 dependencies, or give a counter
example of a relation that satisfies (1), (2), (3) but violates ED)?
No
Ch. Eick: 4thNF and MVD's
27