No Slide Title

Download Report

Transcript No Slide Title

Integrity Constraints
Review
Three things managed by a DBMS
1. Data organization


E/R Model
Relational Model
2. Data Retrieval


Relational Algebra
Relational Calculus

SQL
3. Data Integrity and Database Design


Integrity Constraints
Functional Dependencies

Normalization
Integrity Constraints
Purpose: prevent semantic inconsistencies in data
e.g.:
e.g.:
cname
John
cname
Jones
Turner
Smith
svngs
100
bname
Kenmore
Main St
Union
4 kinds of IC’s:
1. Key Constraints
2. Attribute Constraints
3. Referential Integrity Constraints
4. Global Constraints
check
200
total
250
bname
Main St
Union
Union
bcity
Boston
NY
NY
No entry for Kenmore...
???
IC’s
What are they?
 predicates on the database
 must always be true (:, checked whenever db gets
updated)
There are the following 4 types of IC’s:
Key constraints (1 table)
e.g., 2 accts can’t share the same acct_no
Attribute constraints (1 table)
e.g., 2 accts must have nonnegative balance
Referential Integrity constraints ( 2 tables)
E.g. bnames associated w/ loans must be names of real branches
Key Constraints
Idea: specifies that a relation is a set, not a bag
SQL examples:
1. Primary Key:
CREATE TABLE branch(
bname CHAR(15) PRIMARY KEY,
bcity
CHAR(20),
assets INT);
or
CREATE TABLE depositor(
cname CHAR(15),
acct_no CHAR(5),
PRIMARY KEY(cname, acct_no));
2. Candidate Keys:
CREATE TABLE customer (
ssn CHAR(9) PRIMARY KEY,
cname CHAR(15),
address CHAR(30),
city
CHAR(10),
UNIQUE (cname, address, city);
Key Constraints
Effect of SQL Key declarations
PRIMARY (A1, A2, .., An) or
UNIQUE (A1, A2, ..., An)
Insertions: check if any tuple has same values for A1, A2, .., An as any
inserted tuple. If found, reject insertion
Updates to any of A1, A2, ..., An: treat as insertion of entire tuple
Primary vs Unique (candidate)
1. 1 primary key per table, several unique keys allowed.
2. Only primary key can be referenced by “foreign key” (ref
integrity)
3. DBMS may treat primary key differently
(e.g.: implicitly create an index on PK)
4. NULL values permitted in UNIQUE keys but not in PRIMARY KEY
Attribute Constraints
Idea:
 Attach constraints to values of attributes
 Enhances types system (e.g.: >= 0 rather than integer)
In SQL:
1. NOT NULL
e.g.: CREATE TABLE branch(
bname CHAR(15) NOT NULL,
....
)
Note: declaring bname as primary key also prevents null values
2. CHECK
e.g.: CREATE TABLE depositor(
....
balance int NOT NULL,
CHECK( balance >= 0),
....
)
affect insertions, update in affected columns
CHECK constraint in Oracle
CHECK cond where cond is:
 Boolean expression evaluated using the values in the row being
inserted or updated, and
 Does not contain subqueries; sequences; the SQL functions SYSDATE,
UID, USER, or USERENV; or the pseudocolumns LEVEL or ROWNUM
Multiple CHECK constraints
 No limit on the number of CHECK constraints you can define on a
column
CREATE TABLE credit_card(
....
balance int NOT NULL,
CHECK( balance >= 0),
CHECK (balance < limit),
....
)
Referential Integrity Constraints
Idea: prevent “dangling tuples” (e.g.: a loan with a bname of
‘Kenmore’ when no Kenmore tuple is not in branch table)
Referencing
Relation
(e.g. loan)
“foreign key”
bname
Referenced
Relation
(e.g. branch)
primary key
bname
Ref Integrity:
ensure that:
foreign key value 
primary key value
(note: need not to ensure , i.e., not all branches have to have loans)
Referential Integrity Constraints
bname
child
Referencing
Relation
(e.g. loan)
x
x
bname
x
Referenced
Relation
(e.g. branch)
In SQL:
CREATE TABLE branch(
bname CHAR(15) PRIMARY KEY
....)
CREATE TABLE loan (
.........
FOREIGN KEY bname REFERENCES branch);
Affects:
1) Insertions, updates of referencing relation
2) Deletions, updates of referenced relation
parent
Referential Integrity Constraints
c
ti
x
tj
x
c
parent
x
child
A
B
what happens when
we try to delete
this tuple?
Ans: Oracle allows the following possibilities
• No action
• RESTRICT: reject deletion/ update
• SET TO NULL: set ti [c], tj[c] = NULL
• SET TO DEFAULT: set ti [c], tj[c] = default_val
• CASCADE: propagate deletion/update
DELETE: delete ti, tj
UPDATE: set ti[c], tj[c] to updated values
Referential Integrity Constraints
c
ti
x
tj
x
Emp
c
x
Dept
what happens when
we try to delete
this tuple?
ALTER TABLE Dept ADD Primary Key (deptno);
ALTER TABLE Emp ADD FOREIGN KEY (Deptno)
REFERENCES Dept(Deptno) [ACTION];
Action:
1) ON DELETE NO ACTION left blank (deletion/update rejected)
2) ON DELETE SET NULL/ ON UPDATE SET NULL
sets ti[c] = NULL, tj[c] = NULL
3) ON DELETE CASCADE
deletes ti, tj
ON UPDATE CASCADE
sets ti[c], tj[c] to new key values
Global Constraints
Idea: two kinds
1) single relation (constraints spans multiple columns)
E.g.: CHECK (total = svngs + check) declared in the CREATE TABLE
Example: All Bkln branches must have assets > 5M
CREATE TABLE branch (
..........
bcity CHAR(15),
assets INT,
CHECK (NOT(bcity = ‘Bkln’) OR assets > 5M))
Affects:
insertions into branch
updates of bcity or assets in branch
2) Multiple Relations: NOT supported in Oracle
Need to be implemented as a Trigger
Global Constraints (NOT in Oracle)
SQL example:
2) Multiple relations: every loan has a borrower with a savings account
CHECK (NOT EXISTS (
SELECT *
FROM
loan AS L
WHERE NOT EXISTS(
SELECT *
FROM borrower B, depositor D, account A
WHERE B.cname = D.cname AND
D.acct_no = A.acct_no AND
L.lno = B.lno)))
Problem: Where to put this constraint? At depositor? Loan? ....
Ans: None of the above:
CREATE ASSERTION loan-constraint
CHECK( ..... )
Checked with EVERY DB update!
very expensive.....
Global Constraints
Issues:
1) How does one decide what global constraint to
impose?
2) How does one minimize the cost of checking the
global constraints?
Ans: Functional dependencies.
but before we go there
Deferring the constraint checking
 SET ALL CONSTRAINTS DEFERRED;
 Defers all constraint checks till the end of the transaction
 Especially useful in enforcing Referential integrity
 Insert new rows into ‘Child’ table but referred key is not yet in Parent
 Insert corresponding row in ‘Parent’ table
 Constraint checking done at the end of the transaction
 Can also defer individual constraint checking by
specifying the constraint name
 Finding the constraint information in Oracle
 SELECT * FROM USER_CONSTRAINTS;
 SELECT * FROM USER_CONS_COLS;
Summary: Integrity Constraints
Constraint Type
Where declared
Affects...
In Oracle ?
Key Constraints
CREATE TABLE
Insertions, Updates
Yes
(PRIMARY KEY, UNIQUE)
CREATE TABLE
Insertions, Updates
Yes
Attribute
Constraints
Referential
Integrity
CREATE DOMAIN
(Not NULL, CHECK)
Table Tag
(FOREIGN KEY ....
REFERENCES ....)
1.Insertions into
referencing rel’n
2. Updates of
referencing rel’n of
relevant attrs
3. Deletions from
referenced rel’n
4. Update of
referenced rel’n
Global Consraints
Table Tag (CHECK)
or
outside table
(CREATE ASSERTION)
1. For single rel’n
constraint, with
insertion, deletion
of relevant attrs
2. For assesrtions
w/ every db
modification
CREATE DOMAIN
not supported in
Oracle
Yes
Possible Actions:
-- Update/delete no
aciton
-- delete CASCADE
-- delete SET NULL,
SET DEFAULT,…
Assertions, domains
not supported in
Oracle.
Review
Three things managed by a DBMS
1. Data organization


E/R Model
Relational Model
2. Data Retrieval


Relational Algebra
Relational Calculus

SQL
3. Data Integrity and Database Design



Integrity Constraints
Functional Dependencies

Constraints that hold for legal instance of the database

Example: Every customer should have a single credit card
Normalization
Functional Dependencies
A
a
a
a
b
b
B





C
1
1
5
3
3
D
U
V
W
W
W
ABC
“ AB determines C”
two tuples with the same values for A and B
will also have the same value for C
Constraints that will hold on all “legal” instances of the database for the
specific business application.
 In most cases, specified by a database designer/business architect
Functional Dependencies
A
a
a
a
b
b
Shorthand:
C  BD
B





C
1
1
5
3
3
D
U
U
W
W
W
same as C  B
CD
Be careful!
AB  C not the same as
AC
BC
Not true
Functional Dependencies
Example:
suppose R = { A, B, C, D, E, H} and we determine that:
F = { A  BC,
B  CE,
A  E,
AC  H,
D  B}
Then we determine the ‘canonical cover’ of F:
Fc = { A  BH,
B  CE,
D  B}
ensuring that F and Fc are equivalent
Note:
F requires 5 assertions
Fc requires 3 assertions
Canonical cover (or minimal cover) algorithm: In the book
(not covered here).
Functional Dependencies
Equivalence of FD sets:
FD sets F and G are equivalent if the imply the same set of FD’s
e.g. A B and B  C : implies A  C
equivalence usually expressed in terms of closures
Closures:
For any FD set, F, F+ is the set of all FD’s implied by F.
can calculate in 2 ways:
(1) Attribute Closure
(2) Armstrong’s axioms
Both techniques tedious-- will do only for toy examples
F equivalent to G iff F+ = G+
Armstrong’s Axioms
A. Fundamental Rules (W, X, Y, Z: sets of attributes)
1. Reflexivity
If Y

X then X  Y
2. Augmentation
If X  Y then
WX  WY
3. Transitivity
If X Y and Y  Z then XZ
B. Additional rules (can be proved from A)
4. UNION:
If X  Y and X  Z then X  YZ
5. Decomposition: If X  YZ then X  Y, X Z
6. Pseudotransitivity: If X  Y and WY  Z then WX Z
2
Proving 4.(sketch): X Y => XXXY =>XXY
XYYZ
For every step we used the rules from A.

3
=> X YZ
FD Closures Using Armstrong’s Axioms
Given;
F = { A  BC,
B  CE,
A  E,
AC  H,
D  B}
(1)
(2)
(3)
(4)
(5)
Exhaustively apply Armstrong’s axioms to generate F+
F+ = F 
1.
2.
3.
4.
5.
{ A  B, A  C}: decomposition on (1)
{ A  CE}: transitivity to 1.1 and (2)
{ B  C, B  E}: decomp to (2)
{ A  C, A  E} decomp to 2
{ A  H} pseudotransitivity to 1.2 and (4)
Attribute Closures
Given;
R = { A, B, C, D, E, H,I} and:
F = { A  BC,
C D,
CE,
AH  I}
What is the closure of A (A+) ?
Attribute closure A
Iteration
Result
----------------------------------0
A
1
ABC
2
AB CD
3
ABCDE
Algorithm att-closure (X: set of Attributes)
Result  X
repeat until stable
for each FD in F, Y  Z, do
if Y
Result then
Result  Result  Z

Better to determine if a set of attributes is a key
Functional dependencies
Our goal:
given a set of FD set, F, find an alternative FD set, G that is:
smaller
equivalent
Bad news:
Testing F=G (F+ = G+) is computationally expensive
Good news:
Canonical Cover algorithm:
given a set of FD, F, finds minimal FD set equivalent to F
Minimal: can’t find another equivalent FD set w/ fewer FD’s
FD so far...
1. Canonical Cover algorithm
• result (Fc) guaranteed to be the minimal FD set equivalent to F
2. Closure Algorithms
a. Armstrong’s Axioms:
more common use: test for extraneous attributes
in C.C. algorithm
b. Attribute closure:
more common use: test for superkeys
3. Purposes
a. minimize the cost of global integrity constraints
so far: min gic’s = |Fc|
In fact.... Min gic’s = 0
(FD’s for “normalization”)
Another use of FD’s: Schema Design
Example:
R=
bname
Downtown
Downtown
Mianus
Downtown
bcity
Bkln
Bkln
Horse
Bkln
assets
9M
9M
1.7M
9M
cname
Jones
Johnson
Jones
Hayes
lno
L-17
L-23
L-93
L-17
amt
1000
2000
500
1000
R: “Universal relation”
tuple meaning: Jones has a loan (L-17) for $1000 taken out at the Downtown
branch in Bkln which has assets of $9M
Design:
+:
-:
fast queries (no need for joins!)
redudancy:
update anomalies
examples?
deletion anomalies