cs610a - Arms-A

Download Report

Transcript cs610a - Arms-A

CS610 / CS710 Database Systems I
Daisy Wong
1
© D. Wong 2003
2002
Week 1
2
© D. Wong 2003
2002
Objectives

Modeling and design of databases.

Programming: queries and DB operations
3
© D. Wong 2003
2002
Database
A large
collection of data:
–stored in mass storage
–exists over a long period of time
–Can take on a variety of appearances depending
on the requirements at the time
–Can serve as the data source for a variety of
applications
4
© D. Wong 2003
2002
Database Examples

Examples:
– Wal-Mart : records every item purchased in
every store
– L. L. Bean: records detail information of each
customer and their purchases
– Hospitals: record patient demographics,
conditions and progress, test results, etc
–...

This course, refer to a collection of data managed
by a Database Management System (DBMS)
5
© D. Wong 2003
2002
DBMS

A combination of software, data, and structure of
the data that support:
– Users to create databases and specify their
schema
– Users to query and modify the data
– Storage of very large amount of data, secure the
data from accident or unauthorized use, and
allow efficient access
– Control concurrent access from many users,
presenting correct data to each user, and
prevent accidental corruption of the data from
simultaneous accesses
6
© D. Wong 2003
2002
DBMS lingo

Schema
– logical structure of the data. Use Data Definition
Language (DDL)

Query
– A question about the data. Use Data
Manipulation Language (DML)

Transaction (or atomic transaction)
– A logical unit of work that must be completed as
a whole or not at all.
7
© D. Wong 2003
2002
Three levels of database schemas
External View1
External View2
External View3
External level
Logical to external mappings
Logical Schema
Logical level
Internal to Logical mappings
Disk
Internal schema
8
Internal level
© D. Wong 2003
2002
DBMS User Roles




End users
– access the database for information to do their jobs.
Casual (use simple user interfaces) vs. Sophisticated (
use DML)
Database designers
– specify information content (use DDL) to create
database systems
Application developers
– design and develop applications that extend the
functionality of the dbms. E.g. user interface, data
analysis and data mining, various business services
Database administrators (DBAs)
– administer databases: control access, maintain data
accuracy and integrity, monitor and improve database
9
© D. Wong 2003
2002
performance
A little history

First attempt – file systems

Hierarchical model (tree based)

Network model (graph base)

Relational model
– Proposed by E. F. Codd (1970)
– Data should be presented to user as tables
(relations)
– Queries expressed in a very high-level language
– SQL (Structured Query Language) – most
important language based on relational model
10
© D. Wong 2003
2002
Relational Model


A conceptual model that represents data as relations.
Relations – tables of data
Accounts
accountNo
12345
34567
…
name
Sally
Sue
…
balance
1000.21
285.48
…

Query using SQL:
SELECT balance FROM Accounts
WHERE accountNo = 34567;

Relational DBMS finds an efficient way to answer the
query
11
© D. Wong 2003
2002
Major components of a DBMS (Simplified. Figure 1.1)
External
level
Logical
level
Users / applications
DBA
DDL commands
DDL
compiler
Queries/updates
Query
Manager
Transaction
Commands
Transaction
Manager
Buffer/File
Storage
Manager
Internal
level
Data
Metadata
12
© D. Wong 2003
2002
Figure 1.1: Database management system components
Ref. FCDS 2ed. by Ullman
13
© D. Wong 2003
2002
Storage Manager
• Obtains the requested information from data storage
• Modifies the information if requested and re-store
• Indexes are used (data structures that help us find data
items quickly given a part of their value). Advanced data
structures such as B-tree are used for efficient access
• Indexes are part of the data, their description is part of the
metadata
• Consists of 2 components: file manger and buffer manager
1. File manager keeps track of the location of files on disk
and obtains the blocks containing the requested data
2. Buffer manager handles main memory. It manages the
memory blocks, obtaining disk blocks from the file
manager, trying to optimize the access to data
14
© D. Wong 2003
2002
Query Manager

Parse and optimize the query using a query
compiler

Execute the resulting query plan (sequence of
actions for the DBMS to perform)
– Issues a sequence of requests to storage
manager for small pieces of data

Return the result to the requester
15
© D. Wong 2003
2002
Transaction Manager


Assure all transactions are executed properly
ACID properties of “proper” execution:
– Atomicity : All of the updates of a transaction are
successful, or no update take place
– Consistency: Each transaction should leave the
database in a consistent state
– Isolation: Each transaction, when executed
concurrently with other transactions, should have the
same effect as if it had been executed by itself
– Durability: Once a transaction has completed
successfully, its changes to the database should be
permanent. Even serious failures should not affect the
permanence of a transaction.
16
© D. Wong 2003
2002
Techniques to enforce ACID

Locking – granularity of locks is important.

Logging – write a log to nonvolatile storage.
Assure durability.

Transaction Commitment – for durability and
atomicity, transactions are computed
“tentatively”, recorded, but no changes are made
to the db until the transaction gets committed.
Changes copied to the log, then copied to db.
17
© D. Wong 2003
2002
Trends





Object Oriented DB
– Richer data types
– Classes and class hierarchy enable share or reuse of sw
and schemas
– Protect misuse through abstract data types
Object Relational DB
Constraints and triggers handling
Multimedia data
Data Integration to support advance data analysis such as
data mining. E.g. data warehouses, data marts

Multi-tier Client-Server architecture, move more
processing to the client
Parallel processing

Support Web sites

18
© D. Wong 2003
2002
Knowledge Discovery in Databases (KDD)
Data Mining
Interpretation
/ Evaluation
Transformation
Preprocessing
Knowledge
Selection
Patterns
Preprocessed
Data
Data
Transformed
Data
Target
Data
Reference: Fayyad; Smyth: "From Data Mining to Knowledge Discovery: An Overview" 1996
19
© D. Wong 2003
2002
Ch. 2 Entity-Relationship Data Model

Data models

Entity-Relationship diagrams

Design Principles

Modeling of constraints

Weak entity sets
20
© D. Wong 2003
2002
Data Modeling

Used for conceptual database design
Object-oriented
Classes /
Objects
DBMS
ODL
Ideas
Relational
Schema
Relational
DBMS
E/R
21
© D. Wong 2003
2002
Data Models
Value-Oriented
Examples
Object-Oriented
–Relational
–Object-oriented
–Logic
–Network
–Hierarchical
Distinct objects by
Data values
Redundancy
handling
Reduced by Design Use pointers to
capabilities
objects
Many-to-many
relationships
Handled the same
22
Object identity
Only binary
relationships
© D. Wong 2003
2002
Relational Model

Based on mathematically defined relations of
entities

Consists of:
– Attributes (fields)
– Domain legitimate values of attributes (data
range)
– Views of data presented in table format
– Create new views by projections of the database
– Records are n-tuples, where n = # of attributes
23
© D. Wong 2003
2002
Entity-Relationship (E/R) Diagrams

Represents the schematic of a database

Useful for designing the conceptual model

Entity set : classes of objects

Entity : member of an entity set (data)

Attributes : properties of the entities in an entity
set

Relationship – describe how entities relate to each
other
24
© D. Wong 2003
2002
Entity-Relationship Diagrams Symbols
attributes
relationships
entity sets
one - one
Cardinality:
m
m
many - one
n
25
many - many
© D. Wong 2003
2002
Entity-Relationship Diagrams (continue)

Entity
– Has a set of attributes
– The key is underlined

SS#
Key
– Attributes which uniquely identify an entity in
an entity set
– An inherent property of the data (e.g. movie
title and year)
– Serve as a constraint
26
© D. Wong 2003
2002
Week 2
27
© D. Wong 2003
2002
Ch. 2 Entity-Relationship Data Model (continue)

Data models

Entity-Relationship diagrams

Design Principles

Modeling of constraints

Weak entity sets
28
© D. Wong 2003
2002
Entity-Relationship Diagrams Symbols
attributes
relationships
entity sets
Cardinality (Multiplicity):
one - one
m
m
many - one
n
29
many - many
© D. Wong 2003
2002
Entity-Relationship Diagrams (continue 1)

Entity
– Has a set of attributes
– The key is underlined

SS#
Key
– Attributes which uniquely identify an entity in
an entity set
– An inherent property of the data (e.g. movie
title and year)
– Serve as a constraint
30
© D. Wong 2003
2002
Entity-Relationship Diagrams (continue 2)

Multiway relationships

Relationships with roles

Attributes on relationships
– E.g. salary of a star for a movie
– Not necessary by adding new entity set, whose
entities have the attributes
31
© D. Wong 2003
2002
E/R Diagrams Multiway relationships

Can be converted to a collection of binary, manyone relationships
– Introduce new entity set (connecting entity set)
Entities
are tuples of the relationship set for the
multiway relationship
– Introduce many-one relationships from the
connecting entity set to each of the entity sets
that provide components of tuples in the
original, multiway relationship
32
© D. Wong 2003
2002
ER Diagrams –Inheritance

Subclasses
– special-case entity sets with own special
attributes and/or relationships

isa relationship : One-One relationship to handle
inheritance, to connect entity set to its subclasses.
– E.g. every Cartoon is a Movie

Super class
ER symbol for isa :
(arrows not drawn)
isa
Subclass
33
© D. Wong 2003
2002
Design Principles

Faithfulness : Reflects the reality of the data and
the problem with the appropriate:
–
attributes
–
data ranges
–
relationships
–
cardinality
E.g.
Stars
Stars
-in
34
Movies
© D. Wong 2003
2002
Design Principles (2)

Avoid Redundancy
– Redundancy : duplicate representation of data,
lead to
Inconsistency,
Waste
violate data integrity
space
E.g.
Studios
Owns
Studio_name
Movies
Studio_name
35
© D. Wong 2003
2002
Design Principles (3)

Keep it simple : Avoid more elements than needed

Pick the right kind of element
– attributes or entity set
– what should the relationships be
36
© D. Wong 2003
2002
DB design


Centers on the entity sets and relationships making up the
database.
Users and designers must work together to:
– Specify the entity sets and relationships and their
meanings
– Specify the attributes and the meaning of it
– Specify the constraints


– Identify key attributes of entity sets
– Specify how the data would be viewed
– Specify who can view and modify what and when
Better to do a good job in the original design
– Need collaboration of users and designers
Reminder: Changing database design later usually very
37
© D. Wong 2003
2002
costly.
Constraints

Constraints are part of the schema, not of an
instance of the database.

Classification of Constraints
– Keys
– Single-value constraints
– Referential integrity constraints
– Domain constraints
– General constrains
38
© D. Wong 2003
2002
Keys in the E/R Model

Key for an entity set E is a set K of one or more attributes
such that, given any two distinct entities e1 and e2 in E, e1
and e2 cannot have identical values for each of the
attributes in the key K. (Ref. 2.3.2)
– E.g. Movie(title, year, length, filmtype), E1 = (titanic,
1960, 120, color), E2 = (titanic, 1997, 180, color)
– A key can consist of more than one attribute
– There can be more than one possible key (candidate
keys), but designate one as the “primary key”.
– Attributes that constitute the primary key cannot be
null
– If the entity set is in an isa-hierarchy, require the root
entity set have all the attributes needed for a key.
– In E/R diagram, attributes of a key are underlined
39
© D. Wong 2003
2002
Single-Value Constraints
 Assertion: at most one value exists in a given
context

Ways to express single-value constraints in E/R:
– Each attribute of an entity set has a single value
Some
attributes may allow null
Key attributes should not be null
No formal representation in E/R diagram, make
a notation beside the attribute
– Many-one relationship
E
R
40
F
© D. Wong 2003
2002
Referential Integrity

Assertion: exactly one value exists in a given role

Enforced at database implementation
– Forbid the deletion of a referenced entity
– Require that if referenced entity is deleted,
then all entities that reference it are deleted as
well.

Use a rounded arrow in the diagram
Movies
Owned
by
41

Studios
© D. Wong 2003
2002
Other kinds of constraints

Domain constraints : restrict the value of an
attribute to be in a limited set.
– No formal notation in E/R, place a notation
next to the attribute
– E.g. declaring the type of an attribute
– E.g specify range for the attribute value

General constraints
– E.g. degree of the relationships
Movies
Stars-in
42
<=10
Stars
© D. Wong 2003
2002
Weak Entity Sets
Definition: An entity set whose key is composed of attributes
some or all of which belong to another entity set.
Causes:
1. entity sets in a hierarchy based on classification
Crew
Unit-of
number
Studios
name
addr
2. Connecting entity sets to eliminate a muliway
relationship.
43
© D. Wong 2003
2002
Weak Entity Sets Requirements
1.
2.
3.
Key consists of zero or more of its own attributes, and
Key attributes from entity sets that are reached by
supporting relationships (many-one)
R is a supporting relationship for a weak entity set E to
some entity set F, R must:
– Be a binary, many-one relationship from E to F
– R must have referential integrity from E to F
– The attributes that F supplies for the key of E must be
key attributes of F
– If F is weak, then key attributes of F supplied to E may
be attributes of some entity set to which F is connected
by a supporting relationship (recursive)
–
For multiple supporting relationship from E to F, each
relationship is used to supply a copy of the key
attributes of F
44
© D. Wong 2003
2002
Week 3
45
© D. Wong 2003
2002
Ch. 3 (part 1)

Relational Model basics

From E/R diagram to Relations
46
© D. Wong 2003
2002
Relational Model

Based on mathematically defined relations of entities

Consists of:
– Attributes (fields)
– Domain legitimate values of attributes (data range)
– Views of data presented in table format
– Create new views by projections of the database
– Records are n-tuples, where n = # of attributes (arity)

Use relational algebra operations to compute data

Result of the operations are also relations
47
© D. Wong 2003
2002
Relations

Sets of tuples presented in a table (order of tuples
immaterial)

Attributes are the column header

Schema of a relation: name and the set of attributes. E.g.
Movies (title, year, length, filmType)

Database schema – the set of schemas for the relations in
a design

Domain – elementary data type of each attribute

Relation instances
Movies title
year
length
filmType
Star Wars
1977
124
color
Mighty Ducks
1991
104
color
Wayne’s World
1992
48
95
color
© D. Wong 2003
2002
E/R Diagram to Relational Designs
1.
Turn each entity set into a relation with the same
set of attributes. E.g. Star (name, address)
2.
Represent relationships by relations
3.
Combining relations when appropriate
4.
Handle weak entity sets
5.
Convert subclass structures to relations
49
© D. Wong 2003
2002
E/R Relationship to Relations
Replace a relationship by a relation (say R):
1. For each entity set involved in relationship R, take its
key attribute(s) as part of the schema of the relation
for R.
2. If the relationship has attributes, add to R’s attribute
set
3. If one entity set has multiple roles in a relationship,
then its key attributes appear as many times as there
are roles. Need to rename the attributes to avoid name
duplication.
e.g. Contracts (starName, title, year, studioOfStar,
producingStudio) Ref. Fig. 3.6
50
© D. Wong 2003
2002
Discovering Keys for Relations
If relation R is constructed from:

Entity set – key attributes of the entity set

Relationship R, key of R is :
–
many-many : composed of keys of both
connected entity sets
–
many-one from E1 to E2 : consisted of key of
E1
–
one-one between E1 and E2: either the key of
E1 or E2
51
© D. Wong 2003
2002
Combining Relations

In general, if 2 or more relations has exactly the
same keys, combine them.

For many-one relationship:
– E (ea1, ea2)
– F (fa1, fa2, fa3)
– R (ea1, fa2, ra1) -- key is from the many relation
– Result: E(ea1, ea2, fa2, ra1)
E
F
R
52
© D. Wong 2003
2002
Handling weak entity set
1.
The relation for the weak entity set includes it’s
own attributes and the attritubes of the other
entity sets that help form it’s key
 e.g. 1: Contracts (starName, studioName, title,
year, salary)
 e.g. 2: Crew (number, studioName)
2.
3.
The relation for any relationship create as usual
(rename attributes to avoid confusion)
Don’t need to create a relation for supporting
relationships
53
© D. Wong 2003
2002
Weak entity set handling example

Studios(name, addr)

Crews(number, studioName)

Unit-of(number, studioName, name) – but not
needed because it’s a supporting relationship
Crew
Uint-of
number
Studios
name
54
addr
© D. Wong 2003
2002
Converting Subclass Structures to Relations
Strategies:
1.
E/R approach
2.
OO approach
3.
Use null values
55
© D. Wong 2003
2002
E/R approach

For each entity set E in the hierarchy, create a
relation that includes the key attributes from the
root and any attributes belonging to E

Do not create a relation for isa relationship

Example:
– Movies(title, year, length, filmType)
– MurderMysteris(title, year, weapon)
– Cartoons(title, year)
56
© D. Wong 2003
2002
OO approach

Treat entities as objects belonging to a single class
– for each possible subtree including the root,
create one relation, whose schema includes all the
attributes of all the entity sets in the subtree

Example:

– Movies(title, year, length, filmType)
– MoviesC(title, year, length, filmType)
– MoviesMM( title, year, length, filmType, weapon)
– MoviesCMM(title, year, length, filmType, weapon)
– Voices(title, year, starName)
Then combine relations that have the same set of
attributes.
57
© D. Wong 2003
2002
Use null values

Create one relation with all the attributes of all the
entity sets in the hierarchy. Each entity is
represented by one tuple, and that tuple has a null
value for whatever attributes the entity does not
have.

Example:
– Movie(title, year, length, filmType, weapon)
58
© D. Wong 2003
2002
Comparison of the 3 approaches

Pros and Cons in each approach. Factors to
consider:
– Number of relations involved in a query
(depends on the query of interest)
– Total number of relations in the schema
– Space usage
59
© D. Wong 2003
2002
Operations in the Relational Data Model

Algebraic notation (Relational Algebra)
sets
bags

Constraints on relations

Extensions to the relational model
60
© D. Wong 2003
2002
Ch. 5 Relational Algebra

Relational Algebra expressions are composed of:
– Operands - relations represented by:
relation’s
name or
the list of tuples in a relation
– Operators

Results of the operations are also relations

Build complex expressions from simple ones

Queries – relational algebra expressions
61
© D. Wong 2003
2002
Classes of Relational Algebra Operations
Classes of relational algebra operations:
1.
Set operations: union, difference, intersection
2.
Remove parts of a relation:
–
Selection – eliminates rows (tuples)
–
Projection – eliminates columns
3.
Combine tuples of two relations : cartesian
product and joins
4.
Renaming: changes relation schema (i.e. relation
name, and/or attribute names)
62
© D. Wong 2003
2002
Example relations 1
R
S
A
B
C
a
b
c
d
a
f
c
b
d
A
B
C
b
g
a
d
a
f
63
© D. Wong 2003
2002
Set Operations

Union, Difference, Intersection

Given relations R and S:
– R and S have schemas with identical set of
attributes, and domain types for each attributes
(=> same arity)
– Columns of R and S must be ordered so that the
order of attributes is the same for both relations
64
© D. Wong 2003
2002
Union

RS
– Result: a set of tuples that are in R or S or both.
– A tuple appears only once in the result even if it is
present in both R and S

– Reminder: R and S has same arity
Example:
A
a
d
c
b
B
b
a
b
g
65
C
c
f
d
a
© D. Wong 2003
2002
Difference

R–S
– Result: set of tuples in R but not in S
– Reminder: Not commutative: R – S ≠ S – R
– Reminder: R and S has same arity

Example:
A
B
C
a
b
c
c
b
d
66
© D. Wong 2003
2002
Intersection

RS
– Result: set of tuples in both R and S, i.e. R-(R-S)
– Reminder: R and S has same arity

Example:
A
B
C
d
a
f
67
© D. Wong 2003
2002
Projection




A1, A2, …, An( R )
Result:
– a new relation that has only the columns for
attributes A1, A2, … An of R
– Schema of the new relation is {A1, A2, … An}
Example 1: A, C( R )
Example 2: B( R )
A
C
a
c
d
f
c
d
B
b
a
68
© D. Wong 2003
2002
Selection
 sc(
R ) , where C is a conditional expression:
– operands are constants or attributes of R
– Operators are : <, =, >, , , , and, or, not

Result:
– a new relation with a subset of R’s tuples that
satisfy condition C
– Schema is the same as R’s schema

Example: sB=b(R)
A
B
C
a
b
c
c
b
d
69
© D. Wong 2003
2002
Cartesian Product

RxS

Let arity of R = k1 and arity of S = k2

Result: the set of all possible (k1 + k2)-tuples whose first k1
components form a tuple in R, and whose last k2
components form a tuple in S

Example:
R.A
R.B
R.C
S.A
S.B
S.C
a
b
c
b
g
a
a
b
c
d
a
f
d
a
f
b
g
a
d
a
f
d
a
f
c
b
d
b
g
a
c
b
d
d
a
f
70
© D. Wong 2003
2002
Example Relation 2
R
A
B
C
1
2
4
7
D
E
3
3
1
5
6
6
2
8
9
71
S
© D. Wong 2003
2002
Theta-Joins

R

Result: tuples in R x S such that C is satisfied, i.e.
C
S
where C is an arbitrary condition
sC(R x S)



To construct the result:
1. Take the product of R and S
2. Select from the product only those tuples that
satisfy the condition C
Arity of result = arity of R + arity of S
Example: R
A
1
1
4
B
2
2
5
C
S where C is B < D
C
3
3
6
72
D
3
6
6
E
1
2
2
© D. Wong 2003
2002
Example Relation 3
R
A
B
C
a
b
d
S
B
C
D
c
b
c
d
b
c
b
c
e
b
b
f
a
d
b
c
a
d
73
© D. Wong 2003
2002
Natural Joins

R
S

Result:
i1, i2, …, im sR.A1=S.A1 …  R.Ak=S.Ak (R x S) where i1, i2, …,
im is the list of all components of R x S in order, except
the components S.A1, … S.Ak

Example:
A
a
B
b
C
c
D
d
a
d
b
b
c
c
e
d
d
c
b
a
c
d
e
b
74
© D. Wong 2003
2002
Renaming

S(A1, A2, …, An)(T) or S(T)

Result: renames relation T to S

Example: R x S(X, Y, Z)(S)
A
a
a
B
b
b
C
c
c
X
b
d
Y
g
a
Z
a
f
d
d
c
a
a
b
f
f
d
b
d
b
g
a
g
a
f
a
c
b
d
d
a
f
75
© D. Wong 2003
2002
Week 4
76
© D. Wong 2003
2002

The rest of Ch. 5 – Relational Algebra

Ch. 10 – Logical query languages
77
© D. Wong 2003
2002
Combining Operations to Form Queries






Construct more complex expressions by applying
operators to sub-expressions
Use parentheses to indicate operands grouping
Multiple ways to write equivalent queries
Expression tree for visualizing complex expression
Query optimizer
Example (ref. Fig. 5.8):
title,year(slength≥100(Movie) sstudioName=‘Fox’(Movie))
or
title,year(slength≥100 AND studioName=‘Fox’(Movie))
78
© D. Wong 2003
2002
Algebraic Laws
R  (S  T) = (R  S)  T

Associative: e.g.

Commutative: e.g. R  S = S  R
79
© D. Wong 2003
2002
Dependent and Independent Operations
1.
Set operations: union, difference, intersection
2.
Remove parts of a relation:
–
Selection – eliminates rows (tuples)
–
Projection – eliminates columns
3.
Combine tuples of two relations : cartesian
product and joins
4.
Renaming: changes relation schema (i.e. relation
name, and/or attribute names)
80
© D. Wong 2003
2002
Constraints on Relations
Two ways to express constraints in relational algebra:
1.
R=Ø
// R  Ø
2.
RS
// R - S = Ø
where R and S are relational algebra expressions
81
© D. Wong 2003
2002
Referential Integrity Constraints

Assertion: a value appearing in one context also
appears in another, related context.

Example:
Movie(title, year, length, inColor, studioName, producerC#)
MovieExec(name, address, cert#, networth)
Constraint: the producer of every movie is a certified movie
executive, i.e. appear in the MovieExec relation
producerC#(Movie)  cert#(MovieExec)
or
producerC#(Movie) - cert#(MovieExec) = Ø
82
© D. Wong 2003
2002
Other Constraints

Domain constraints example:
MovieStars(name, address, gender, birthdate)
Constraint: acceptable values for the “gender” attribute are ‘F’ or
‘M’
sgender’F’ AND gender‘M’(MovieStar) = Ø

Other constraints example:
MovieExec(name, address, cert#, networth)
Studio(name, address, presC#)
Constraint: president of a movie studio must have a net worth of at
least $10,000,000
snetworth<10000000(Studio

presC#=cert#
MovieExec) = Ø
Functional dependency constraints
83
© D. Wong 2003
2002
Relational Operations on Bags



Bag
– a “set” that is allowed to have more than one
occurrence of an element
– => duplicate tuples in a relation
– Constraint representations work with bags
Reason:
– For implementation efficiency when duplication is
acceptable
– When actual no. of tuples is needed for aggregate
Example:
A
B
1
2
3
4
1
2
1
2
84
© D. Wong 2003
2002
Relational Operations of Bags (continue)
Given: R and S are bags, and tuple t appears in R n
times, and in S m times

R  S : contains n + m tuple t

R – S : contains max(0, n-m) tuple t

R  S : contains min(n, m) tuple t

A,B(R) : each tuple is processed independently,
resulting duplicate tuples are not eliminated

sc(R) : apply the selection condition to each tuple
independently, resulting duplicate tuples are not
eliminated
85
© D. Wong 2003
2002
Product and Joins of Bags
Given: R and S are bags, and tuple r appears in R m
times, and tuple s appears in S n times

R x S : the resulting tuple rs will appear mn times.

R S : each tuple of R is compared to each tuple
of S to decide if the pair tuples joins successfully,
do not eliminate duplicates

R CS : each tuple of R is compared to each tuple
of S to decide if the condition C is met, do not
eliminate duplicates
86
© D. Wong 2003
2002
Extended Operation to Relational Algebra

Duplicate elimination : to convert a bag to a set

Aggregation: count, sum, max, min, average

Grouping

Extended Projection

Sorting

Outerjoins
87
© D. Wong 2003
2002
OUTERJOINS






Dangling tuples: tuples that failed to match any tuple of
the other relation in the common attributes.
An operator to augment the result of a join by the
dangling tuples, padded with null values.
R
S : Full outerjoin of R1 and R2 is a join that
includes all rows from R1 and R2 matched or not.
Unmatched rows are padded with special null symbols .
LEFT outerjoin of R1 and R2 is a join that includes all
rows from R1, matched or not, plus the matching values
from R2. Unmatched rows are padded with
.
RIGHT outerjoin of R1 and R2 is a join that includes all
rows from R2, matched or not, plus the matching values
from R1. Unmatched rows are padded with
.
The joining may be NATURAL or theta join
88
© D. Wong 2003
2002
Outer Join Example
R
A
B
C
a
b
d
S
B
C
D
c
b
c
d
b
c
b
c
e
b
b
f
a
d
b
c
a
d
c
d
b
Full Outer Join
Natural join
A
B
C
D
A
B
C
D
a
b
c
d
a
b
c
d
a
b
c
e
a
b
c
e
d
b
c
d
d
b
c
d
d
b
c
e
d
b
c
e
b
b
f
c
a
d
b
c
a
d
b
c
d
b
89
© D. Wong 2003
2002
Extensions to the Relational Model

Modifications : insert, delete, update

Views : relational expression with a name to be
applied real relations to produce the relation
defined by the expression. Views can used as
arguments to other expressions.

Null values : common interpretations:
– Value unknown
– Value inapplicable
– Value withheld
90
© D. Wong 2003
2002
Ch. 10 Logical Query Languages

Motivation

Datalog

Relational Algebra to Datalog

Recursion in Datalog

Negation in Recursive Rules
91
© D. Wong 2003
2002
Motivation

Logical rules is more natural in representing
recursive queries

Logical rules form the basis of many informationintegration applications
92
© D. Wong 2003
2002
A Datalog rule example

Relation:
Movie(title, year, length, inColor, studioName,
producerC#)
LongMovie(t, y)  Movie(t, y, l, c, s, p) AND l  100
head
subgoals
body
LongMovie = title, year(slength≥100(Movie))
93
© D. Wong 2003
2002
Datalog rule







Relational Atoms : predicate followed by arguments
Arithmetic Atoms : comparison between two
arithmetic expressions (e.g. x ≠ Y)
Predicate = relation name or arithmetic comparison
predicates (e.g. =, <, ≠, etc)
Head – a relational atom
Body – one or more atoms (subgoals) connected by
AND
Subgoals (not head) may be optionally negated by
NOT
Local variables – variables in body, not in head
94
© D. Wong 2003
2002
Datalog

A logic based data model

The underlying mathematical model of data is
essentially that of the relational model

Predicate symbols denote relations

Relational algebra operations are described by
rules
 Query

: a collection of one or more rules.
The relation in the rule head is the answer to the
query
95
© D. Wong 2003
2002
Extensional and Intensional Predicates



Extensional Predicates (EDB)
The set of relations which ARE defined as part of the
actual database (i.e. physically stored).
e.g. R = {1}
Intensional Predicates (IDB)
The set of relations which are NOT defined as part of the
actual database but are instead abstracted from logical
rules.
e.g. P (x)  Q (x)
Q (x)  R (x)
A predicate must be IDB or EDB but not both.
–IDB predicate can appear in the body or head of a rule
–EDB predicate can appear in the rule body only
96
© D. Wong 2003
2002
3 Different Interpretations of Logical Rules

Proof-Theoretic Interpretaton

Model-Theoretic Interpretation

Computational Interpretation
The following discussions will use the EDB:
R = {1}
97
© D. Wong 2003
2002
Proof-Theoretic Interpretation
As axioms to be used in a proof.
–From the facts in the database, see what other
facts can be proved using the rules in all possible
ways.
–All facts derivable using the rules are derivable
by applying the rules in the forward direction
only
–Example:
P = {1}, Q = {1}
98
© D. Wong 2003
2002
Model-Theoretic Interpretation
As definition of possible worlds or models.
– To be a model, an interpretation must make the rules
true, no matter what assignment of values is made for
the variables in each rule.
– Multiple models are possible
– With no negations, a unique minimal model exists
that gives the same result as the proof-theoretic
interpretation.
– Minimal model : cannot make any true fact false and
still have a model consistent with the EDB
– Example:
1.
P = {1, 2, 3}, Q = {1, 2}
2.
P = {1}, Q={1}
99
© D. Wong 2003
2002
Computational Interpretation
By providing an algorithm for “executing” the rules
to determine whether a potential fact is true or false.
– E.g. Prolog – uses a particular algorithm that
involves searching for proofs of the potential
fact.
– Drawback:
the set of facts Prolog finds a proof is not
necessarily the same as the set of all facts for
which a proof exists
The set of facts Prolog finds true is not
necessarily a model.
100
© D. Wong 2003
2002
Week 5
101
© D. Wong 2003
2002
Example
1.
2.
3.
sibling(X, Y)  parent(X, Z) AND parent(Y, Z) AND X ≠ Y.
cousin(X, Y)  parent (X, Xp) AND parent (Y, Yp) AND
sibling(Xp, Yp).
cousin(X, Y)  parent (X, Xp) AND parent (Y, Yp) AND
cousin((Xp, Yp).
4.
related(X, Y)  sibling(X, Y).
5.
related(X, Y)  related(X, Z) AND parent(Y, Z).
6.
related(X, Y)  related(Z, Y) AND parent(X, Z).
102
© D. Wong 2003
2002
Meaning of Logical Rules

Head is true of its arguments if there exist values
for local variables that make all the subgoals true

If rule contains only non-negated relational
predicates, then value of head variables is the
result of natural join the subgoals and project onto
the head variables

Example:
Cousin(X, Y) =  X,Y(P(X, Xp)
Yp))
103
P(Y, Yp)
S(Xp,
© D. Wong 2003
2002
Evaluation of Rules – Variable based

Consider all possible assignments of values to the
variables. If all subgoals are true, add the head to the
result relation.

Example:
–
Only assignments that make the first subgoal true:
Case 1: x = 1, z = 2
Case 2: x = 2, z = 3
–
In case 1, y = 3 makes second subgoal true. Since (1, 3) is
not in R, so the third subgoal is also true.

So, add (x, y) = (1, 3) to S
–
In case 2, no value of y makes the second subgoal true
–
So, S = {(1, 3)}
104
© D. Wong 2003
2002
Evaluation of Rules – Tuple based

Consider all assignments of tuples to subgoals that make
each subgoal true. If the variables are assigned consistent
values, add the head to the result.

Start with non-negated relational subgoals only
Example:
– Start with R(x,z) and R(z, y). Possible tuple assignments
to subgoals:
R(x, z)
R(z, y)

– Only 1 assignment gives
consistent value to z
(1, 2)
(1, 2)
(1, 2)
(2, 3)
– It also makes the third
subgoal true
– So, S = {(1, 3)}
(2, 3)
(1, 2)
(2, 3)
(2, 3)
105
© D. Wong 2003
2002
Safe rules

A rule is safe if all of its variables are limited.

Variable, X, in a rule is limited if:
1. Any variable that appears as an argument in a
relational predicate of the body is limited
2. Any variable that appears in a subgoal X=a,
or a=X, where a is a constant, is limited
3.Variable x is limited if it appears in a subgoal
X=Y or Y=X, when Y is a variable already
known to be limited

Rules must be safe to make sense (i.e. to produce
finite relation as the result)
106
© D. Wong 2003
2002
Safe and unsafe rule Examples
1.
biggerThan(X, Y)  X > Y.
unsafe
2.
S(X)  R(Y).
unsafe
3.
S(X)  R(Y) AND X < Y
unsafe
4.
P(X, Y)  Q(X, Z) & W=a & Y=W.
safe
107
© D. Wong 2003
2002
Datalog Rules and Relational Algebra summary

Intersection : AND

Union: multiple rules with same IDB predicate as
rule head

Difference: NOT

Projection: rule head atom has arguments of the
projected attributes

Selection: condition represented by arithmetic
subgoals

Product: AND two subgoals, with head atom
contain all attributes of the two subgoals

Joins
108
© D. Wong 2003
2002
A Datalog rule example

Relation: parent(C,P), where P is a parent of C
sibling(X, Y)  parent(X, Z) AND parent(Y, Z) AND X ≠ Y.
head
subgoals
body
S(X,Y) = X,Y(sX≠Y(P(X, Z)
109
P(Y, Z)))
© D. Wong 2003
2002
Recursion

P Q
P depends on Q

Draw a graph: node = IDB predicates, arc P -> Q
means P depends on Q

Cycles means recursion

Example:
cousin(X, Y)  parent (X, Xp) AND parent (Y, Yp) AND
cousin((Xp, Yp).
110
© D. Wong 2003
2002
Dependency Graph
related
cousin
sibling
parent
111
© D. Wong 2003
2002
Least Fixedpoint

The smallest IDB relations that contain all tuples
that the rules require us to follow

Iterative Fixed-point Evaluation of recursive
rules:
Start
IDB = 
yes
Apply rules
to IDB,
EDB
Changes
to IDB?
112
No
Done
© D. Wong 2003
2002
Fixedpoint Computation Example (Ref. Text)
1. FollowOn(x, y)  SequelOf(x, y)
2. FollowOn(x, y)  SequelOf(x, z) AND FollowOn(z, y)
Evaluation note:
At round i, the only new tuples added to any IDB
relation comes from match to tuples added in
the round i-1
113
© D. Wong 2003
2002
Negation in Recursive Rules

We can compute a minimal fixedpoint only for
stratifiable recursion with negation.

To determine stratifiability:
1. Draw a graph whose nodes are the IDB
predicates
2. Draw an arc (labeled -) from node A to node B
for rule A  not B
3. Draw an arc (no label) from node A to node B
for rule A  B
114
© D. Wong 2003
2002
Negation in Recursive Rules (continued)

If graph has a cycle containing one or more negative
arcs, the the recursion is NOT stratified. (When
negation appears in the least fixedpoint operation.)

Stratum of IDB predicate A = the largest no. of
negative arcs on a path beginning from A.

If no negative arcs on any paths from predicate A,
stratum number of A = 0

Evaluate IDB predicates in the order of their strata,
starting with the lowest.
115
© D. Wong 2003
2002
Week 6
116
© D. Wong 2003
2002
Ch. 3 (continued)






Database design problems
Functional Dependency
Keys of relations
Decompositions based on Functional Dependency
(normalization)
– Boyce-Codd Normal Form (BCNF)
– Third Normal Form
– Recovering information from a decomposition
Multivalued Dependencies and decomposition
– Fourth Normal Form
Relationships Among Normal Forms
117
© D. Wong 2003
2002
Database Design Problems

Principal kinds of anomalies that constitute bad
db design:
– Redundancy
– Update Anomalies
– Deletion Anomalies
– Insertion Anomalies

Want to find a good relation schema design for the
relational model
118
© D. Wong 2003
2002
Redundancy

Information repeated unnecessarily in several
tuples.

Example:
– length and filmType for Movie
119
© D. Wong 2003
2002
Update Anomalies

Change information in one tuple but leave the
same information unchanged in another

A consequence of redundancy

Cause potential inconsistency

Example:
– Change length of Star Wars in one tuple but not
the others
120
© D. Wong 2003
2002
Deletion Anomalies

A set of values becomes empty (deleted), may lose
other information as side effect

Example:
– Deleting the only star listed for the movie
Mighty Ducks
121
© D. Wong 2003
2002
Insertion Anomalies

Cannot insert a tuple because some of the data not yet
available

Inverse to deletion anomalies

Problem of using null value to fill the missing / unavailable
data:
– When the data becomes available, will we remember to
delete the one with nulls
– If the missing data is part of a key, then can’t use null

Example:
– Cannot start to keep track of the information of a new
movie when the cast is not yet determined
122
© D. Wong 2003
2002
Dependencies

Dependency is an assertion that only a subset of all
possible relations are ‘legal’. An assertion about the real
world, cannot be proved.


It’s a form of constraints.
Dependencies in a relation means some sort of redundancy
in the legal relations.

Example: title, year  length
title


year
length
filmType
studioName
starName
Star wars 1977
124
Color
Fox
CarrieFisher
Star wars 1977
???
Color
Fox
Mark Hamil
Functional dependencies (FD)
Multivalued dependencies (MVD)
123
© D. Wong 2003
2002
Functional Dependencies (FD)






Given: relation schema R(A1, …, An), and X and Y be
subsets of (A1, … An).
FD : X  Y means X functionally determines Y
e.g. A1A2…An  B1B2…Bm
X  Y is an assertion about R that whenever two tuples
agree on all the attributes of X, then they must also agree
on attributes of Y.
It’s a constraint on the data that may appear within a
relation. (i.e. schema level control of data)
It’s a restriction on relations that depend only on the
equality or inequality of values, i.e. value-oblivious
It’s the most important type of value-oblivious constraints.
Value-oblivious constraints have the greatest impact for
designing database schemas
124
© D. Wong 2003
2002
Significance of Functional Dependencies

Key:
1. Entity set R (A1, …, An), and X is a subset of
A1, …, An that form a key for R, then may
assert:
X  Y where Y is any subset of {A1, …, An}
2. R(A1, …, An) is a many-one relationship from
entity set E1 to entity set E2, and among the
Ai’s are attributes that form a key X for E1
and a key Y for E2, then may assert: X  Y

Tool for explaining the process of normalization
125
© D. Wong 2003
2002
Representing FD using Relational Algebra

Example 5.31
MovieStar (name, address, gender, birthdate)
FD: name  address
sMS1.name=MS2.name AND MS1.address MS2.address
(MS1(MovieStar) x MS2(MovieStar)) = 
126
© D. Wong 2003
2002
Keys in the E/R Model

key for an entity set E is a set K of one or more attributes
such that, given any two distinct entities e1 and e2 in E, e1
and e2 cannot have identical values for each of the
attributes in the key K.
– A key can consist of more than one attribute
– There can be more than one possible key (candidate
keys), but designate one as the “primary key”.
– Attributes that constitute the primary key cannot be
null
– If the entity set is in an isa-hierarchy, require the root
entity set have all the attributes needed for a key.
– In E/R diagram, attributes of a key are underlined
127
© D. Wong 2003
2002
Keys of Relations
K is a key for relation R if:
1. K  all attributes of R
2. For no proper subset of K is (1) true (i.e. a key
must be minimal)
3. If K at least satisfies (1), then K is a superkey
Example: {title, year, starName) forms a key for the
Movie relation
Superkeys:
A set of attributes that contains a key is called a
superkey
128
© D. Wong 2003
2002
Discovering Keys for relations
Consider relation R:
1.
If R comes from an entity set, then the key for the
relation is the key attributes of this entity set
2.
If R is from a many-many relationship, then the keys of
both connected entity sets are the key attributes for R
3.
If R is from a many-one relationship from entity set E1 to
entity set E2, then the keys attributes of E1 are the key
attributes for R
4.
If R is from a one-one relationship, then the key
attributes for either of the connected entity sets are key
attributes of R. More than 1 candidate key in this case.
129
© D. Wong 2003
2002
FD Rules

Splitting / Combining rule

Trivial Dependencies

Armstrong’s Axioms:
1. Reflexivity
2. Augmentatioin
3. Transitivity

Computing closure of attributes

Finding all implied FD’s ( Ref. section: 3.5.7)
130
© D. Wong 2003
2002
Splitting / Combining rule

FD:
A1A2…An  B1B2…Bm
I
vs.
A1A2…An  B1
II
...
A1A2…An  Bm

Splitting rule: replace FD I by the set of FDs II

Combining rule: replace the set of FDs II by FD I
131
© D. Wong 2003
2002
Trivial Dependencies

FD A1A2…An  B is trivial if B is one of the A’s

Every trivial dependency holds in every relation

For FD A1A2…An  B1B2…Bm
– Trivial if the B’s are a subset of the A’s
e.g. title year  title
– Nontrivial if at least one of the B’s is not among the A’s
– Completely nontrivial if none of the B’s is also one of
the A’s

Trivial-dependency rule:
– A1A2…An  B1B2…Bm  A1A2…An  C1C2…Ck where
the C’s are all those B’s that are not also A’s
132
© D. Wong 2003
2002
Armstrong’s Axioms:
A set of inference rules: (Pg. 135, 2nd ed. Pg.99)
1.
Reflexivity
If {B1, B2, …, Bm}  {A1, A2, …, An}, then
A1A2…An  B1B2…Bm. // trivial dependencies
2.
Augmentatioin
If A1A2…An  B1B2…Bm, then
A1A2…AnC1…Ck  B1B2…BmC1…Ck for any set of
attributes C1…Ck
3.
Transitivity
If A1A2…An  B1B2…Bm and B1B2…Bm  C1…Ck then
A1A2…An  C1…Ck
133
© D. Wong 2003
2002
Week 7
134
© D. Wong 2003
2002
Functional Dependencies (FD)


Given: relation schema R(A1, …, An), and X and Y be
subsets of (A1, … An).
FD : X  Y means X functionally determines Y
e.g. A1A2…An  B1B2…Bm
Significance:
– Key:
1. Entity set R (A1, …, An), and X is a subset of A1, …, An
that form a key for R, then may assert:
X  Y where Y is any subset of {A1, …, An}
2. R(A1, …, An) is a many-one relationship from entity set
E1 to entity set E2, and among the Ai’s are attributes
that form a key X for E1 and a key Y for E2, then may
assert: X  Y
135
© D. Wong 2003
2002
Keys of Relations
K is a key for relation R if:
1. K  all attributes of R
2. For no proper subset of K is (1) true (i.e. a key
must be minimal)
3. If K at least satisfies (1), then K is a superkey
Example: {title, year, starName) forms a key for the
Movie relation
Superkeys:
A set of attributes that contains a key is called a
superkey
136
© D. Wong 2003
2002
FD Rules

Splitting / Combining rule

Trivial Dependencies

Armstrong’s Axioms:
1. Reflexivity
2. Augmentatioin
3. Transitivity

Computing closure of attributes

Finding all implied FD’s ( Ref. section: 3.5.7)
137
© D. Wong 2003
2002
Given vs. Derived FD’s

Given FD’s: stated initially for a relation. FD’s
These are known to hold for a relation R.

Derived FD’s are FD’s that follow logically from
the given FD’s (using the inference rules) or by the
closure algorithm of attributes.

Basis : any set of given FD’s from which all FD’s
for a relation can be inferred.

Minimal basis: if no proper subset of the FD’s in a
basis can also derive the complete set of FD’s.

Example: 3.22 – pp. 98
138
© D. Wong 2003
2002
Closure of attributes

Let {A1,A2,…,An} is a set of attributes, S is a set of
FD’s

Closure of {A1,A2,…,An} under the dependencies in
S = set of attributes B such that every relation that
satisfies all the dependencies in set S also satisfies
A1A2…An  B. I.e. B is the set of attributes
functionally determined by {A1,A2,…,An}

Notation: {A1,A2,…,An}+

Allow trivial dependencies, so A1A2…An are
always in {A1,A2,…,An}+
139
© D. Wong 2003
2002
Closure Algorithm
Let X be {A1,A2,…,An}+
1. X = {A1,A2,…,An}
// initialize X
2. Search for some FD B1B2…Bm  C where
B1B2…Bm  X, but C is not in X. If found, add
C to X
3. Repeat step 2 until no more attributes can be
added to X.
4. The resulting X is the correct value of
{A1,A2,…,An}+
140
© D. Wong 2003
2002
Attributes Closure Example. 3.19 pp. 93
Given: R(A, B, C, D, E, F) and FD’s
AB  C
BC  AD
DE
CF  B
To find: {A, B}+
Procedure:
1. Let X = {A, B}
2. Add C to X
3. Add D to X
// AB  C,  X = {A, B, C}
// BC  AD,  X = {A, B, C, D}
4. Add E to X
// D  E,  X = {A, B, C, D, E}
5. End, {A, B}+ = {A, B, C, D, E}
141
© D. Wong 2003
2002
To infer if a FD follows a set of FD in a relation

AB  D follows from dependencies in example 3.19?
– Compute {A, B}+ , if D ends up in the closure,
then AB  D follows from the dependencies
– Since {A, B}+ = {A, B, C, D, E},  AB  D follows.

D  A follows from dependencies in example 3.19?
– Compute {D}+ , if A ends up in the closure, then
D  A follows from the dependencies
– But {D}+ = {D, E},  D  A does NOT follows.
142
© D. Wong 2003
2002
Attributes Closures and Keys
To test if A1,A2,…,An is a key for a relation R:

Compute {A1,A2,…,An}+
1. Check if {A1,A2,…,An}+ is the set of all
attributes of R
2. If yes, then check no subset of A1,A2,…,An, say
S, such that S+ is the set of all attributes of R

If A1,A2,…,An only satisfies 1 but not 2, then
A1,A2,…,An is a superkey
143
© D. Wong 2003
2002
Finding all implied FD’s




Motivation: Given relation R with FDs F. When
projecting R to form new relation S, want to know what
FD’s hold in S
Method: compute all FD’s that
1. Follow from F
2. Involve only attributes of S
Problem: no. of FDs may be large (could be exponential
in the number of attributes of S)
So, make simplifications in computing attribute closure:
– Ignore empty set and set of all attributes (trivial FD’s)
–
–
Drop XY  A if X  A holds.
If closure of some set X is all attributes, then ignore
supersets of X.
144
© D. Wong 2003
2002
Algorithm to find all implied FD’s
1.
2.
3.
For each set of attributes X compute X+. Start
with closures of singleton set, and then move onto
doubleton if necessary.
Add X  A for each A in X+ - X
Ignore “obvious” dependencies that follow from
others as described in the simplification
guidelines.
145
© D. Wong 2003
2002
Finding all implied FD’s Example

Given: R(ABCD), and FD’s F: AB  C, C  D, D  A

What FD’s follow from F?
•
A+ = A; B+ = B
// nothing
•
C+ = ACD
// add C  A
•
D+ = DA
// nothing new
•
(AB)+ = ABCD
// add AB  D; skip all supersets of AB
•
(BC)+ = ABCD
// nothing new; skip all supersets of BC
•
(BD)+ = ABCD
// add BD  C; skip all supersets of BD
•
(AC)+ = ACD ; (AD)+ = (AD); (CD)+ = ACD // nothing new
•
(ACD)+ = ACD
•
All other sets contain AB, BC, or BD, so skip
•
 the only interesting FD’s that follow from F are :
// nothing new;
C  A, AB  D, BD  C
146
© D. Wong 2003
2002
Normalization

Purpose: process to eliminate redundancy in
relations due to functional or multi-valued
dependencies.

Normal forms:
– Boyce-Codd Normal Form (BCNF)
– Third Normal Form (3NF)
– Fourth Normal Form (4NF)
147
© D. Wong 2003
2002
Boyce-Codd Normal Form (BCNF)




Definition:
A relation R is in BCNF iff whenever there is a nontrivial
dependency A1A2…An  B holds for R, {A1,A2,…,An}
must be a superkey for R
Alternative definition after applying the combining rule to
all FD’s with a common L.S.:
A relation R is in BCNF iff whenever nontrivial
dependency A1A2…An  B1B2…Bm holds for R,
{A1,A2,…,An} must be a superkey for R
Guarantees no redundancy, prevents update, deletion, and
insertion anomalies
E.g. title year  length filmType studioName // violates
BCNF because key of Movie is {title, year, starName}, so
Movie is not in BCNF
148
© D. Wong 2003
2002
Decomposition to BCNF
Given: relation R, FD’s F.
Decomposition strategy:
1. look for a non-trivial FD A1A2…An  B1B2…Bm that
violates BCNF. (Heuristic: add to the RS as many
attributes as are functionally determined by
A1A2…An)
2. Compute {A1,A2,…,An}+ // all the attributes involved
in the violating dependency
Others
3. Decompose R into
1.
2.
A’s
{A1,A2,…,An}+ , and
(R - {A1,A2,…,An}+)  {A1,A2,…,An}
B’s
// LS + all
attributes not involved in the violating dependency
149
© D. Wong 2003
2002
BCNF Decomposition Example 3.24 pp 104

Relation: Movie(title, year, length, filmType, studioName,
starName)

Key: {title, year, starName}


FD’s: title year  length filmType studioName is a BCNF
violation, so Movie not in BCNF
Decomposition:
Schema 1: {title, year, length, filmType, studioName}
Schema 2: {title, year, starName}

To obtain the new relations, project the schemas onto Movie

To recover information (I.e. Movie) from the new relations:
natural join the new relations. Does not lose information.
150
© D. Wong 2003
2002
Additional points about BCNF

A relation may have more than 1 keys

Need some key in the LHS of any nontrivial FD

Only nontrivial FD’s are potential BCNF violation
candidates

Any 2-attributes relations is in BCNF

Decomposition must be based on FD that holds in
the relation, otherwise can’t recover the original
relation from the new relations.

Eliminates all redundancies

Some decomposition may not preserve the FD’s
(e.g. 3.32 pp 114)
151
© D. Wong 2003
2002
Third Normal Form (3NF)






Definition:
A relation R is in 3NF if: whenever there is a nontrivial
dependency A1A2…An  B holds for R, either
{A1,A2,…,An} is a superkey for R, or B is a member of
some key (i.e. B is prime).
BCNF with a relaxed condition
Some redundancy might be left in 3NF if resulting relations
are not in BCNF
Preserves all the FD’s
Process similar to BCNF with the added condition
To recover information (I.e. Movie) from the new relations:
natural join the new relations. Does not lose information.
152
© D. Wong 2003
2002
Week 9
153
© D. Wong 2003
2002
Normalization

Purpose: process to eliminate redundancy in
relations due to functional or multi-valued
dependencies.

Decompose relation schema into Normal forms:
– Boyce-Codd Normal Form (BCNF)
– Third Normal Form (3NF)
– Fourth Normal Form (4NF)

To obtain the new relations, project the schemas
onto the original relation schema (e.g. Movie)

To recover information (I.e. Movie) from the new
relations: natural join the new relations.
154
© D. Wong 2003
2002
BCNF Decomposition Example 3.24 pp 104

Relation: Movie(title, year, length, filmType, studioName,
starName)

Key: {title, year, starName}


FD’s: title year  length filmType studioName is a BCNF
violation, so Movie not in BCNF
Decomposition:
Schema 1: {title, year, length, filmType, studioName}
Schema 2: {title, year, starName}

To obtain the new relations, project the schemas onto Movie

To recover information (I.e. Movie) from the new relations:
natural join the new relations. Does not lose information.
155
© D. Wong 2003
2002
Functional Dependencies (FD)

Given: relation schema R(A1, …, An), and X and
Y be subsets of (A1, … An).
FD : X  Y means X functionally determines Y
e.g. A1A2…An  B1B2…Bm

A1A2…An  B1B2…Bm is an assertion about R
that two attributes or sets of attributes in R are
dependent of one another.
156
© D. Wong 2003
2002
Mutivalued Dependencies (MVD)
relation schema R, and A1A2…An and B1B2…Bm be
subsets of attributes of R.
 Given:
MVD : A1A2…An  B1B2…Bm holds in R if :
For each pair of tuples t and u of relation R that agree on
all the A’s, we can find in R some tuple v that agrees:
1. With both t and u on the A’s,
2. With t on the B’s, and
3. With u on all attributes of R that are not among the
A’s or B’s
 A1A2…An  B1B2…Bm is an assertion about R that two
attributes or sets of attributes in R are independent of one
another.
 Cause redundancy not related to FD’s in a BCNF schema.
 Most common source: putting 2 or more many-many
relationships in a single relation.
157
© D. Wong 2003
2002
MVD Rules

Trivial dependencies rule
If A1A2…An  B1B2…Bm holds for R, then A1A2…An
 C1C2…Ck holds where the C’s are the B’s + one or
more of the A’s. The converse also hold.

Transitive rule
If A1A2…An  B1B2…Bm and B1B2…Bm  C1C2…Ck
then A1A2…An  C1C2…Ck

Splitting rule does not hold
E.g. name  street city, but not name  street
So, always start with set of attributes on the R.S. because
splitting rule does not hold.
158
© D. Wong 2003
2002
More MVD Rules

Every FD is an MVD
Because If FD A1A2…An  B1B2…Bm, then swapping B’s
between tuples that agree on A’s doesn’t create new tuples.
A’s
B’s
t
u

Complementation rule
If X  Y, then X  Z, where Z is all attributes not in
X or Y
e.g. Star_Star_In {name, street, city, title, year}
name  street city
name  title year
159
© D. Wong 2003
2002
Nontrivial MVD
A1A2…An  B1B2…Bm for a relation R is
nontrivial if:
1.
B1B2…Bm is not a subset of A1A2…An
2.
A1A2…An  B1B2…Bm is not all attributes of R
160
© D. Wong 2003
2002
Fourth Normal Form (4NF)

Decompose relations that has MVD’s into 4NF to
eliminate MVD’s.

Definition:
R is in 4NF if A1A2…An  B1B2…Bm is a
nontrivial MVD, {A1A2…An} is a superkey.

Since every FD is an MVD, so 4NF is more
stringent than BCNF

Only nontrivial MVD’s has the potential to violate
4NF
161
© D. Wong 2003
2002
4NF Decomposition
Given: relation R, and nontrivial MVD X  Y that violate
4NF
1.
Decompose X  Y into XY and X  (R-Y)
R
X
Y
2.
Produce the relations by projecting R onto XY and
 (R-Y)
X
3.
Reconstruct R from the new relations using natural join
e.g. Star_Star_In {name, street, city, title, year} and
name  street city
Decompose Star_Star_In using name  street city into
{name, street, city} and {name, title, year}
162
© D. Wong 2003
2002
Relationships among normal forms

4NF is the most stringent

4NF  BCNF  3NF
163
© D. Wong 2003
2002
Lossless-join decomposition
Given: Relation R, decomposed into schemes R1, R2, …
Rk, and D is a set of dependencies.
Definition: R1, R2, … Rk is a lossless-join (w.r.t. D) if for
every relation r for R satisfying D:
r = R1(r)
R2(r)
…
Rk(r)
i.e. Every relation r for R is the natural join of
its projections onto the Ri’s.
The lossless-join property is necessary if the decomposed
relation is to be recoverable from its
decomposition.
However, joins are expensive. So, don’t over decompose!
164
© D. Wong 2003
2002
Structured Query Language (SQL)

A DDL and DML for relational DBMSs

History: ANSI SQL, , SQL-92 (SQL2), SQL-99 (SQL3)

SQL-99 extends SQL2 with object-relational features and
other new features

Most DBMS vendors implements the core, and then add
bells and whistles and variations

Query capability is close to relational algebra, with lots of
extensions.

Case insensitive except characters inside quoted strings ' '
e.g. 'Smith'  'SMITH'

; as statement delimiter
165
© D. Wong 2003
2002
Example database schema
Movie(title, year, length, inColor, studioName, producerC#)
StartIn(movieTitle, movieYear, starName)
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, cert#, netWorth)
Studio(name, address, presC#)
166
© D. Wong 2003
2002
SQL Quries – basic form
SELECT attribute/s
FROM relations / views /subqury
WHERE conditional expression;
167
© D. Wong 2003
2002
SQL query examples
1.
Example 1:
SELECT *
FROM Movie;
2.
-- * => all attributes of Movie
Example 2:
SELECT *
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
3.
Example 3:
SELECT title, length
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
168
© D. Wong 2003
2002
Duplicates

SQL generally operates using bags instead of sets

Exception: UNION, INTERSECT, EXCEPT
operation

To eliminate duplicates, add keyword DISTINCT
to the SELECT clause
e.g. SELECT DISTINCT starName
FROM StarsIn;

Duplicate elimination is costly. Use judiciously.
169
© D. Wong 2003
2002
SQL Correspondence to Relational Algebra
SELECT L
--  R.A. project
FROM R
--  R.A. operands
WHERE C ;
--  R.A. select
R.A. expression: L(sC(R))
When reading and writing queries:
1.
FROM
-- what relations are involved
2.
WHERE
-- what's the tuples selection criteria
3.
SELECT
-- what columns to output
170
© D. Wong 2003
2002
Union, Intersection, Difference of Queries

UNION : R1 UNION R2
or (Q1) UNION (Q2)
e.g. (SELECT title, year FROM Movie)
UNION
(SELECT movieTitle AS title, movieYear AS
year FROM StarsIn);

INTERSECT : R1 INTERSECT R2 or
(Q1) INTERSECT (Q2)

EXCEPT: R1 EXCEPT R2
-- difference
(Q1) EXCEPT (Q2)
171
© D. Wong 2003
2002
Union, Intersection, Difference of Queries (continued)

Q1 and Q2 are queries that produce relations

R1 and R2, or results of Q1 and Q2 should have
the same list of attributes and attribute types.
Rename if necessary.

Duplicates are eliminated automatically

Add the keyword ALL after UNION,
INTERSECT, or EXCEPT to prevent duplicates
elimination
172
© D. Wong 2003
2002
SQL and Relational Algebra

The six independent operations are implemented
by SQL

SQL is relational complete
173
© D. Wong 2003
2002
Some data values in SQL
1.
Strings
2.
Dates and Times
3.
Null values
4.
Truth value of Unknown
174
© D. Wong 2003
2002
1. Strings

Comparison operators (according to lexicographical order)
<, >, <=, >= =

LIKE -- pattern matching
% -- matches any sequence of 0 or more characters
_ -- matches any one character
E.g.: title LIKE 'Star _ _ _ _'
E.g.: title LIKE '%''s%'
Can specify escape character
 E.g.
title LIKE 'x%%x%' ESCAPE 'x'
175
© D. Wong 2003
2002
2. Dates and Times

Date constant: DATE '2002-10-01'

Time constant: TIME '15:00:02.5'

Timestamp (combines dates and times):
TIMESTAMP '2002-10-01 15:00:02.5‘
(beware of implementation differences!)

Comparison operators apply
176
© D. Wong 2003
2002
3. Null Values





NULL to represent:
1. Value unknown
2. Value inapplicable
3. Value withheld
Operations involving NULL
1. Arithmetic operation: result is NULL
2. Comparison: result is UNKNOWN
NULL is not a constant, therefore NULL cannot be used
explicitly as an operand.
IS NULL and IS NOT NULL checks
Read "Pitfalls Regarding Nulls" pp. 250
177
© D. Wong 2003
2002
4. UNKNOWN

Consider TRUE = 1, FALSE = 0, UNKNOWN =
0.5
1. AND of 2 truth-value = min. of the 2 values
2. OR of 2 truth-value = max. of the 2 values
3. Negation of v = 1-v

Refer Fig. 6.2 pp. 250 for truth table for 3-valued
logic
178
© D. Wong 2003
2002
The Six Clauses in SQL Queries
1.
SELECT
-- required
2.
FROM
-- required
3.
WHERE
4.
GROUP BY
5.
HAVING
6.
ORDER BY

Subqueries may appear in the FROM clause and the
WHERE clause

Comments begins with ‘--’
-- if used, must follows a group by clause
179
© D. Wong 2003
2002
Table level SQL (ref. 6.6, pp. 292)

Create table – to define the schema of a base table
(Ref. 6.6.1 for data types syntax)
E.g.
create table EMP
(
empno int not null,
lastName varchar(30) not null,
firstName varchar(30) not null,
num_of_children int,
constraint pk_EMP primary key (empno)
);

Drop table – to destroy a base table
e.g. drop table EMP;
180
© D. Wong 2003
2002
Tuple Modification Statements (ref. 6.5, pp. 286)

Insert – to add a row
Syntax: insert into R(A1..An) values (v1…vn)
– E.g. insert into emp(empno, lastName, firstName,
num_of_children) values (12345, ‘Doe’, ‘John’, 1)
– Or insert into emp values (12345, ‘Doe’, ‘John’, 1)


Delete – to remove a row
Syntax: delete from R where <condition>
– E.g. delete from emp where empno = 12345
Update – to modify the contents of a row
Syntax: update R set Ai = value where Aj = targetValue
– E.g. update emp set num_of_children = 2 where empno =
12345
181
© D. Wong 2003
2002
Some JOINS in SQL. (ref. pp. 270)

CROSS JOIN
--  R.A. cartesian product
e.g. Movie CROSS JOIN StarsIn;

JOIN … ON
--  R.A. theta-join
e.g. Movie JOIN StarsIn ON title = movieTitle AND year =
movieYear;

[NATURAL] JOIN
--  R.A. natural join
e.g. MovieStar NATURAL JOIN MovieExec; or
MovieStar JOIN MovieExec;

OUTERJOINS
-- joins that include dangling tuples
182
© D. Wong 2003
2002
OUTERJOINS

An operator to augment the result of a join by the
dangling tuples, padded with null values.

Full outerjoin of R1 and R2 is a join that includes all
rows from R1 and R2 matched or not. Unmatched rows
are padded with NULLs.

LEFT outerjoin of R1 and R2 is a join that includes all
rows from R1, matched or not, plus the matching values
from R2. Unmatched rows are padded with NULLs.

RIGHT outerjoin of R1 and R2 is a join that includes all
rows from R2, matched or not, plus the matching values
from R1. Unmatched rows are padded with NULLs.

The joining may be NATURAL or theta join
183
© D. Wong 2003
2002
Outerjoins Syntax
1.
R1 NATURAL {FULL | LEFT | RIGHT} OUTER
JOIN R2;
E.g. 1. MovieStar NATURAL FULL OUTER
JOIN MovieExec;
E.g. 2. MovieStar NATURAL LEFT OUTER
JOIN MovieExec;
E.g. 3. MovieStar NATURAL RIGHT OUTER
JOIN MovieExec;
184
© D. Wong 2003
2002
Outerjoins Syntax (continued)
1.
R1 {FULL | LEFT | RIGHT} OUTER JOIN R2
ON conditional expression;
E.g. 1. Movie FULL OUTER JOIN StarsIn ON
title = movieTitle AND year = movieYear;
E.g. 2. MovieStar LEFT OUTER JOIN StarsIn
ON title = movieTitle AND year = movieYear;
E.g. 3. MovieStar RIGHT OUTER JOIN StarsIn
ON title = movieTitle AND year = movieYear;
185
© D. Wong 2003
2002
Use result of joins as subqueries in queries

E.g.
SELECT title, year, length, inColor, studioName,
producerC#, starName
FROM Movie JOIN StarsIn ON
title = movieTitle AND year = movieYear;
186
© D. Wong 2003
2002
Week 10
187
© D. Wong 2003
2002
SQL (continued)

SQL continued

Oracle Sequence

Views
188
© D. Wong 2003
2002
SQL Correspondence to Relational Algebra
SELECT L
--  R.A. project
FROM R
--  R.A. operands
WHERE C ;
--  R.A. select
R.A. expression: L(sC(R))
When reading and writing queries:
1.
FROM
-- what relations are involved
2.
WHERE
-- what's the tuples selection criteria
3.
SELECT
-- what columns to output
189
© D. Wong 2003
2002
SQL and Relational Algebra

The six independent operations are implemented
by SQL

SQL is relational complete
190
© D. Wong 2003
2002
Some data values in SQL
1.
Strings
2.
Dates and Times
3.
Null values
4.
Truth value of Unknown
191
© D. Wong 2003
2002
4. UNKNOWN

Consider TRUE = 1, FALSE = 0, UNKNOWN =
0.5
1. AND of 2 truth-value = min. of the 2 values
2. OR of 2 truth-value = max. of the 2 values
3. Negation of v = 1-v

Refer Fig. 6.2 pp. 250 for truth table for 3-valued
logic
192
© D. Wong 2003
2002
SQL data types (pp. 292, 6.6.1)

Character strings:
 char(n)
-- fixed length
 varchar(n)
-- varying length

Integer
 int or integer
Floating-point numbers
 float, real



 dec (m, n), or decimal(m, n) -- floating point number
with fixed decimal places, where m > n.
e.g. dec(7, 2)
Dates and times
 date, time, timestamp, datetime
Refer to handout for SQL more data types description.
193
© D. Wong 2003
2002
SQL syntax: the Six Clauses in SQL Queries
1.
2.
3.
4.
5.
6.

SELECT
FROM
WHERE
-- required
-- required
GROUP BY
HAVING
-- if used, must follows a group by
clause, to choose the groups based on some aggregate
properties of the group itself
ORDER BY
-- to sort the output
E.g.
Select name, sum(length) from MovieExec, Movie
Where producerC# = cert#
group by name having min(year) < 1930
Order by name desc
194
© D. Wong 2003
2002
SQL syntax (continued)

Subqueries may appear in the FROM clause and
the WHERE clause

Comments begins with ‘--’

Tuple variables : an alias to a Relation, defined in
the from clause
– E.g.
Select Star1.name, Star2.name
From MovieStar Star1, MovieStar Star2
Where Star1.address = Star2.address and
Star1.name < Star2.name
195
© D. Wong 2003
2002
Interpreting Multi-relation Queries in SQL

To define the meaning of select-from-where
MovieStar(name, addr, gender, bDate)
name
addr
gender
bDate
J. Anniston
Addr ABC
F
8/88/88
H. Ford
Addr LMN
M
7/7/77
B. Pitt
Addr ABC
M
8/1/88
SELECT Star1.name, Star2.name
FROM MovieStar Star1, MovieStar Star2
WHERE Star1.addr = Star2.addr AND Star1.name <
Star2.name;
196
© D. Wong 2003
2002
3 Equivalent ways of interpretataion
1.
2.
Nested Loops
Parallel assignment
– No explicitly created nested loops
–
–
–
–
Consider all possible assignments of tuples from the
relations to the tuple variables
Each assignment that produce a true value from the
WHERE clause contribute a tuple to the answer
Construct the resulting tuples using the attributes of
the SELECT clause
Similar to the tuple based Datalog rule evaluation, the
WHERE clause serves the purpose of the arithmetic
subgoals
197
© D. Wong 2003
2002
Evaluation of Rules – Tuple based

Consider all assignments of tuples to subgoals that make
each subgoal true. If the variables are assigned consistent
values, add the head to the result.

Start with non-negated relational subgoals only
Example:
– Start with R(x,z) and R(z, y). Possible tuple assignments
to subgoals:
R(x, z)
R(z, y)

– Only 1 assignment gives
consistent value to z
(1, 2)
(1, 2)
(1, 2)
(2, 3)
– It also makes the third
subgoal true
– So, S = {(1, 3)}
(2, 3)
(1, 2)
(2, 3)
(2, 3)
198
© D. Wong 2003
2002
3. Conversion to R.A

FROM : Cartesian product (rename if necessary)

WHERE : as the selection criteria

SELECT : specify the projection operation
attributes
199
© D. Wong 2003
2002
Unintuitive Result of SQL Interpretation
Given: unary relations R, S, T
To compute: R (ST)
Consider query:
SELECT R.A
FROM R, S, T
WHERE R.A = S.A OR R.A = T.A;
Consider the case if T is empty.
200
© D. Wong 2003
2002
Some JOINS in SQL. (ref. pp. 270)

CROSS JOIN
--  R.A. cartesian product
e.g. Movie CROSS JOIN StarsIn;

JOIN … ON
--  R.A. theta-join
e.g. Movie JOIN StarsIn ON title = movieTitle AND year =
movieYear;

[NATURAL] JOIN
--  R.A. natural join
e.g. MovieStar NATURAL JOIN MovieExec; or
MovieStar JOIN MovieExec;

OUTERJOINS
-- joins that include dangling tuples
201
© D. Wong 2003
2002
OUTERJOINS

An operator to augment the result of a join by the
dangling tuples, padded with null values.

Full outerjoin of R1 and R2 is a join that includes all
rows from R1 and R2 matched or not. Unmatched rows
are padded with NULLs.

LEFT outerjoin of R1 and R2 is a join that includes all
rows from R1, matched or not, plus the matching values
from R2. Unmatched rows are padded with NULLs.

RIGHT outerjoin of R1 and R2 is a join that includes all
rows from R2, matched or not, plus the matching values
from R1. Unmatched rows are padded with NULLs.

The joining may be NATURAL or theta join
202
© D. Wong 2003
2002
Outerjoins Syntax
1.
R1 [NATURAL] {FULL | LEFT | RIGHT}
OUTER JOIN R2;
E.g. 1. MovieStar NATURAL FULL OUTER
JOIN MovieExec;
E.g. 2. MovieStar NATURAL LEFT OUTER
JOIN MovieExec;
E.g. 3. MovieStar NATURAL RIGHT OUTER
JOIN MovieExec;
203
© D. Wong 2003
2002
Outerjoins Syntax (continued)
2.
R1 {FULL | LEFT | RIGHT} OUTER JOIN R2
ON conditional expression;
E.g. 1. Movie FULL OUTER JOIN StarsIn ON
title = movieTitle AND year = movieYear;
E.g. 2. MovieStar LEFT OUTER JOIN StarsIn
ON title = movieTitle AND year = movieYear;
E.g. 3. MovieStar RIGHT OUTER JOIN StarsIn
ON title = movieTitle AND year = movieYear;
204
© D. Wong 2003
2002
Use result of joins as subqueries in queries

E.g.
SELECT title, year, length, inColor, studioName,
producerC#, starName
FROM Movie JOIN StarsIn ON
title = movieTitle AND year = movieYear;
205
© D. Wong 2003
2002
Conditions involving Relations
1.
2.
3.
4.
EXIST R – true iff R is not empty
S IN R – true iff S = one of the values in R
S NOT IN R
S > ALL R – true iff S > every value in R (> can be the
other comparison operators). Operator NOT applies.
S > ANY R – true iff S > at least 1 value in R (> can be the
other comparison operators). Operator NOT applies.
Case 1: S is scalar, R is unary
Case 2: S is a tuple, R is a multi-attribute relation (arity S
must = arity R)
When compare tuples, the first value has highest priority, and
subsequent value the next highest. E.g. (1, 7, 8) < (2, 0, 1)
Ref: 6.3-6.4 .
206
© D. Wong 2003
2002
Table level SQL (ref. 6.6, pp. 292)

Create table – to define the schema of a base table
(Ref. 6.6.1 for data types syntax)
E.g.
create table EMP
(
empno int not null,
lastName varchar(30) not null,
firstName varchar(30) not null,
num_of_children int,
constraint pk_EMP primary key (empno)
);

Drop table – to destroy a base table
e.g. drop table EMP;
207
© D. Wong 2003
2002
Tuple Modification Statements (ref. 6.5, pp. 286)

Insert – to add a row
Syntax: insert into R(A1..An) values (v1…vn)
– E.g. insert into emp(empno, lastName, firstName,
num_of_children) values (12345, ‘Doe’, ‘John’, 1)
– Or insert into emp values (12345, ‘Doe’, ‘John’, 1)

Read: P. 288 – Timing of Insertion.
Delete – to remove a row

Syntax: delete from R where <condition>
– E.g. delete from emp where empno = 12345
Update – to modify the contents of a row
Syntax: update R set Ai = value where Aj = targetValue
– E.g. update emp set num_of_children = 2 where empno =
12345
208
© D. Wong 2003
2002
Modifying Relation Schema

DROP TABLE – eliminate all data

Alternative is to ALTER TABLE:
1. To add a column
ALTER TABLE tablename ADD columnName typespec;
2. To delete a column
ALTER TABLE tablename DROP columnName;
3. To change an attribute size
ALTER TABLE tablename MODIFY (columnName
typespec);
209
© D. Wong 2003
2002
Default Value

Used when a value is not supplied for a tuple when
create or modify. System assume null when none
specified except for NOT NULL attributes.

Defined with data declarations:
e.g. Phone CHAR(16) DEFAULT ‘unlisted’
210
© D. Wong 2003
2002
Views

Relations that are defined by an expression like a
query.

Views can be queried, and can even be modified
in some cases

Declaring views in SQL:
CREATE VIEW <viewName> AS <view
definition>;

Use view to present what the user is allowed to
see

A view exists only within the environment of a
session.
211
© D. Wong 2003
2002
View Example

Given: base table
Movie(title, year, length, inColor, studioName,
presidentC#)
CREATE VIEW ParamountMovie AS
SELECT title, year
FROM Movie
WHERE StudioName = 'Paramount';
212
© D. Wong 2003
2002
SQL queries to View
1.
2.
3.
Has the same syntax as a stored relation
E.g.
Select title
From ParamountMovie
Where year = 1979;
When query to view, the tuples are obtained from the
base table.
Equivalent to the following query without view:
Select title
From Movie
Where studioName = 'Paramount' and ear = 1979;
Ref. Pp.308, 6.7.5 (Interpreting queries involving views)
213
© D. Wong 2003
2002
Update views



A view update is translated to modify the base table.
Only updatable views can be modified.
Views that can be updated:
–
–
Simple
Created by SELECT , but not SELECT DISTINCT,
some attributes from R :
1.
The WHERE clause must not involve R in a
subquery
2.
The select clause must include enough attributes
that for every tuple inserted into the view, the
other attributes can be filled out with NULL of
default value, and have a tuple of the base relation
that will yield the inserted tuple of the view.
214
© D. Wong 2003
2002
Update view problem e.g.

E.g.
INSERT into ParamountMovie VALUES ('Star
Trek', 1979);
Does not meet criteria 2 because:
The base relation Movie would have NULL rather
than 'Paramount' as StudioName
215
© D. Wong 2003
2002
Other operations on Views

DELETE tuple from a view = delete from base
table, with the condition defining the view added
to the condition of the delete statement

UPDATE view = update the base table, with the
condition defining the view added to the condition
of the update statement

DELETE view: does not affect the base table
DROP VIEW <viewName>;
216
© D. Wong 2003
2002
Oracle's Sequences

Sequential lists of numbers that are generated
automatically by the database.

Useful for creating unique surrogate key values
for primary key attribute when no desirable
attribute exist to use as the primary key

Data type of attributes using sequences is
NUMBER
217
© D. Wong 2003
2002
Oracle's Sequences (continued 1)



To create a sequence (SIMPLE FORM):
CREATE SEQUENCE <sequence name> [INCREMENT
BY <number>] [START WITH <start value number>];
e.g. CREATE SEQUENCE ClassSequence INCREMENT
BY 1 START WITH 1;
To access the next sequence value:
<sequence name>.NEXTVAL
To use <sequence name>.NEXTVAL in an insert
statement:
INSERT INTO <table name> VALUES (<sequence
name>.NEXTVAL, <field2> … );
e.g INSERT INTO class (classkey, classid, description,
section, location) VALUES (ClassSequence.NEXTVAL,
?, ?, ?, ?);
218
© D. Wong 2003
2002
Oracle's Sequences (continued 2)

To access the current sequence value:
<sequence name>.CURRVAL
may use in select, insert statements
May be used immediately after using the
NEXTVAL in the same database session.
 To view property of a sequence using the
"user_sequences" system table:
SELECT * FROM user_sequences;

To delete a sequence:
DROP SEQUENCE <sequence name>;
219
© D. Wong 2003
2002
Week 11
220
© D. Wong 2003
2002

Indexes

JDBC

JDBC in J2EE (Java 2 Enterprise Edition)
221
© D. Wong 2003
2002
Indexes

An index on an attribute A is a data structure to
improve query performance efficiency

Reason: not efficient to scan all tuples (for large
relations) in order to find the few that meet a
given condition

E.g.
SELECT * FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

Not part of SQL standard
222
© D. Wong 2003
2002
Indexes Typical Syntax

To create index
CREATE INDEX indexName ON R(A1,…An)
E.g.
1. CREATE INDEX YearIndex ON Movie(year);
2. CREATE INDEX KeyIndex ON Movie(title, year);

Delete Index
DELETE INDEX yearIndex;
223
© D. Wong 2003
2002
Selection of Indexes

Index design requires an estimate of the typical
mix of queries and other operations on the db

Example of good use of indexes:
1. An attribute frequently compared to constant
in a where clause of a query
2. Attribute that appear frequently in join
operations
e.g. SELECT name FROM Movie, MovieExec
WHERE title = ‘status’ AND producerC#
= cert#;
224
© D. Wong 2003
2002
Decision factors

Important to strike a balance.

Factors:
1. Given attribute A, and index on A will:
–
–
Greatly speed up queries with a condition
on that attribute
May speed up joins involving A
2. Index make insertion, deletion, and updates
more complex and time-consuming
225
© D. Wong 2003
2002
Indexes (continued)

Techniques to execute SQL queries are intimately
associated with storage structures. Typically, a
relation is stored in many disk blocks.

An index is an auxiliary structure, perhaps stored
in a separate file, that support fast access to the
rows of a table.

Main cost of a query or modification is I/O:
No. of disk blocks to be read into memory and
write onto disk
226
© D. Wong 2003
2002
Index Selection Example

Given Relation: StarsIn(movieTitle, movieYear, starName)

3 operations to perform:
1. Q1: select movieTitle, movieYear From StarsIn
where starName = s;
2. Q2: select starName from StarsIn where movieTitle
= t and movieYear = y;
3. I: insert into StarsIn values(t, y, s);
where s, t, y are some constants
227
© D. Wong 2003
2002
Example's assumptions
1.
Cost for examining 1 disk block = 1 unit
2.
StarsIn is stored in 10 disk blocks
3.
Average no. of stars in a movie = 3
4.
Average no. of movies that a star appeared in = 3
5.
Tuples for a given star or movie are likely to be
spread over the 10 disk blocks of StarsIn
6.
One disk access is needed to read a block of the
index every time when the index is used to locate
tuples with a given value for the indexd
attribute(s)
228
© D. Wong 2003
2002
Example's Estimated Cost of actions
Action No Index
Star Index
Movie Index Both Indexes
Q1
10
4
10
4
Q2
10
10
4
4
I
2
4
4
6
Cost of 3
actions
2+8p1+8p2
4+6p2
4+6p1
6-2p1-2p2
Costs associated with the three actions, as a function of which
indexes are selected (Ref. Fig. 6.17 2nd ed.)
Star index is an index on StarName,
Movie index is an index on MovieTitle and movieYear.
The numbers in rows 2-5 of the table are no. of disk accesses for the action.
229
© D. Wong 2003
2002
Example's usage scenarios

The fraction of the time to do Q1 = p1, Q2 = p2,
I=1-p1-p2

Consider:
– Case 1: p1 = p2 = 0.1
– Case 2: p1 = p2 = 0.4
– Case 3: p1=0.5, p2=0.1
What is the best index strategy for each case?

Create only the index that helps the most
frequently used query type (e.g. query about stars
=> create Star index)
230
© D. Wong 2003
2002
JDBC (Java DataBase Connectivity)

An API to the database driver manager

Provides call-level interface for the execution of
SQL statements from a Java language program

Developed by SUN Microsystems

An integral part of the Java language

Java applications use the JDBC dialect of SQL,
independent of the DBMS used

Support applications to request information about
the schema from the DBMS at run time.
231
© D. Wong 2003
2002
JDBC Interaction with SQL DB
1.
Make connection to the database
2.
Create SQL statement
3.
Execute the SQL statement
4.
Result table from the SQL "select" statement is
returned as a Java object. JDBC provide
methods to access the rows and columns.
OR
5.
SQL statements return simple integer results that
represent the number of affected rows (e.g.
insert)
232
© D. Wong 2003
2002
Connecting to a db through JDBC
Driver 1
Application
Driver
Manager
Driver 2
Driver 3
DBMS
server
JDBC Modules
233
© D. Wong 2003
2002
JDBC - java.sql package


Defines a collection of interfaces and classes that allow
programs to interact with db.
Interfaces for primary SQL execution:
–
Driver: supports data connection creation
–
Connection: represents connection between a java client
and and SQL database server
–
Statement: includes methods for executing text queries
–
PreparedStatement: represents a precompiled and stored
query
–
CallableStatement: used to execute SQL stored procedures
–
ResultlSet: contains the results of a query
–
ResultSetMetaData: information about a ResultSet,
including the attribute names and types
234
© D. Wong 2003
2002
Client-server Architectures
User Interface /
Application
User Interface /
Application
User and
Application tier
Middleware
Middle tier
Application
Database
Server
Database
Server
235
Database
Server tier
© D. Wong 2003
2002
JDBC in J2EE

J2EE – Java 2 Enterprise Edition, a middle layer server

Connection to DBMS using JDBC (e.g. Cloudscape,
Oracle, MS SQL)

J2EE Platform – services and architecture

Enterprise JavaBeans (EJB)
– Session Beans vs. Entity Beans

EJB access to databases using JDBC
– Database connection
– Persistence management (Entity Bean e.g.)
– Transaction management (Session Bean e.g.)
236
© D. Wong 2003
2002
J2EE Services

HTTP - enables Web browsers to access servlets
and JavaServer PagesTM (JSP) files

EJB - allows clients to invoke methods on
enterprise beans

Authentication - enforces security by requiring
users to log in

Naming and Directory - allows programs to locate
services and components through the Java
Naming and Directory InterfaceTM (JNDI) API
237
© D. Wong 2003
2002
J2EE Architecture
Ref. JavaTM 2 Enterprise Edition Developer's Guide, Figure 1-2
238
© D. Wong 2003
2002
Enterprise JavaBeans (EJB)

Server-side Java components

Contain the business logic of enterprise application

Support database access

Transactional

Multi-user secure

Managed by the EJB container

Prohibited from a set of operations
239
© D. Wong 2003
2002
Session Bean vs. Entity Bean
Session Bean
Entity Bean
Purpose
Performs a task for a Represents a business
client
entity object that exists in
persistent storage.
Shared
Access
May have one client.
May be shared by multiple
clients.
Persistence
Not persistent.
Persistent. Entity state
remains in a database.
Ref. JavaTM 2 Enterprise Edition Developer's Guide, Table 1-1
240
© D. Wong 2003
2002
EJB Access to Databases Using JDBC API

J2EE uses
1. JDBC 2.0 (java.sql) and
2. JDBC 2.0 Optional package (javax.sql)

To make a connection to database in J2EE :
1. Should not hardcode the actual name (URL) of the
database in EJB
2. Should refer to the database with a logical name
3. Use a JNDI lookup when obtaining the database
connection.
241
© D. Wong 2003
2002
Driver and Data source properties
In J2EE configuration file, resource.properties, specify:
 Driver
e.g. 1 Cloudscape that is packaged with Sun’s J2EE
jdbcDriver.0.name=COM.cloudscape.core.RmiJdbcDriver
e.g. 2 Oracle
jdbcDriver.0.name= oracle.jdbc.driver.OracleDriver
 JDBC URL
e.g. 1 Cloudscape
jdbcDataSource.0.name=jdbc/Cloudscape
jdbcDataSource.0.url=jdbc:cloudscape:rmi:CloudscapeDB;create=true
e.g. 2 Oracle
jdbcDataSource.0.name=jdbc/Oracle
jdbcDataSource.0.url= jdbc:oracle:thin:@bigoh.cis.uab.edu:1521:cs610
242
© D. Wong 2003
2002
Making a connection to database example
1. Specify the logical database name.
private String dbName = "java:comp/env/jdbc/AccountDB";
2. Obtain the DataSource associated with the logical name.
InitialContext ic = new InitialContext();
DataSource ds = (DataSource) ic.lookup(dbName);
3. Get the Connection from the DataSource.
Connection con = ds.getConnection(username, password);
243
© D. Wong 2003
2002
Specifying JNDI name for deployment Step 1: Enter the code name
244
© D. Wong 2003
2002
Step 2: Map the coded name to the JNDI name
245
© D. Wong 2003
2002
Persistence Management
Container-Managed Persistence
• Entity bean code does not contain database access calls.
• The EJB container generates the SQL statements.
Bean-Managed Persistence
• Entity bean code contains the database access calls
(SQLs) (i.e. you write the code!)
246
© D. Wong 2003
2002
Container Managed example: Product entity bean
ProductEJB.java
ProductHome.java
Product.java
ProductClient.java
Bean Managed example: Account entity bean
AccountEJB.java
AccountHome.java
Account.java
AccountClient.java
247
© D. Wong 2003
2002
248
© D. Wong 2003
2002
249
© D. Wong 2003
2002
250
© D. Wong 2003
2002
Week 12
251
© D. Wong 2003
2002
EJB Access to Databases Using JDBC API

J2EE uses
1. JDBC 2.0 (java.sql) and
2. JDBC 2.0 Optional package (javax.sql)

To make a connection to database in J2EE :
1. Should not hardcode the actual name (URL) of the
database in EJB
2. Should refer to the database with a logical name
3. Use a JNDI lookup when obtaining the database
connection.
252
© D. Wong 2003
2002
Driver and Data source properties
In J2EE configuration file, resource.properties, specify:
 Driver
e.g. 1 Cloudscape that is packaged with Sun’s J2EE
jdbcDriver.0.name=COM.cloudscape.core.RmiJdbcDriver
e.g. 2 Oracle
jdbcDriver.0.name= oracle.jdbc.driver.OracleDriver
 JDBC URL
e.g. 1 Cloudscape
jdbcDataSource.0.name=jdbc/Cloudscape
jdbcDataSource.0.url=jdbc:cloudscape:rmi:CloudscapeDB;create=true
e.g. 2 Oracle
jdbcDataSource.0.name=jdbc/Oracle
jdbcDataSource.0.url= jdbc:oracle:thin:@bigoh.cis.uab.edu:1521:cs610
253
© D. Wong 2003
2002
Making a connection to database example
1. Specify the logical database name.
private String dbName = "java:comp/env/jdbc/AccountDB";
2. Obtain the DataSource associated with the logical name.
InitialContext ic = new InitialContext();
DataSource ds = (DataSource) ic.lookup(dbName);
3. Get the Connection from the DataSource.
Connection con = ds.getConnection(username, password);
254
© D. Wong 2003
2002
Specifying JNDI name for deployment Step 1: Enter the code name
255
© D. Wong 2003
2002
Step 2: Map the coded name to the JNDI name
256
© D. Wong 2003
2002
Persistence Management
Container-Managed Persistence
• Entity bean code does not contain database access calls.
• The EJB container generates the SQL statements.
Bean-Managed Persistence
• Entity bean code contains the database access calls
(SQLs) (i.e. you write the code!)
257
© D. Wong 2003
2002
Container Managed example: Product entity bean
ProductEJB.java
ProductHome.java
Product.java
ProductClient.java
Bean Managed example: Account entity bean
AccountEJB.java
AccountHome.java
Account.java
AccountClient.java
258
© D. Wong 2003
2002
259
© D. Wong 2003
2002
260
© D. Wong 2003
2002
261
© D. Wong 2003
2002
Transaction Management
Ref. JavaTM 2 Enterprise Edition Developer's Guide
262
© D. Wong 2003
2002
J2EE Resources
• JavaTM 2 SDK, Enterprise Edition Technical Documentation
JavaTM 2 Enterprise Edition Developer's Guide
• http://java.sun.com/j2ee/j2sdkee/
• http://archives.java.sun.com/archives/j2ee-interest.html
• Developing Enterprise Applications with the JavaTM 2
Platform Enterprise Edition
http://java.sun.com/j2ee/blueprints/
• Major commercial implementations:
•WebLogic - Bea
•Websphere - IBM
265
© D. Wong 2003
2002
.Net Data Providers

Purpose:
– connect, read, and execute commands against
data sources
– other functions, such as the management of
input and output parameters, security,
transactions, and database server errors.
266
© D. Wong 2003
2002
.NET data provider classes
1.
SqlConnection
2.
SqlCommand
3.
SqlDataReader - can use for read-only
applications from a SQL Server data source
4.
SqlDataAdapter - acts as a bridge between a
remote SQL Server data source and a DataSet
class instance inside a Visual Basic .NET solution
267
© D. Wong 2003
2002
Semi-structured Data Model (Ref. 4.6)

Blends class and relation for
1. Flexibility, therefore suitable for integration of
database.
2. Serve as a document model in notation such
as XML
3. Schemaless – the data is “self-describing”, the
schema is attached to the data itself, which
can vary arbitrarily over time and within a
single db.
268
© D. Wong 2003
2002
Semi-structured Data Model Utility

An information integration tool to integrate legacy
databases

Legacy database problem:
The problem of integrating multiple databases
designed independently, and used over time for
many different applications.
269
© D. Wong 2003
2002
Semi-structured Data Representation

A db of semi-structured data is a collection of
nodes

Node types:
– Leaf – have associated data of atomic types (e.g.
numbers, strings)
– Interior – have 1 or more arcs out
– Root – an interior node has no arcs entering it,
representing the entire db (I.e. the entry point)

Every node must be reachable from the root.
270
© D. Wong 2003
2002
Semistructured Data Representation (continued)

Arc – has a label to indicate relationship between
2 nodes. Consider arc L connecting nodes M, N,
the 2 roles that an arc serves:
1. If M is an object, N is an attribute, then L
represents the name of the attribute
2. If M and N are both objects, L is the name of
the relationship from N to M

The structure is a graph, not necessarily a tree.

E.g. Figure 4.19
271
© D. Wong 2003
2002
XML






XML : eXensible Markup Language.
Tag-based notation for “marking” documents
Plain text
Formats of tag:
1. <tag> string of data </tag>
e.g. <phone>123-456-7890</phone>
2. <tag/>
<!-- empty tag-->
e.g. <flag/>
Two modes:
– Well-formed XML
– Valid XML
Comment format: <!– your comments -->
272
© D. Wong 2003
2002
HTML vs. XML

HTML tags
For describing the presentation of the data (e.g.
<B> abc </B>) by a web browser

XML tags
For describing the meaning of the data (e.g.
<phone> 123-456-7890 </phone>)
273
© D. Wong 2003
2002
Well-formed XML
Requirements:

Document begin with a declaration (prologue):
<?xml version="1.0"? STANDLONE=“yes”?>

It has a root.

Every opening tag is followed by a matching closing tag,
and the elements are properly nested inside each other.
E.g. <BODY> … </BODY>

Any attribute can occur at most once in a given opening
tag, its value must be provided and quoted.

E.g Figure 4.21
274
© D. Wong 2003
2002
Well-formed XML (continued)

Invent your own tags.

Resembles semi-structure data model:
– Schemaless
– Self-describing
– Object like
275
© D. Wong 2003
2002
Valid XML

Uses a Document Type Definition (DTD) to
specify:
– Allowable tags
– Grammar for how the tags may be nested.

More flexible than a strict-schema model
– E.g. allow optional or missing fields

More restrictive than a completely schemaless
model
276
© D. Wong 2003
2002
XML elements and Entities/Objects (1)

XML evolved from a document markup language
(SGML) rather than a database language
e.g. <Address>
John lives on
<Street> Main St </Street>
house number
<number>123</number>
In a remote township
</Address>

Mixed data and text is a hindrance for database
277
© D. Wong 2003
2002
XML elements and Entities/Objects (2)

XML elements are ordered whereas the attributes
and the tuples in a relation are not

E.g.
<Address> <Number>123-4567</Number>
<Street>Main St. </Street> <Address>
 <Address> <Street>Main St. </Street>
<Number>123-4567</Number> </Address>
But, these tuples are equivalent:
Number
Street
123-4567 Main St.
278
Street
Number
Main St.
123-4567
© D. Wong 2003
2002
XML Element types

Complex types
– Elements that contain sub-elements or carry attributes
e.g. <message to=“[email protected]"
from=“[email protected]" subject="XML">
<text>
XML basics …
</text>
</message>

Simple types
– elements that contain numbers (and strings, and dates,
etc.) but do not contain any subelements
– Built-in or author defined

Attributes always have simple types.
279
© D. Wong 2003
2002
Document Type Definition (DTD)

It’s set of rules for constructing an XML document.
– i.e. a grammar that specifies the schema of a legal
XML document

Specified by XML authors:
– As part of the document itself, or
– Stored separate from the document. The
document refers to the URL where the DTD is
stored.

An XML document that conforms to it’s DTD is
said to be valid.
280
© D. Wong 2003
2002
DTD Structure

Basic DTD structure:
<!DOCTYPE root-tag [
<!ELEMENT element-name(components)>
more elements
]>




The name of the DTD must coincide with the tag name of
the root element of the document that conforms to that
DTD.
One ELEMENT statement for each allowed tag, including
the root tag
For each tag that can have attributes, the ATTLIST
statement specifies the allowed attributes and their type.
Component of #PCDATA means simple text (i.e. no tags
nested within)
281
© D. Wong 2003
2002
DTD Structure (continue)




Component occurrence indicators:
– * : zero or more
– + : one or more
– ? : zero or one time
– | : either or
e.g. (#PCDATA | (STREET CITY)
Empty tags: <!ELEMENT elementName, EMPTY>
e.g. <!ELEMENT Flag EMPTY>
E.g. Figure 4.22 (2n. Ed.) – a DTD for movie stars
Figure 4.23 (2n. Ed.) – a document following DTD in
Figure 4.22
282
© D. Wong 2003
2002
Using a DTD

Two ways:
1. Include the DTD at the beginning of the
document
2. In the document’s opening line, refer to the
DTD that is stored separately in a file system
accessible to the application that is processing
the document.
e.g.
<?XML VERSION = “1.0” STANDALONE = “no”?>
<!DOCTYPE Stars SYSTEM “star.dtd”>
283
© D. Wong 2003
2002
Attribute Lists

Format:
<!ATTLIST ElementName
attributeName type [Modifier]
...
>
284
© D. Wong 2003
2002
Some attribute types
1.
2.
CDATA
<!-- simple text -->
ID
 An attribute of type ID must have a unique value through
out the document
 E.g. if attr1 and attr2 are ID type, then
<elt1 attr1="123"> and <elt2 attr2="123"> is illegal
 Imitate key in relational db
3.
IDREF
 An attribute of type IDREF must refer to a valid ID
declared in the same document
 Imitate foreign key
4.
IDREFS
 Refer to a list of valid IDs declared in the document
285
© D. Wong 2003
2002
Some Modifier Types

#IMPLIED
– The attribute is optional.
– Can remain unspecified

#REQUIRED
– The attribute is mandatory.
– E.g.
movieId ID #REQUIRED
286
© D. Wong 2003
2002
Example: Use DTD to alleviate the order problem
E.g.:
<!ELEMENT Person ((Name, Id, Address) |
(Name, Address, Id) | (Id, Address, Name) | (Id,
Name, Address) | (Address, Id, Name) | (Address,
Name, Id))>

But, it's awkward!
287
© D. Wong 2003
2002
XML Schema

A DDL for XML documents

Purpose: to define a class of XML documents

Instance document: an XML document that
conforms to a particular schema (schema valid).

Instances and schemas may exist as:
– documents in files
– streams of bytes sent between applications
– fields in a database record
– collections of XML "Infoset "Information Items
288
© D. Wong 2003
2002
Triggers

Available in Oracle and SQL99

Event-Condition-Action rules: define an action
the db should take when some db-related events
occurs
289
© D. Wong 2003
2002
Trigger vs. other constraints
1.
Triggers are awakened when certain events,
specified by db programmer, occur. E.g. update,
insert, delete to a relation; end of a transaction
2.
Other constraints immediately prevent the event
if the constraint is violated. For trigger, it's
condition is tested when the event occurs, if the
condition does not hold, then the action
associated with the trigger will not happen (I.e.
trigger will not be fired)
3.
If the trigger condition is true, the action is
performed by the DBMS (I.e. trigger fired). So,
it's transparent to the user
290
© D. Wong 2003
2002
Trigger Syntax
CREATE [OR REPLACE] TRIGGER <triggerName>
{BEFORE | AFTER | INSTEADOF}
{DELETE | INSERT | UPDATE [of column, …] }
[OR {DELETE | INSERT | UPDATE [of column, …] }
ON {tableName | viewName}
[REFERENCING { OLD [AS] <oldName> , NEW [AS]
<newName> …]
FOR EACH {ROW | STATEMENT}
[WHEN (condition) ] <action>;
291
© D. Wong 2003
2002
Oracle Triggers

<action> is PL/SQL block
Simplest is :
BEGIN
<SQL statement>
END;
In the PL/SQL action block, variables OLD and NEW are preceded
by : e.g. :OLD

Follow the create trigger statement with a Dot (.) and then RUN to
store the definition in the db

The action cannot change the relation that triggers the action, nor to
relations connected to the triggering relation by a constraint e.g. FK
constraint

Read 7.4.2 – 7.4.4
292
© D. Wong 2003
2002
SQL3 Triggers

<action> can be:
1.
a single SQL statement
2.
A SQL statements, separated by ; enclosed in a
BEGIN
<SQL statements>
END;
293
© D. Wong 2003
2002
Constraints Summary
1.
Primary Key declaration
2.
Foreign Key – referential integrity constraint
3.
Constraints within relations:
 Attribute constraints: 1. NOT NULL; 2.
CHECK
 Tuple based CHECKs
4.
5.
Schema level constraints – SQL2 assertions (not
in Oracle)
Triggers – Oracles's and SQL99's
294
© D. Wong 2003
2002
Week 13
295
© D. Wong 2003
2002
Java API for XML Processing (JAXP)



For processing XML data using applications written in
Java
JAXP APIs - javax.xml.parsers package
Leverages 2 parser standards :
 SAX (Simple API for XML Parsing)
 parse the data as a stream of events, a serial-access
mechanism that does element-by-element processing
 DOM (Document Object Model) (easier to use)
 build a tree structure of objects to represent the
data
 org.w3c.dom
 Detail review of the enrollment example (files posted
on the course web page as JAXP example)

http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JAXPIntro5.html
296
© D. Wong 2003
2002
Ch. 7 Constraints and Triggers

Keys as constraint

Foreign key constraints and referential integrity

Constraints on attributes

Constraints on tuples

Assertions

Triggers
297
© D. Wong 2003
2002
Keys as Constraint



One primary key per relation
Declared in CREATE TABLE
2 ways:
1. List with attribute (when only 1 attribute in key) e.g.
snum char (5) PRIMARY KEY
2. Add to the list of items declared in the schema. E.g.
CREATE TABLE supplier (
snum CHAR(3),
sname VARCHAR(15),
status INT,
city VARCHAR(15)
CONSTRAINT pk_supplier PRIMARY KEY(snum)
);
298
© D. Wong 2003
2002
Keys (continued)

Primary Key attribute cannot be NULL

2 tuples in R cannot agree on all of the attributes in the
primary key. DBMS rejects violating insertions or updates.

Can also declare a key with UNIQUE
– Can be used instead of PRIMARY KEY
– Can have multiple UNIQUE in a table
– UNIQUE attributes permit NULL

Key constraints are checked at insert and updates

Use of index created with key attributes for efficient key
enforcement checks
299
© D. Wong 2003
2002
Foreign-Key Constraints

To enforce referential integrity constraints

E.g. SPJ (type 1), S, P, J (type 2)

The reference attribute of the relation of type 2
must be declared PRIMARY or UNIQUE

Again 2 ways to declare in CREATE TABLE
1. Inline with declaration
2. Add to the list of items declared in the schema
300
© D. Wong 2003
2002
FK as list of items declared in the schema e.g.
CREATE TABLE spj (
snum CHAR(3),
pnum CHAR(3),
jnum CHAR(3),
qty INT,
CONSTRAINT fk_spj_1 FOREIGN KEY (snum) REFERENCES
S(snum) DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT fk_spj_2 FOREIGN KEY (pnum) REFERENCES
P(pnum) DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT fk_spj_3 FOREIGN KEY (jnum) REFERENCES
J(jnum)
-- this one just use default, i.e. not DEFERRABLE
);
301
© D. Wong 2003
2002
FK – maintaining referential integrity

3 Policies:
1. Default : Reject violating Modifications
2. Cascade : changes to the referenced attributes are
mimicked at the foreign key
3. Set Null : set the foreign key to null when the
referenced attributes are changed (delete / update)

Declare for delete and update separately

Declare with the declaration of foreign key

Dangling tuples – violate referential integrity constraint
for the foreign key. Dealt with the 3 policies.
302
© D. Wong 2003
2002
Maintaining referential integrity - Example
CREATE TABLE spj (
snum CHAR(3),
pnum CHAR(3),
jnum CHAR(3),
qty INT,
CONSTRAINT fk_spj_1 FOREIGN KEY (snum) REFERENCES
S(snum) ON UPDATE CASCADE DEFERRABLE INITIALLY
DEFERRED,
CONSTRAINT fk_spj_2 FOREIGN KEY (pnum) REFERENCES
P(pnum) ON DELETE SET NULL DEFERRABLE INITIALLY
IMMEDIATE,
CONSTRAINT fk_spj_3 FOREIGN KEY (jnum) REFERENCES
J(jnum)
-- just use default
);
303
© D. Wong 2003
2002
Constraints on Attributes and Tuples

Limits the values of some attributes.

Constraints within relations

Defined in relation's schema as:
1. Constraints on an attribute
2. Constraints on tuples
304
© D. Wong 2003
2002
Constraints on the attributes

Not-null constraints
e.g. pnum CHAR(3) NOT NULL,
CONSTRAINT fk_spj_2 FOREIGN KEY (pnum)
REFERENCES P(pnum) ,

Attribute-Based CHECK
– Checked whenever any tuple gets a new value for this
attribute (e.g. update, insert)
– Reject violating operations
– The check is expressed as a conditional expression
e.g. qty INT CHECK (qty <= 10000),
– Invisible to other relations (so, can be violated if the
relations referenced in the condition are changed)
– More examples in text Pp. 328 ex. 7.8, 7.9
305
© D. Wong 2003
2002
Tuple-Based CHECK Constraints

Applies to one or more attributes of a table

Checked when a tuple is inserted or updated. Reject if it's
a violating operation.

Invisible to other relations (so, can be violated if the
relations referenced in the condition are changed)

E.g.
CREATE TABLE MovieStar(
name CHAR(30) PRIMARY KEY,
gender CHAR(1),
CHECK (gender = 'F' OR name NOT LIKE'Ms.%')
);
306
© D. Wong 2003
2002
Modification of Constraints

Only to constraints that are named.

Naming of constraints examples:
1. snum char (5) CONSTRAINT pk_supplier
PRIMARY KEY ,
2. CONSTRAINT fk_spj_1 FOREIGN KEY (snum)
REFERENCES S(snum) DEFERRABLE,
3. Gender CHAR(1) CONSTRAINT NoAndro
CHECK (gender in ('F', 'M')),
307
© D. Wong 2003
2002
Modifications

Delete constraints
ALTER TABLE <tableName> DROP CONSTRAINT
<constraintName>;

Add constraint
ALTER TABLE <tableName> ADD CONSTRAINT
<constraintName> <constraint specification>;
e.g. ALTER TABLE student ADD CONSTRAINT
PK_student PRIMARY KEY (studentkey);

Set constraint – to change a deferrable constraint from
immediate to deferred and vise versa
SET CONSTRAINT <constraintName> DEFERRED;
308
© D. Wong 2003
2002
Assertions







An assertion is a schema level CHECK constraint.
Exist independent of any particular table.
It's condition can refer to any table in the db schema
Declaring an assertion:
CREATE ASSERTION <constraintName>
CHECK <condition>
[constraint attributes];
constraint attributes: [NOT] DEFERRABLE; INITIALLY
IMMEDIATE; INITIALLY DEFERRED
The constraint is violated if the condition is false
Any db modifications that causes a violation will be
rejected
309
© D. Wong 2003
2002
Triggers

Available in Oracle and SQL99

Event-Condition-Action rules: define an action
the db should take when some db-related events
occurs

Triggers are useful for:
1. To maintain complex integrity constraints
2. To audit changes made to table
3. To signal to other programs that changes were
made to a table
310
© D. Wong 2003
2002
Trigger vs. other constraints
1.
Triggers are awakened when certain events,
specified by db programmer, occur. E.g. update,
insert, delete to a relation; end of a transaction
2.
Other constraints immediately prevent the event
if the constraint is violated. For trigger, it's
condition is tested when the event occurs, if the
condition does not hold, then the action
associated with the trigger will not happen (I.e.
trigger will not be fired)
3.
If the trigger condition is true, the action is
performed by the DBMS (I.e. trigger fired). So,
it's transparent to the user
311
© D. Wong 2003
2002
Trigger Syntax
CREATE [OR REPLACE] TRIGGER <triggerName>
{BEFORE | AFTER | INSTEADOF}
{DELETE | INSERT | UPDATE [of column, …] }
[OR {DELETE | INSERT | UPDATE [of column, …] }
ON {tableName | viewName}
[REFERENCING { OLD [AS] <oldName> , NEW [AS]
<newName> …]
FOR EACH {ROW | STATEMENT}
[WHEN (condition) ] <action>;
312
© D. Wong 2003
2002
Oracle Triggers

<action> is PL/SQL block
Simplest is :
BEGIN
<SQL statement>
END;
In the PL/SQL action block, variables OLD and NEW are preceded
by : e.g. :OLD

Follow the create trigger statement with a Dot (.) and then RUN to
store the definition in the db

The action cannot change the relation that triggers the action, nor to
relations connected to the triggering relation by a constraint e.g. FK
constraint

Read 7.4.2 – 7.4.4
313
© D. Wong 2003
2002
SQL3 Triggers

<action> can be:
1.
a single SQL statement
2.
A SQL statements, separated by ; enclosed in a
BEGIN
<SQL statements>
END;
314
© D. Wong 2003
2002
Constraints Summary
1.
Primary Key declaration
2.
Foreign Key – referential integrity constraint
3.
Constraints within relations:
 Attribute constraints: 1. NOT NULL; 2.
CHECK
 Tuple based CHECKs
4.
5.
Schema level constraints – SQL2 assertions (not
in Oracle)
Triggers – Oracles's and SQL99's
315
© D. Wong 2003
2002
Ch. 8 - System Aspects of SQL

SQL in programming environments
– Statement Level Interface
Embedded
SQL, SQLJ
Dynamic SQL
– PSM
Oracle’s
PL/SQL
– Call Level Interface

SQL environment
316
© D. Wong 2003
2002
Host Language Interface

Host language: conventional language that applications are
written in. e.g C, Java, C#, Visual Basic .Net

Statement level interface
– Static: Embedded SQL, SQLJ
– Dynamic SQL

Call level interface (CLI)

Use of CLI (e.g. JDBC) for static transaction is less
efficient at run time than statement-level interfaces
because CLI’s preparation and execution generally involve
separate communications with the DBMS, which is costly.
317
© D. Wong 2003
2002
Embedded SQL

Shared variables
– Host language variables that can be read/written by
SQL statement
– Serves as interface between the host language and the
SQL execution system
– When used in embedded SQL statements, precede the
variable with a : e.g. :name

Embedded SQL statements are prefixed with EXEC SQL

Preprocessor translate the EXEC SQL statements into
function calls in host language, then compile, linked with
SQL-related library to form executable code. Fig. 8.1 pp
351
318
© D. Wong 2003
2002
Embedded SQL (continued)

SQLSTATE
– a special variable (an array of 5 char)
– Serves to connect the host-language program
with the SQL execution system
– updated each time a SQL library function is
called
– Some of the codes are:
‘00000’
= no error
‘02000’ = no tuple found
319
© D. Wong 2003
2002
Embeddable SQL statements

Any SQL statement that does not return a result (I.e. not
a select-from-where query) can be embedded. E.g. insert,
delete, create, update.

select-from-where queries are not embeddable directly.

For connecting the result of queries with a host-language
program, must use one of the following methods:
1. Use single-row select for query that produces a single
tuple : select-into-from-where. Ref. fig. 8.3 pp 355
EXEC SQL SELECT A INTO :var FROM R WHERE
<cond>;
2. Use CURSOR for queries producing more than one
tuple Fig. 8.4 pp 357
320
© D. Wong 2003
2002
CURSOR - Ref. Fig. 7.4 pp.379 (Fig. 8.4 pp 357)





An object used to store the output of a query for processing
in an application. It provides the mechanism to reference
the current position in a result set, and to do positioned
updates and deletes.
Declare cursor in embedded SQL by:
EXEC SQL DECLARE <cursor> CURSOR FOR <query>
Open cursor before use:
EXEC SQL OPEN <cursor>
Use Fetch statement to get the next tuple of the relation
over which the cursor range
EXEC SQLFETCH FROM <cursor> INTO <list of shared
variables>
SQLSTATE of ‘02000’ means no more tuples found
To close a cursor when done: EXEC SQL CLOSE
321
© D. Wong 2003
2002
Cursor options


INSENSITIVE
guarantee that changes to underlying relation made
between one opening and closing of cursor will not
affect the set of tuples already fetched.
FOR READ ONLY
DBMS will prevent modifications to the underlying
relation through this cursor

SCROLL
The cursor may be accessed with any one of the :
FETCH {NEXT / PRIOR/ FIRST / LAST /RELATVE n/
ABSOLUTE n}

ResultSet in JDBC is a cursor
322
© D. Wong 2003
2002
SQLJ


ANSI standard Statement-level Interface to JAVA
A dialect of embedded SQL that can be included in JAVA
program

Goal:
To obtain the run-time efficiency of embedded SQL for
Java applications while retaining the advantage of
accessing DBMS through JDBC

Embedded SQLJ constructs are replaced by calls to an
SQLJ run-time package, which access database using
JDBC.
Benefit: the pre-compiler (translator in Oracle) can check
SQL syntax and the number and types of arguments and
results.

323
© D. Wong 2003
2002
Differences between SQLJ, Embedded SQL and JDBC
1.
SQLJ supports essentially SQL-92, much more portable
across DBMS vendors. For embedded SQL, each DBMS
vendor supports its own proprietary version of SQL.
2.
An SQLJ clause in a JAVA program begin with #SQL
instead of EXEC SQL, and can contain select-from-where
statement inside { }
Any JAVA variables can be included as a parameter in an
SQL statement (prefix the variable with : ) , same as in
embedded SQL
In SQLJ, a query returns an SQLJ interator object
instead of a ResultSet object (as in JDBC). But, it’s
similar to ResultSet in that they provide a cursor
mechanism.
Both SQLJ statement and JDBC calls can be included in
the same Java program
3.
4.
5.
324
© D. Wong 2003
2002
Oracle’s SQLJ script

To use: sqlj file.sqlj

Many command lines options are availabe, refer to
Oracle developer guide at
http://otn.oracle.com/tech/java/sqlj_jdbc/pdf/a96655.pdf

Invokes translator to preprocess java programs
with SQLJ clauses (suffix of those files can be
.sqlj, .java). Translator generates .java file which
contains the JDBC calls for the SQL statements

Invokes javac to compile

Invokes JVM to execute
325
© D. Wong 2003
2002
Dynamic SQL

Used when the SQL statement is not known at
compile time

E.g. an application that prompt the user for an
SQL query, read, and then execute the query
(can you think of an application?)

Host language program must instruct SQL to:
1. Take a string and turn it into an executable
SQL statement:
EXEC SQL PREPARE sqlvar FROM sharedVar;
2. Execute the prepared statement:
EXEC SQL EXECUTE sqlvar;
326
© D. Wong 2003
2002
Dynamic SQL (continued)

Steps 1 and 2 can be combined by:
EXEC SQL EXECUTE IMMEDIATE sharedVar;

The 2 steps approach is beneficial if a prepared
statement is execute multiple times.

Ref. Fig. 8.7 pp. 368
327
© D. Wong 2003
2002
Week 14
328
© D. Wong 2003
2002
Persistent, Stored Modules (PSM) (Ref. 8.2 )

A way to create and store procedures or functions
with a database schema

The procedure/functions can be used in SQL
queries or other SQL statements to perform within
the database computations that cannot be
expressed in SQL query language

Like a simple, general-purpose language

SQL/PSM standard (PSM-96). Commercial
DBMS offers such capabilities with variations (e.g.
Oracle, MS SQL)
329
© D. Wong 2003
2002
PSM Modules

Collection of function and procedure definitions,
temporary relation declarations, and other declarations.
(PSM modules in Oracle are Packages)
PSM Procedure
CREATE PROCEDURE
<name> (parameters)
PSM Function
CREATE FUNCTION
<name> (parameters)
local declarations
return <type>
procedure body;
local declarations
procedure body;

Parameters: mode-name-type triples

Modes: IN, OUT, INOUT
330
© D. Wong 2003
2002
Statement Forms in PSM
1.
CALL –statement (for procedures only)
CALL <procedure name> (<parameter list>);
i.
From a host-language program:
e.g. EXEC SQL CALL proc(:x, 3); // x is a
shared variable
ii. As a statement of another PSM function or
procedure
iii. As an SQL command in an SQL client (e.g.
SQLPLUS). E.g CALL proc (1, 3)
331
© D. Wong 2003
2002
Statement Forms in PSM (continued)
2.
RETURN statement (for functions only)
3.
Declaration of local variables (precede executable
statements):
DECLARE <name> <type>;
4.
Assignment statement:
SET <variable> = <expression>;
5.
Statement group in BEGIN … END block
6.
Statement label -
7.
Branching statement (ref. Fig. 8.9, 8.10 )
labelName:
332
© D. Wong 2003
2002
Statement Forms in PSM (continued 2)
8.
9.
Queries (select-from-where):
Loops
a) LOOP <statement list> END LOOP;
b) Only to iterate over a cursor:
FOR <loop name> AS <cursor name> CURSOR FOR
<query> DO
<statement list>
END FOR
c) WHILE <condition> DO <statement list> END
WHILE;
d) REPEAT <statement list> UNTIL <condition> END
REPEAT;
333
© D. Wong 2003
2002
PL/SQL of Oracle

Oracle’s procedural language (PL). It’s a superset
of SQL.

Used to create stored procedures/functions and
packages, to trigger database events, or to add
programming logic to SQL commands

Ref. : Ullman and students’s introduction
http://www-db.stanford.edu/~ullman/fcdb/oracle/or-plsql.html
334
© D. Wong 2003
2002
Call-level Interface (CLI)

The application is written entirely in the host language.

SQL statements are the values of string variables
constructed at run time

The string variables are passed as arguments to host
language procedures provided by the CLI library of
functions.

No special syntax, no pre-compiler is needed.

System independence – in principle!

SQL 99's CLI standard is an adaptation of ODBC (Open
Database Connectivity) and the host language is C

SQL/CLI in Java is JDBC (discussed in week 9)
335
© D. Wong 2003
2002
The SQL Environment

It's the framework in which data may exist and
SQL operations on data may be executed. I.e. a
DBMS running at an installation.

SQL standard of elements hierarchy: Fig. 8.15
2nd ed
1. Schema – collection of tables, views,
assertions, triggers, PSM, modules, etc.
2. Catalogs – collections of schemas
3. Clusters – collections of catalogs. Each user
has an associated cluster.
336
© D. Wong 2003
2002
The SQL Environment (continued)



SQL clients and servers – processes that runs of the same
or different hosts.
– Server : support operations on database elements
– Client : allow a user to connect to a server and operate
on a database (e.g. SQLPLUS)
Connection – a "connection" must be made between the
server and the client by executing the CONNECT
statement on the client host. Default connection is made by
an SQL client (e.g. SQLPLUS).
Session – SQL operations that are performed while a
connection is active. Each session has a current catalog,
schema, and an authorized user. Reminder: temporary
elements such as view exists only within the environment of
a session.
337
© D. Wong 2003
2002
The SQL Environment (continued 2)

Modules
–
An SQL term for an application program
–
3 kinds of modules in SQL standard:
1.
2.
3.
–
Generic SQL interface, e.g. SQLPLUS
Embedded SQL
True modules, e.g. PSM
SQL agent – a SQL module in execution Ref.
Fig. 8.17 pp. 384
338
© D. Wong 2003
2002

SQL Transactions
– Example in JDBC

Security and User Authorization in SQL

Object Oriented Model
– ODL
– OQL

Object Relational Systems

Recursion in SQL
339
© D. Wong 2003
2002
Transactions in SQL (8.6 - 2nd ed.)


Concerns with enforcing the ACID properties of transactions
Review: ACID properties of “proper” execution:
– Atomicity : All of the updates of a transaction are
successful, or no update take place
– Consistency: Each transaction should leave the database in
a consistent state
– Isolation: Each transaction, when executed concurrently
with other transactions, should have the same effect as if it
had been executed by itself (serializable behavior)
– Durability: Once a transaction has completed successfully,
its changes to the database should be permanent. Even
serious failures should not affect the permanence of a
transaction.
340
© D. Wong 2003
2002
Serializability and Atomicity

Serializable function execution:
If the function executions behave as if they were
run serially, even though their execution may
overlap in time.

Example of problem with concurrent function
executions that are not serialized:
Ex. 8.26, Fig. 8.22 and Fig. 8.23 pp. 398

Example of problem when failure (system,
network, power) occurs during a database
operation:
Ex. 8.27, Fig. 8.24 pp. 400
341
© D. Wong 2003
2002
Transactions

To solve the serializability and atomicity problem.

Def.: A collection of one or more operations on the
database that must be executed atomically, (i.e.
either all are done or none are) and in a serializable
manner (i.e. isolation).

End of a SQL transaction is marked by either
operations:
1. COMMIT - the database will be permanently
updated
2. ROLLBACK – no changes by the transaction
appear in the database (abort the transaction)
342
© D. Wong 2003
2002
Transactions (continued)

In query interface (e.g. SQLPLUS):
Transactions are single queries or modification
statements.

In program interface (e.g. JDBC):
1. Single SQL statement: use automatic commit mode. A
transaction:

starts when the statement begins execution

ends when an automatic commit or rollback is
completed
2. More than 1 SQL statement:
–
Disable auto-commit
–
After all SQL statements in a transaction are
executed, call commit; or rollback if error occurs
343
© D. Wong 2003
2002
Deferred constraints

When the commit method is executed, if there are
deferred constraints that need to be checked, and
these constraints are now violated, then the
transaction is not committed.

Instead, the transaction is rolled back and the
error is reported via the SQLSTATE variable.

In JDBC, this SQLSTATE is reported via the
SQLException object's getSQLState() method.
344
© D. Wong 2003
2002
Code segment to show transaction control in JDBC:
try { // ... add code here for making connection ...
PreparedStatement updateSales = con.prepareStatement("update COFFEES " +
"set SALES = ? where COF_NAME like ?");
PreparedStatement updateTotal = con.prepareStatement( "update COFFEES " +
"set TOTAL = TOTAL + ? where COF_NAME like ?") ;
con.setAutoCommit(false);
updateSales.setInt(1, 50);
updateSales.setString(2, "Colombian");
updateSales.executeUpdate();
updateTotal.setInt(1, 50);
// first update
updateTotal.setString(2, "Colombian");
updateTotal.executeUpdate();
// second update
con.commit();
con.setAutoCommit(true);
} catch(SQLException ex) {
if (con != null) {
try { System.err.print("Error detected! Transaction is being rolled back!");
con.rollback();
} catch(SQLException excep) {
System.err.print("SQLException: " + excep.getMessage());
} } // end if there is a connection
}
345
© D. Wong 2003
2002
Transaction Isolation Levels


Determine whether and how a transaction will be affected
by concurrent transactions – transactions from different
SQL sessions that are accessing the same data.
SQL isolation levels are to address:
1. Dirty Read: a read of dirty data. Dirty data is data
written by a transaction that has not yet committed.
2. Non-repeatable Read: One transaction reads a row.
Another deletes or modifies it and COMMIT before
the first one does. Now, if the first transaction
perform the same read again will get different result.
3. Phantom tuples: tuples that are inserted into the
database by another transaction while one's
transaction is running.
346
© D. Wong 2003
2002
Isolation levels in JDBC

The 5 isolation levels:
1. TRANSACTION_NONE - transactions not supported (This level is
not in the SQL standard)
2. TRANSACTION_READ_UNCOMMITTED - allows dirty reads,
nonrepeatable reads, and phantom reads
3. TRANSACTION_READ_COMMITTED - prevents dirty reads,
but does not prevent nonrepeatable reads, and phantom reads
4. TRANSACTION_REPEATABLE_READ - disallow dirty reads,
nonrepeatable reads, but allow phantom reads
5. TRANSACTION_SERIALIZABLE - disallow dirty reads,
nonrepeatable reads, and phantom reads (The highest level, the
default)

Connection methods to get and set them:
1.
2.
getTransactionIsolation() - to find out the transaction isolation
level the database is set
setTransactionIsolation() - to set the appropriate isolation level
347
© D. Wong 2003
2002
Choosing Isolation Levels

A choice of compromise between performance and
a slightly inconsistent view of data (because data is
being updated concurrently by other users).

Ref. Ex. 8.32, 8.33 pp. 408
348
© D. Wong 2003
2002
DBMS Techniques to enforce ACID

Locking – granularity of locks is important. Locks
are obtained at the beginning of a transaction.
Locks are released at the end of commit or
rollback.

Logging – write a log to nonvolatile storage.
Assure durability.

Transaction Commitment – for durability and
atomicity, transactions are computed
“tentatively”, recorded, but no changes are made
to the db until the transaction gets committed.
Changes are copied to the log, then copied to db.
349
© D. Wong 2003
2002
J2EE Transaction Management in Enterprise JavaBeans (EJBs)
Ref. JavaTM 2 Enterprise Edition Developer's Guide
350
© D. Wong 2003
2002
Container-Managed Transactions (see note page of this
slide)
Description
• Code does not include statements that begin and end the transaction
• Immediately before an EJB method starts - transaction begins
• Just before the method exits - commits
• Each method can be associated with a single transaction
Prohibited methods, e.g.:
• commit, setAutoCommit, and rollback methods of
java.sql.Connection
Limitation:
• When a method is executing, it can be associated with either a single
transaction or no transaction at all
351
© D. Wong 2003
2002
Bean Managed Transaction (see note page of this slide)
• Session bean code invokes methods that mark the boundaries of
the transaction - setAutoCommit(); commit(); rollback();
• An entity bean may not have bean-managed transactions
Example:
public void ship (String productId, String orderId, int quantity) {
try {
con.setAutoCommit(false);
updateOrderItem(productId, orderId);
updateInventory(productId, quantity);
con.commit();
} catch (Exception ex) {
try {
con.rollback();
throw new EJBException("Transaction failed: " + ex.getMessage());
} catch (SQLException sqx) {
throw new EJBException("Rollback failed: " + sqx.getMessage());
}}}
Ref.
JavaTM
352
2 Enterprise Edition Developer's Guide, JDBC Transaction
© D. Wong 2003
2002
Week 15
353
© D. Wong 2003
2002
Security and User Authorization in SQL
8.7 pp. 410

Authorization ID = user name

Special authorization ID: PUBLIC

Privileges for:
SELECT, INSERT, UPDATE, DELETE,
REFERENCE, USAGE, TRIGGER,
EXECUTE, UNDER

For SELECT, INSERT, UPDATE, may also
specify on attribute level

Privileges are needed for relations in the
subqueries also. e.g. Fig. 8.25 pp 411
354
© D. Wong 2003
2002
Creating privileges

Owner of schema or modules has all privileges

Establish ownership at:
1. When a schema is created.
2. When a session is initiated by a CONNECT
statement.
e.g. CONNECT TO ABC_server AS conn1
AUTHORIZATION smith;
3. When a module is created, use an optional
AUTHORIZATION clause
355
© D. Wong 2003
2002
Granting privileges

Owner of a relation has GRANT privilege.

If you have the "GRANT" privilege to a set of privileges, you
may grant them to any user.
GRANT <privilege list> ON <database element>
TO <user list> [WITH GRANT OPTION]
e.g.
GRANT SELECT, INSERT ON Studio TO kirk, picard
WITH GRANT OPTION;
-- by Janeway
GRANT SELECT, INSERT ON Studio TO sisko; -- by picard
GRANT SELECT, INSERT(name) ON Studio TO sisko; -- by kirk

Grant diagram e.g. Fig. 8.26 pp. 417
356
© D. Wong 2003
2002
Revoking Privileges

Privileges can be revoked:
REVOKE [GRANT OPTION FOR] <privilege list> ON <database
element> FROM <user list> {CASCADE | RESTRICT}
e.g.
REVOKE SELECT, INSERT ON Studio FROM picard CASCADE ;

If A has been given a privilege by several different people
on the same element, then all of them have to revoke in
order for A to lose the privilege

If A granted privilege P to B, who granted P to C, then A
revokes P from B will also revoke P from C. e.g. Fig 8.29 pp
420
357
© D. Wong 2003
2002
Object-Oriented Data Model


ODMG
– Object Database Management Group
– Deals with OO standard for database
– Also deals with ORDBMS (Object Relational DBMS)
Major parts of ODMG standard:
– ODL: Object Definition Language, how to specify the
db schema
– OQL: the SQL-like Object Query Language
– Host language binding: how to use ODL and OQL from
within procedural languages. The standard define
bindings for C++, SmallTalk, and Java. In ODMG, the
host language also serves as the object manipulation
language.
358
© D. Wong 2003
2002
ODMG database management system







Application is written in a host language e.g. C++, Java
In order to access the db, the application must be linked with
the ODBMS libraries and with the code that implements its
class methods.
Much of the code that manipulates objects is part of the
database itself.
Each class has a set of methods. Method signatures are
specified in the schema using ODL.
The code for these methods is stored on the database server.
ODBMS invokes the appropriate code whenever a method is
called.
OODMG database data is modified directly in the host
language
e.g. Stud.Name = "Joe";
// Stud contains the oid of a
// persistent Student object
359
© D. Wong 2003
2002
Architecture of an ODMG database
Schema Spec. in
ODL(Embedded in
C++, Java, etc)
ODL
Preprocessor
Source code for class
methods in host language
(C++, Java, …)
Host language
compiler
ODBMS
Software
Method
Implementation
Obj. code
ODBMS
Libraries
Linker
Information stored at the Server
Metadata
Data Access
Method Implementation
Binaries Stored in DBMS
Object Data
360
Ref. "Databases and Transaction Processing" – Lewis, Addison Wesley
© D. Wong 2003
2002
Structure of ODMG Applications
Application source
code in host language
ODBMS
Host language
compiler
ODBMS library
Application
Object code
Method
implementation
binaries stored in
DBMS
Linker
Executable
code
361
Ref. "Databases and Transaction Processing" – Lewis, Addison Wesley
© D. Wong 2003
2002
Object Definition Language (ODL)

Conceptual model to describe the attributes, methods,
and relationships of each object type (class), including it's
inheritance properties.

ODL classes describes 3 kinds of elements:
1. Attributes: values associated with the object
2. Relationship: connection between the object itself and
other objects
3. Methods: functions that may be applied to objects of
the class.
Methods are specified by it's signature: name,
arguments (names, order, and type), return value
type, name of any exceptions it can raise.
e.g. Fig. 4.2 pp137
362
© D. Wong 2003
2002
Object Definition Language (ODL) (continued)

Class declaration

Class include:
1. Class Name
2. Key declaration(s). Optional.
3. Extent Declaration = name for the set of currently
existing objects of a class (I.e. relation instance in
relational model)
4. Element declarations: attributes, relationships,
methods
class <name> [(extent names)]
{ < list of elements> }
363
© D. Wong 2003
2002
Object Definition Language (ODL) (continued 2)

Attribute declaration (non-objects):
attribute <type> <name>;
e.g. 1 attribute string name;
e.g. 2 attribute Struct Addr{ string street, string city}
address;

Relationship (and inverse relationship) declaration
(objects):
relationship [rangetype]<className> <name> inverse
className::<relationship name>;
e.g. relationship Set<Star> stars
inverse Star::starredIn;
364
© D. Wong 2003
2002

Method declaration
<returnType> <methodName> (arguments) raises
(<exception>);
e.g. 1: void lengthInhours() raises (noLengthFound);
e.g. 2: void starName(out Set<String>) ;

Arguments:
in : read-only
out: for returning values
inout: for both
365
© D. Wong 2003
2002
ODL Relationships

Only binary relationships supported
–

Use a connecting class to represent multiway
relationships Fig. 2.9 pp. 34.
Relationships are defined in inverse pairs. Fig.
4.3 pp 140
1. Many-many: have a set type of class in each
direction
2. Many-one: a set type for the one, and a simple
class name for the many
3. One-one: simple class name in both
366
© D. Wong 2003
2002

Subclass (S is a subclass of D)
Class C extends D { class C's declarations }
e.g. class Cartoon extends Movie {
relationship Set<Star> voices;
}

Multiple inheritance (separate the super classes by : in the
extend declaration)
e.g. class CartoonMurderMystery
extends MurderMystery : Cartoon

Name conflict resolutions with Multiple inheritance pp.
151
367
© D. Wong 2003
2002
ODL data types


Basis:
1. Atomic type: integer, float, characters, string,
boolean, enum
2. Class names
Structured types:
1. Set: Set<T> // finite sets of elements of type T
2.
3.
4.
5.
Bag: Bag<T> // finite bags of element type T
List: List<T> // finite lists of 0 or more elements T
Array: Array <T, i> // T = type, i = no. of elements
Dictionary: Dictionary <T, S>, T is key type, S is
range type. Each pair has unique key value.
6. Structures : Struct N {<type1> field1, …}
368
© D. Wong 2003
2002
Keys declaration in ODL

Optional because each object is identified by an internal
OID

May declare one or more keys in the extent declaration
e.g. class Movie
(extent Movies key (title, year))
{
attribute string title;
attribute integer year;
…
}
369
© D. Wong 2003
2002
ODL to Relational Design

Invent a new attribute to serve as key when there
is no key in the ODL design

ODL attributes that are not atomic are converted
into relation attributes that usually are redesigned
with normalization

Methods are not converted to relational design.
But can have methods in Object Relational design
370
© D. Wong 2003
2002
Object-Relational DB (ORDB)



SQL-99 adopted a limited subset of the object relational
model
ORDBMS is a conservative extension to the existing
RDBMS.
In general, ORDB consists of:
– A set of relations (which can be viewed as classes)
– Each relation consists of a set of tuples (which can be
viewed as instances of the class that represents the
relation)
– Each tuple is of the form (oid, val) where oid is an
object id and val is a tuple value whose components can
be arbitrary values (e.g. primitive values, sets of tuples,
and references to other objects)
371
© D. Wong 2003
2002
ORDB, ODB, RDB

Difference between ORDB and ODB
– In ORDB, the top-level structure of each object
instance is always a tuple. In ODB, top-level
structure can be an arbitrary value.

Difference between ORDB and RDB:
– RDB tuple components must be primitive
values
– ORDB tuple components can be arbitrary
values
372
© D. Wong 2003
2002
Oracle Object example
create type ADDRESS_TY as object
(Street
VARCHAR2(50),
City
VARCHAR2(25),
State
CHAR(2),
Zip
NUMBER);
create type PERSON_TY as object
(Name
VARCHAR2(25),
BirthDate DATE;
Address ADDRESS_TY
member function AGE_DAYS (BirthDate IN DATE)
return NUMBER);
373
© D. Wong 2003
2002
Oracle Object example (continued)

Defining methods for user defined types using PL/SQL:
Create type body PERSON_TY as
Member function AGE_DAYS (BirthDate DATE)
return NUMBER is
begin
RETURN ROUND(SysDate – BirthDate);
end;
-- if there are more methods to the data type, may define here
end;
/
374
© D. Wong 2003
2002
Oracle Object example (continued 2)

Create table with user defined abstract data types:
create table CUSTOMER
(Customer_ID NUMBER,
Person
PERSON_TY);

Use constructors for inserting data:
insert into CUSTOMER values (1, PERSON_TY('Joe Smith', '01JAN-90', ADDRESS_TY('10 Spring ST', 'BHM', 'AL', 35110)));

Use path names to access the attributes:
SELECT Person.Address.Street
FROM CUSTOMER;
SELECT Person.AGE_DAYS(Person.BirthDate)
FROM CUSTOMER;
UPDATE CUSTOMER
SET Person.Address.City = 'Birmingham'
WHERE Person.Address.City = 'BHM';
375
© D. Wong 2003
2002
Object-Orient Analysis and Design

Normalization in relational model relates each attribute to
its primary key
e.g. The following is in 3NF:
create table CUSTOMER
(Customer_ID NUMBER,
Name
VARCHAR2(25),
BirthDate DATE;
Street
VARCHAR2(50),
City
VARCHAR2(25),
State
CHAR(2),
ZipNUMBER);


For OO, further group related columns into abstract data
types (ADT) (e.g. ADDRESS_TY) for reuse.
Then look for relationships among ADTs to determine if
nesting is appropriate (e.g. PERSON_TY);
376
© D. Wong 2003
2002