Relational Algebra and Calculus

Download Report

Transcript Relational Algebra and Calculus

Relational Algebra
Relational Model: Topic 3
Introduction to Database Systems
1
Preliminaries

A query is applied to relation instances, and the
result of a query is also a relation instance.
– Schemas of input relations for a query are fixed (but
query will run regardless of instance!)
– The schema for the result of a given query is also
fixed! Determined by definition of query language
constructs.
Introduction to Database Systems
2
Example Instances

R1 sid
22
58
“Sailors” and “Reserves”
sid
S1
relations for our examples.
22
31
58
S2 sid
28
31
44
58
Introduction to Database Systems
bid
day
101 10/10/96
103 11/12/96
sname rating age
dustin
7
45.0
lubber
8
55.5
rusty
10 35.0
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
3
Relational Algebra

Basic operations:
–
–
–
–
–

Selection () Selects a subset of rows from relation.
Projection ( ) Deletes unwanted columns from relation.
Cross-product ( ) Allows us to combine two relations.
Set-difference ( ) Tuples in reln. 1, but not in reln. 2.
Union (  ) Tuples in reln. 1 and in reln. 2.


Additional operations:
– Intersection, join, division, renaming: Not essential, but
(very!) useful.

Operations can be composed! (Algebra is “closed”.)
Introduction to Database Systems
4
Projection



Deletes attributes that are not in
projection list.
Schema of result?
Duplicates!
sname
rating
yuppy
lubber
guppy
rusty
9
8
5
10
 sname,rating(S2)
age
35.0
55.5
 age(S2)
Introduction to Database Systems
5
Selection
sid sname rating age
28 yuppy 9
35.0
58 rusty
10
35.0
 rating 8(S2)



Selects rows that satisfy
selection condition.
Duplicates?
Schema of result?
sname rating
yuppy 9
rusty
10
 sname,rating( rating 8(S2))
Introduction to Database Systems
6
Union, Intersection, Set-Difference
sid sname rating age



Inputs: union-compatible:
– Same number/type of
fields.
Schema of result?
Duplicates?
sid sname rating age
22 dustin 7
45.0
S1 S2
Introduction to Database Systems
22
31
58
44
28
dustin
lubber
rusty
guppy
yuppy
7
8
10
5
9
45.0
55.5
35.0
35.0
35.0
S1 S2
sid sname rating age
31 lubber 8
55.5
58 rusty
10
35.0
S1 S2
7
Cross-Product
Each row of S1 is paired with each row of R1.
 Result schema has one field per field of S1 and R1,
with field names `inherited’ if possible.

(sid) sname rating age
(sid) bid day
22
dustin
7
45.0
22
101 10/10/96
22
dustin
7
45.0
58
103 11/12/96
31
lubber
8
55.5
22
101 10/10/96
31
lubber
8
55.5
58
103 11/12/96
58
rusty
10
35.0
22
101 10/10/96
58
rusty
10
35.0
58
103 11/12/96
 Renaming operator:
Introduction to Database Systems
 (C(1 sid1, 5 sid2), S1 R1)
8
Joins

Condition Join:
R  c S   c ( R  S)
(sid) sname rating age
22
dustin 7
45.0
31
lubber 8
55.5
S1 
(sid) bid
58
103
58
103
S1.sid  R1.sid
day
11/12/96
11/12/96
R1
Result schema ?
 Fewer tuples than cross-product, might be
able to compute more efficiently

Introduction to Database Systems
9
Joins

Equi-Join: A special case of condition join where
the condition c contains only equalities.
sid
sname rating age bid day
22
dustin 7
45.0 101 10/10/96
58
rusty
10
35.0 103 11/12/96
S1 


R1
sid
Result schema similar to cross-product, but only
one copy of fields for which equality is specified.
Natural Join: Equijoin on all common fields.
Introduction to Database Systems
10
Division
Not supported as a primitive operator, but useful for
expressing queries like:
Find sailors who have reserved all boats.
 Let A have 2 fields, x and y; B have only field y:
– A/B =  x |  x, y  A  y  B

– i.e., A/B contains all x tuples (sailors) such that for every y
tuple (boat) in B, there is an xy tuple in A.
– Or: If the set of y values (boats) associated with an x value
(sailor) in A contains all y values in B, the x value is in A/B.

In general, x and y can be any lists of fields; y is the
list of fields in B, and x  y is the list of fields of A.
Introduction to Database Systems
11
Examples of Division A/B
sno
s1
s1
s1
s1
s2
s2
s3
s4
s4
pno
p1
p2
p3
p4
p1
p2
p2
p2
p4
A
Introduction to Database Systems
pno
p2
B1
pno
p2
p4
B2
pno
p1
p2
p4
B3
sno
s1
s2
s3
s4
sno
s1
s4
sno
s1
A/B1
A/B2
A/B3
12
Expressing A/B Using Basic Operators
Division is not essential operator; just a useful
shorthand.
 Idea: For A/B, compute all x values that are not
`disqualified’ by some y value in B.

– x value is disqualified if by attaching y value from B, we
obtain an xy tuple that is not in A.
Disqualified x values:
A/B:
 x ( A) 
Introduction to Database Systems
 x (( x ( A)  B)  A)
all disqualified x values
13
Find names of sailors who’ve reserved boat #103

Solution 1:

Solution 2:
 sname ((
bid 103
 (Temp1, 
Re serves)  Sailors)
bid  103
Re serves)
 ( Temp2, Temp1  Sailors)
 sname (Temp2)

Solution 3:
 sname (
Introduction to Database Systems
bid 103
(Re serves  Sailors))
14
Find names of sailors who’ve reserved a red boat
Information about boat color only available in
Boats; so need an extra join:
 sname ((
Boats)  Re serves  Sailors)

color ' red '

A more efficient solution:
 sname (
sid
((

bid color ' red '
Boats)  Re s)  Sailors)
 A query optimizer can find this
Introduction to Database Systems
15
Find sailors who’ve reserved a red or a green boat

Can identify all red or green boats, then find
sailors who’ve reserved one of these boats:
 (Tempboats, (
color ' red '  color ' green '
Boats))
 sname(Tempboats  Re serves  Sailors)

Can also define Tempboats using union! (How?)

What happens if  is replaced by  in this query?
Introduction to Database Systems
16
Find sailors who’ve reserved a red and a green boat

Previous approach won’t work! Must identify
sailors who’ve reserved red boats, sailors
who’ve reserved green boats, then find the
intersection (note that sid is a key for Sailors):
 (Tempred, 
sid
 (Tempgreen, 
((
sid
color ' red '
((
Boats)  Re serves))
color ' green'
Boats)  Re serves))
 sname((Tempred  Tempgreen)  Sailors)
Introduction to Database Systems
17
Find the names of sailors who’ve reserved all boats

Uses division; schemas of the input relations
to / must be carefully chosen:
 (Tempsids, (
sid, bid
Re serves) / (
bid
Boats))
 sname (Tempsids  Sailors)

To find sailors who’ve reserved all ‘Interlake’ boats:
.....
/
bid
(
Introduction to Database Systems
bname ' Interlake'
Boats)
18
Expressive Power
Codd’s Theorem: Every Relational Algebra query
can be expressed as a safe query in DRC / TRC;
the converse is also true.
 Relational Completeness: A “relationally complete”
query language (e.g., SQL) can express every
query that is expressible in relational
algebra/calculus.

Introduction to Database Systems
19
A Limitation of the Algebra
For any particular instance of Edges, there is
an R.A. expression to compute transitive
closure. (What is it?)
 There’s no R.A. for transitive closure of an
arbitrary instance of Edges. (Why?)

Edges
From
a
a
c
y
To
b
c
y
z
Introduction to Database Systems
a
b
z
c
y
20