relational algebra - Department of Computer Science

Download Report

Transcript relational algebra - Department of Computer Science

Translation of ERD to
Relational Scheme,
Relational Algebra
Prof. Sin-Min LEE
Department of Computer
Science
•Relation schema
–Named relation defined by a set of attribute and
domain name pairs.
•Relational database schema
–Set of relation schemas, each with a distinct
name.
•Each tuple is distinct; there are no duplicate tuples.
•Order of attributes has no significance.
•Order of tuples has no significance, theoretically.
•Relation name is distinct from all other relation
names in relational schema.
•Each cell of relation contains exactly one atomic
(single) value.
•Each attribute has a distinct name.
•Values of an attribute are all from the same
domain.
Database Scheme
A relational database scheme, or
schema, corresponds to a set of table
definitions.
Eg: product(p_id, name, category, description)
supply(p_id, s_id, qnty_per_month)
supplier(s_id, name, address, ph#)
* remember the difference between a DB instance
and a DB scheme.
SAMPLE SCHEMAS AND INSTANCES
The Schemas:
Sailors(sid: integer, sname: string, rating: integer, age:
real)
Boats(bid: integer, bname: string, color: string)
Reserves(sid: integer, bid: integer, day: date)
The Instances:
The entity-relationship model
The entity-relationship (ER) data model
was developed by Peter Chen in the
1970s.
As its name implies, the ER model
provides a systematic representation of
the application entities and their
relationships.
ER diagrams provide a notation for
documenting a tentative database design.
The entity-relationship model
features with ER diagrams, which you
then translate into a specific database
schema.
ER notation represents application entities
as rectangles, surrounded by their
attributes. An alternative format lists the
attributes inside the rectangle, with
certain conventions for special kinds of
attributes. For example, a key attribute is
underlined,
The entity-relationship model
and a derived attribute is marked with an
asterisk. ER diagrams express
relationships as diamonds with connecting
lines to the participating entities. The
lines can be single, indicating partial
participation, or double, indicating total
participation of the connected entity.
The entity-relationship model is useful
because, it facilitates communication between
the database designer and the end user
during the requirements analysis. To facilitate
such communication the model has to be
simple and readable for the database
designer as well as the end user. It enables
an abstract global view (that is, the
enterprise conceptual schema) of
the enterprise to be developed without any
consideration of efficiency of the ultimate
database.
The entity-relationship model
A weak entity concept for a situation
where the entity’s objects aren’t
meaningful as isolated in the database. It
is meaningful only when related to a
dominant object, through a many-to-one
or one-to-one relationship. The
participation of a weak entity must be
total.
Class hierarchies and
inheritance
Although the original entity-relationship
model didn’t include class hierarchies and
lattices, later enhancement added these
facilities. These features are useful for
database deployment to an objectoriented setting.
A superclass-subclass arrangement
among the application entities captures
more of the application’s meaning.
Class specialization and
generalization
The process that derives subclasses from
a superclass is specialization.
The process that abstracts from a
collection of specialized subclasses to a
superclass is generalization.
Passing from a superclass to a subclass is
a specialization to a more detailed
description. Rising from a subclass to a
Class specialization and
generalization
superclass is a generalization to a more
generic description.
Subclasses inherit the attributes and
operations of their superclasses. A
subclass specialization can be total,
meaning that you always view a
superclass object as a generalized view of
one or more subclasses. Or it can be
partial, meaning that a
Class specialization and
generalization
superclass object can exist without further
specialization. To indicate total
specialization, you double the connecting
line impinging on the superclass.
Overlapping subclasses and
multiple inheritance
Independent of its total-partial status, a
specialization can be disjoint or
overlapping.
In a disjoint specialization, an object that
is further classified must belong to exactly
one of the subclasses.
Overlapping specialization occurs when an
object belongs simultaneously to two or
Overlapping subclasses and
multiple inheritance
more subclasses.
A thorny issue, which causes difficulties
when mapping an ER diagram to a
database schema, is multiple inheritance.
In multiple inheritance, a class inherits
attributes from several superclasses.
Multiple inheritance and overlapping
specialization frequently occur together
Overlapping subclasses and
multiple inheritance
because the subclasses of the overlapping
specialization can rejoin deeper in the
lattice in a multiple inheritance situation.
Mapping from ER diagrams to
database schemas
You can map an ER diagram to any of the
five database schemas (relational,
network, hierarchical, object-oriented, and
deductive database).
However, the mappings frequently
compromise distinctions documented in
the diagram. ER-to-relational mapping
occurs most frequently.
Mapping from ER diagrams to
a relational schema
Transforming an ER diagram into a relational
database schema proceeds as follows:
- You examine each attribute, x, of each entity,
E, for the following special cases
- If x is a composite attribute, you flatten it.
You retain only the leaf entries of the
hierarchical structure.
- If x is a derived attribute, you remove it.
You can make it available through a view after
Mapping from ER diagrams to
a relational schema
you have established the database schema.
- If x is a multivalued attribute, you create a
new weak entity, E x, and a one-to-many
relationship between E and E x. You move the
multivalued attribute to E x.
- You decompose many-to-many and higherdegree relationships into collections of one-tomany relationships. You transfer any
relationship attributes to the intersection entity.
Mapping from ER diagrams to
a relational schema
- You fold any relationship attributes attached to
one-to-many relationships into the dependent
entity.
- You give key attributes to all dominant entities
if they aren’t already present.
- You give a foreign key to the dependent entity
of each one-to-many relationship; you use the
same name as the key in its dominant partner.
Mapping from ER diagrams to
a relational schema
- You prepare a create-table SQL statement for
each entity of the transformed diagram and use
the appropriate clauses to enforce key
constraints, referential integrity, total
participation, and weak entities.
- Anticipating frequent joins across the
established relationships, you index all primary
and foreign keys--if the database product allows
explicit index creation.
Mapping from ER diagrams to
a relational schema
In summary, you translate an ER diagram into a
relational schema by:
1. Transforming multivalued and composite
attributes.
2. Decomposing many-to-many and higherdegree relationships.
3. Introducing keys and foreign keys. Each
entity then becomes a relation.
Mapping from ER diagrams to
a relational schema
So, if there is no subclass structure, the
mapping proceeds by flattening composite
attributes, removing derived attributes,
exporting multivalued attributes to weak
entities, and factoring all many-to-many and
higher-degree relationships through an
intersection entity. You then transfer
relationship attributes to the intersection entity,
or you fold them into the dependent side of a
one-to-X relationship.
Mapping from ER diagrams to
a relational schema
All dominant entities receive primary keys, and
all dependent entities receive foreign keys of the
same name. Each entity then becomes a
relation.
If subclass specializations are present, you must
remove them before the transformation.
Some methods are available to reduce
superclass-subclass specializations from an ER
diagram before mapping to a relational schema.
Mapping from ER diagrams to
a relational schema
1. The most semantically damaging
approach collapses a specialization into
the superclass, providing slots for all
possible attributes and flags to mark the
suppressed subclasses.
2. A less drastic alternative repeats the
superclass attributes in the subclasses,
but this approach is available only when
the specialization is total.
Mapping from ER diagrams to
a relational schema
3. A final possibility converts subclasses to
full-fledged entities and connects them to
the superclass with restricted
relationships.
- All three techniques share the
disadvantage of masking semantic
nuances in the ER diagram.
What is Relational
Algebra?
 Relational algebra is a procedural query language.
It consists of the select, project, union, set
difference, Cartesian product, and rename
operations.
Set intersection, division, natural join, and
assignment combine the fundamental
operations.
SQL is based on relational algebra
Query languages
procedural vs. non-procedural
commercial languages have some of
both
we will study:
relational algebra (which is procedural, i.e.
tells you how to process a query)
relational calculus (which is non-procedural
i.e. tells what you want)
Introduction
one of the two formal query languages of the
relational model
collection of operators for manipulating
relations
Operators: two types of operators
 Set Operators: Union(),Intersection(),
Difference(-), Cartesian Product (x)
New Operators: Select (), Project (), Join (⋈)
Introduction – cont’d
A Relational Algebra Expression: a sequence
of relational algebra operators and operands
(relations), formed according to a set of rules.
The result of evaluating a relational algebra
expression is a relation.
Selection
Denoted by c(R)
Selects the tuples (rows) from a relation R that
satisfy a certain selection condition c.
It is a unary operator
The resulting relation has the same attributes
as those in R.
Example 1:
S:
SNO
SNAME
AGE
STATE
S1
MIKE
21
IL
S2
STEVE
20
LA
S3
MARY
18
CA
S4
MING
19
NY
S5
OLGA
21
NY

state=‘IL’(S)
Example 2:
CREDIT  3(C)

C:
CNO
CNAME
CREDIT DEPT
C1
Databas
e
3
CS
C2
Statistic
s
3
MATH
C3
Tennis
1
SPORTS
C4
Violin
4
MUSIC
C5
Golf
2
SPORTS
C6
Piano
5
MUSIC
Example 3
E:
SNO=‘S1’and CNO=‘C1’(E)
SNO
CNO
Grade
S1
C1
90
S1
C2
80
S1
C3
75
S1
C4
70
S1
C5
100
S1
C6
60
S2
C1
90
S2
C2
80
S3
C2
90
S4
C2
80
S4
C4
85
S4
C5
100
Selection - Properties
Selection Operator is commutative
C1(C2 (R)) = C2(C1 (R))
The Selection is an unary operator, it cannot
be used to select tuples from more than one
relations.
Projection
 Denoted by L(R), where L is list of attribute names and R is a
relation name or some other relational algebra expression.
 The resulting relation has only those attributes of R specified
in L.
 The projection is also an unary operation.
 Duplicate rows are not permitted in relational
algebra. Duplication is removed from the result.
 Duplicate rows can occur in SQL, though they
may be controlled by explicit keywords.
Projection - Example
Example 1: STATE (S)
SNO
SNAME
AGE
STATE
S1
MIKE
21
IL
S2
STEVE
20
LA
S3
MARY
18
CA
S4
MING
19
NY
S5
OLGA
21
NY
STATE
IL
LA
CA
NY
Projection - Example
Example 2: CNAME, DEPT(C)
CNO
CNAME
CREDIT DEPT
CNAME
DEPT
C1
Databas
e
3
CS
Databas
e
CS
C2
Statistic
s
3
MATH
Statistic
s
MATH
C3
Tennis
1
SPORTS
Tennis
SPORTS
C4
Violin
4
MUSIC
Violin
MUSIC
C5
Golf
2
SPORTS
Golf
SPORTS
C6
Piano
5
MUSIC
Piano
MUSIC
Projection - Example
Example 3: S#(STATE=‘NY'(S))
SNO
SNAME
AGE
STATE
S1
MIKE
21
IL
S2
STEVE
20
LA
S3
MARY
18
CA
S4
S4
MING
19
NY
S5
S5
OLGA
21
NY
SNO
SET Operations
UNION: R1  R2
INTERSECTION: R1  R2
DIFFERENCE: R1 - R2
CARTESIAN PRODUCT: R1  R2
Union Compatibility
For operators , , -, the operand relations
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn)
must have the same number of attributes, and
the domains of the corresponding attributes
must be compatible; that is, dom(Ai)=dom(Bi)
for i=1,2,...,n.
The resulting relation for , , or - has the
same attribute names as the first operand
relation R1 (by convention).
Union Compatibility Examples
Are S(SNO, SNAME, AGE, STATE) and
C(CNO, CNAME, CREDIT, DEPT) union
compatible?
Are S(S#, SNAME, AGE, STATE) and
C(CNO,
CNAME,
CREDIT_HOURS,
DEPT_NAME) union compatible?
UNION, SET DIFFERENCE
& SET INTERSECT
 Union puts all tuples of two relations in one relation. To use this
operator, two conditions must hold:
1. The two relations must be of the same arity.
2. The domain of ith attribute of the two participating relation
must be the same.
 Set difference operator computes tuples that are in one relation,
but not in another.
 Set intersect operator computes tuples that are common in two
relations:
 The five fundamental operations of the relational algebra are:
select, project, cartesian product, Union, and set difference
 All other operators can be constructed using these operators
EXAMPLE
Assume a database with the following
three relations:
Sailors (sid, sname, rating)
Boats (bid, bname, color)
Reserve (sid, bid, date)
Query 1: Find the bid of red colored
boats:
EXAMPLE
Assume a database with the following three
relations:
Sailors (sid, sname, rating)
Boats (bid, bname, color)
Reserve (sid, bid, date)
Query 1: Find the bid of red colored boats:
∏bid(бcolor=red(Boats))
EXAMPLE
Assume a database with the following
three relations:
Sailors (sid, sname, rating)
Boats (bid, bname, color)
Reserve (sid, bid, date)
Query 1: Find the name of sailors who
have reserved Boat number 2.
EXAMPLE
Assume a database with the following three
relations:
Sailors (sid, sname, rating)
Boats (bid, bname, color)
Reserve (sid, bid, date)
Query 1: Find the name of sailors who have
reserved Boat number 2.
∏sname(бbid=2(Sailors
(sid)Reserve))
EXAMPLE
Assume a database with the following three
relations:
Sailors (sid, sname, rating)
Boats (bid, bname, color)
Reserve (sid, bid, date)
Query 1: Find the name of sailors who have
reserved both a red and a green boat.
Union, Intersection,
Difference
T= R U S : A tuple t is in relation T if and only
if t is in relation R or t is in relation S
T = R  S: A tuple t is in relation T if and only
if t is in both relations R and S
T= R - S :A tuple t is in relation T if and only
if t is in R but not in S
Set-Intersection
Denoted by the symbol .
 Results in a relation that contains only the
tuples that appear in both relations.
R  S = R – (R – S)
Since set-intersection can be written in
terms of set-difference, it is not a
fundamental operation.
Examples
R
S
A1
A2
1
Red
3
White
4
green
B1
B2
3
White
2
Blue
Examples
RS
A1
1
3
4
2
A2
Red
White
Green
Blue
S-R
B1
B2
2
Blue
R S
A1
A2
3
White
R-S
A1
1
4
A2
Red
Green
RENAME OPERATOR
Rename operator changes the name of its
input table to its subscript,
ρe2(Emp)
Changes the name of Emp table to e2
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
Salary
20000
25000
27000
30000
4000
dno
2
3
4
5
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
бSalary=30,000(Employee)
Salary
20000
25000
27000
30000
4000
dno
2
3
4
5
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
SS#
4
Name
Kathy
Age
24
20
22
30
4
Age
30
бSalary=30,000(Employee)
Salary
20000
25000
27000
30000
4000
Salary
30000
dno
2
3
4
5
1
dno
5
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
бAge>22(Employee)
Salary
20000
25000
27000
30000
4000
dno
2
3
4
5
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
SS#
1
4
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
Name
Joe
Kathy
бAge>22(Employee)
Age
24
30
Salary
20000
25000
27000
30000
4000
Salary
20000
30000
dno
2
3
4
5
1
dno
2
5
PROJECT OPERATOR
Project (∏) retrieves a column. It is a unary
operator that eliminate duplicates.
e.g., name of employees:
∏ name(Employee)
e.g., name of employees earning more than
30,000:
∏ name(бSalary>30,000(Employee))
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
Salary
20000
25000
27000
30000
4000
dno
2
3
4
5
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
∏
Name
Joe
Mary
Bob
Kathy
Shideh
age(Emp)
Age
24
20
22
30
4
Salary
20000
25000
27000
30000
4000
dno
2
3
4
5
1
Age
24
20
22
30
4
EXAMPLE
Emp table:
SS#
1
2
3
4
5
∏
Name
Joe
Mary
Bob
Kathy
Shideh
б
Age
24
20
22
30
4
name,age( Salary=4000
Salary
20000
25000
27000
30000
4000
(Emp) )
dno
2
3
4
5
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
SS#
5
∏
Name
Shideh
name,age(бSalary=4000
Age
24
20
22
30
4
Age
4
(Emp) )
Salary
20000
25000
27000
30000
4000
Salary
4000
dno
2
3
4
5
1
dno
1
EXAMPLE
Emp table:
SS#
1
2
3
4
5
Name
Joe
Mary
Bob
Kathy
Shideh
Age
24
20
22
30
4
Name
Shideh
∏
б
name,age( Salary=4000
Salary
20000
25000
27000
30000
4000
Age
4
(Emp) )
dno
2
3
4
5
1