Transcript Lecture 6

Implementation of the
Relational Model
There is no substitute for the comfort supplied by the utterly
taken-for-granted relationship. Iris Murdoch
Class Outline
 What are the required features of a DBMS?
 What are the features of a relational database
management system (RDBMS)?
 What is entity integrity and referential integrity?
 To what extent are entity integrity and referential
integrity supported by MS Access?
 What are the eight algebraic functions supported
by a fully relational DBMS? Give examples of
each.
Functions of a DBMS

Data storage and integrity management - creates complex
structures to store data, forms, etc. and enforces data relationships

Management of data dictionary - updates as database
structure is modified

Data transformation - presents data as user requests
 Backup and Recovery - ensures data safety in case of damage

Multi-user access and Security - allows concurrent use of
database; disallows access to components as determined by the user
 Database Access Languages - supports non-procedural (user
specifies what must be done, not how) query language
 Application Programming Interfaces - supports procedural
languages (e.g., MS Access uses Visual Basic) for programmers
 Communication Interfaces - modern DBMSs provide access to
the database using internet browsers (e.g., Netscape)
Relational Database Management Systems
 Relational database architecture
 Codd, E.F. (1970). A relational model for large shared data
banks. CACM, 13(6), 377-87.
 based on relational algebra and calculus
 first relational prototype - early 1970s - IBM’s System R
 Relational databases required considerable computing
resources (memory, processing speed)



not feasible until mid- 1980s when price-performance ratio
dropped
low end (Access, Paradox, dBase, FoxPro, Clipper, R:Base)
high end (DB2, Oracle, Sybase, MS SQL Server, Informix,
INGRES commercial)
The Relational Model
...consists of relations, which are made up of attributes.
A relation is a set of columns (attributes) with values for
each attribute such that:






Each column (attribute) value must be a single value only.
All values for a given column (attribute) must be of the same
type.
Each column (attribute) name must be unique.
The order of columns is insignificant.
No two rows (tuples) in a relation can be identical.
The order of the rows (tuples) is insignificant.
Steps to Relational Implementation
1. Define the database structure to the DBMS



for server and mainframe databases, use Data Definition
Language (DDL) in a text file that describes columns of
tables, defines indexes, constraints and security restrictions
many PC databases provide a graphical interface to define the
database tables
in both cases, the Database Definition Subsystem of the
DBMS creates the indexes and metadata
2. Allocation of Media Space

usually unnecessary for PC databases, but performance issues
must be considered for server/mainframe dbs
3. Creating the Database Data

import pre-existing data or enter data either through DML
(Data Manipulation Language) or forms of the application
Relational Data Manipulation
 Four strategies for relational data manipulation:
 relational algebra - difficult to use because it is procedural users must specify not only what they want but how to get it
 relational calculus - difficult to learn due to theoretical nature,
not used in commercial database processing
 transform-oriented languages - non-procedural languages
(e.g., SQUARE, SQL, SEQUEL)
 graphical interface to Data Manipulation Language (DML)
 query-by-example and query-by-form (behind each is a
corresponding SQL query) - supported by many PC
RDBMS (Lotus’ Approach, MS Access,Wall Data’s
Cyberprise DBApp)
 application program interface - written in programming
languages such as COBOL, Pascal, Perl, C++
Relational DBMS Defined

Logical database model (rather than physical) that represents all
data as if they are stored in separate two-dimensional but related
tables

Each table consists of single-value data elements describing a
common theme among which is one (or more) elements that
uniquely describe each record in the table (i.e. no two rows are
identical)

Tables are related as long as two tables share a common data
element

Information in these tables can be combined on an as-needed
basis (flexibility) to get answers to queries and generate complex
Product ID
Product Description
Price
Supplier ID
reports
Supplier ID
Supplier Name
801
Shur-Lock U-Lock
75.00
3
802
803
804
805
806
SpeedRite Cyclecomputer
SteelHead Microshell Helmet
SureStop 133-MB Brakes
Diablo ATM Mountain Bike
Ultravision Helmet Mount Mirrors
60.00
40.00
25.00
1,200.00
7.45
3
1
2
3
1
2
3
Bikes-R-Us
Small moter suppliers
All Bikes Allways
Requirements of a RDBMS
1. Enforces Integrity rules
(a) Entity Integrity - every row must have a unique identifier
(primary key) which cannot include null entries
(b) Referential Integrity - foreign key must have either a null entry
or an entry that matches the primary key value in a table to
which it is related
Child Table - Table on the many
side of a one-to-many relationship.
Product ID
801
802
803
804
805
806
Product Description
Shur-Lock U-Lock
SpeedRite Cyclecomputer
SteelHead Microshell Helmet
SureStop 133-MB Brakes
Diablo ATM Mountain Bike
Ultravision Helmet Mount Mirrors
Price
Supplier ID
75.00
3
60.00
3
40.00
25.00
1
1,200.00
2
7.45
3
foreign key
Parent Table - Table on
the one side of a one-tomany relationship.
Supplier ID
Supplier Name
1
Bikes-R-Us
2
Small moter suppliers
3
All Bikes Allways
primary key
Entity Integrity
 Elements of a primary key:








It must uniquely identify each record in the table
It must contain unique values
It cannot be null
It cannot be a multi-part field
It should contain the minimum number of fields necessary to define
uniqueness
It is not optional whole or in part
It must directly identify the value of each field in the table
Its value can only be modified in rare or extreme cases
EmpID FName LName
1
Jane
Smith
2
Bob
Brown
3
Lin
Chow
concatenated primary key
EmpID
1
3
2
1
Date
Prior approval?
12-Sep-99
yes
14-Oct-99
no
14-Oct-99
yes
23-Nov-99
no
Reason
holiday
illness
illness
awl
Referential Integrity

Referential integrity is a mechanism that enforces the ties
between data in separate tables and prevents them from being
broken

Referential integrity minimizes the undesirable likelihood of
the existence of a record in the child table for which there is
no corresponding record in the parent table - referred to as an
orphan (or dangling) record

Prior to setting referential integrity, ensure that

the field used to tie two tables together (the link field) must be a
primary key field in the parent table and a foreign key in the child
table

the link fields have an identical data type

the two tables are in the same database container
Referential Integrity in MS Access

A value cannot be entered in the foreign key field of the related table if that
value doesn't exist in the primary key of the parent table.

A record cannot be
deleted from a
parent table if
matching records
exist in a related
table.

A primary key value
in the parent table
cannot be changed,
if that record has
related records.
Determined by MS
Access on the basis
of primary key
settings.
Referential Integrity Options in MS Access
Cascade Update
Special override of the referential integrity mechanism in order to
be able to edit the primary key in the one table; MS Access will
automatically make the same change to the foreign key in the
child table so the relationship is maintained.
Cascade Delete
Special override of the referential integrity mechanism to
facilitate deleting records in the parent table even when there are
related records in the child table. All related records in the child
table will automatically be deleted so that there will be no orphan
records.
Do not use these options unless you realize the full implications
of making the selection.
Relationship Integrity
is a way of minimizing data errors
MS Access On-line Help
Requirements of a RDBMS
2. Supports many of the relational algebraic
functions - a collection of operations on
relations, resulting in relations
Set theory operators:
 union
 intersect
 difference
 product
Specific relational operators:
 select
 project
 divide
 join
Algebraic function: 1. Union
Combination of data without repeating common rows; must have
equivalent columns as to number and domains (“union compatible”).
Salesperson
Employee ID
Name
Office
27
Rodney JonesToronto
44
Goro Azuma Tokyo
35
Francine Moire
Brussels
37
Anne Abel
Tokyo
Manager
Employee ID
Name
Office
12
Brigit SanchezToronto
99
Mary Chen Brussels
37
Anne Abel
Tokyo
Provide information on all employees regardless of their position:
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma
Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
12
Brigit Sanchez Toronto
99
Mary Chen
Brussels
note that Anne
appears only
once even
though she’s
in both tables
Algebraic function: 2. Intersection
Identification of rows that are common to two relations; must
have equivalent columns as to number and domains.
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma
Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Manager
Employee ID
Name
Office
12
Brigit Sanchez Toronto
99
Mary Chen
Brussels
37
Anne Abel
Tokyo
Provide information on employees who have both a salesperson and
manager role:
Employee ID
Name
37
Anne Abel
Office
Tokyo
Algebraic Function: 3. Difference
Identification of rows that are in one relation and not in another;
must have equivalent columns as to number and domains.
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma
Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Manager
Employee ID
Name
Office
12
Brigit Sanchez Toronto
99
Mary Chen
Brussels
37
Anne Abel
Tokyo
Provide information on employees who have a salesperson role but do
not have a managerial role:
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
Algebraic Function: 4. Product
Adjoining (concatenating) each row in the first relation to each
row in the second relation; must have different column names
Salesperson
Manager
SEmployee ID
SName
SOffice
27
Rodney Jones Toronto
44
Goro Azuma
Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
MEmployee ID
Mname
12
Brigit Sanchez
99
Mary Chen
37
Anne Abel
MOffice
Toronto
Brussels
Tokyo
No obvious query; conceptually important because it is used as a
building block (Cartesian product) for the join operator.
SEmployee ID
SName
27
Rodney Jones
27
Rodney Jones
27
Rodney Jones
44
Goro Azuma
44
Goro Azuma
44
Goro Azuma
35
Francine Moire
35
Francine Moire
35
Francine Moire
37
Anne Abel
37
Anne Abel
37
Anne Abel
SOffice
Toronto
Toronto
Toronto
Tokyo
Tokyo
Tokyo
Brussels
Brussels
Brussels
Tokyo
Tokyo
Tokyo
MEmployee ID
12
99
37
12
99
37
12
99
37
12
99
37
MName
Brigit Sanchez
Mary Chen
Anne Abel
Brigit Sanchez
Mary Chen
Anne Abel
Brigit Sanchez
Mary Chen
Anne Abel
Brigit Sanchez
Mary Chen
Anne Abel
MOffice
Toronto
Brussels
Tokyo
Toronto
Brussels
Tokyo
Toronto
Brussels
Tokyo
Toronto
Brussels
Tokyo
Algebraic Function: 5. Select
Creation of a relation by identifying only rows that satisfy specific
conditions
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma
Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Provide information on
employees who are based in
Tokyo:
Provide information on
employees whose salary is at
least $2000
Salary
3000
2000
2500
1500
Employee ID
Name
44
Goro Azuma
37
Anne Abel
Office
Tokyo
Tokyo
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
Salary
2000
2500
Salary
3000
2000
2500
Algebraic Function: 6. Project
Creates a relation by deleting columns from an existing relation
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Provide a list of employee names (not all information):
Name
Rodney Jones
Goro Azuma
Francine Moire
Anne Abel
Can “nest” (combine) operators (e.g., select, project)
Provide names of employees whose office is in Tokyo:
Name
Goro Azuma
Anne Abel
Algebraic Function: 7. Divide
Creating a new relation by selecting the rows in one relation that
match every row in another relation
Sales
Employee ID
Rodney Jones
Francine Moire
Anne Abel
Anne Abel
Rodney Jones
Anne Abel
Product ID
801
803
802
801
802
803
Product
Who has sold every product?
Name
Anne Abel
Product ID
801
802
803
Algebraic Function: 8. Join
Connection of data across relations: natural join (rows are joined when
common columns have equal values); outer join (all rows from both tables
even if there is no matching column value) and theta join (not covered)
Product
Supplier
Product ID
Product Description
Price Supplier ID
801
Shur-Lock U-Lock
75.00
3
802
SpeedRite Cyclecomputer
60.00
3
803
SteelHead Microshell Helmet
40.00
804
SureStop 133-MB Brakes
25.00
1
805
Diablo ATM Mountain Bike
1,200.00
2
806
Ultravision Helmet Mount Mirrors
7.45
3
Provide the
Supplier Name for
each product
Product ID
801
802
804
805
806
Provide all products and all
suppliers, joining where
possible
Product Description
Shur-Lock U-Lock
SpeedRite Cyclecomputer
SureStop 133-MB Brakes
Diablo ATM Mountain Bike
Ultravision Helmet Mount Mirrors
Supplier ID
Supplier Name
1
Bikes-R-Us
2
Small moter suppliers
3
All Bikes Allways
4
Bikes for Fun
Price
75.00
60.00
25.00
1,200.00
7.45
Supplier ID
3
3
1
2
3
Product ID Product Description
Price Supplier
801
Shur-Lock U-Lock
75.00
3
802
SpeedRite Cyclecomputer
60.00
3
803
SteelHead Microshell Helmet 40.00
804
SureStop 133-MB Brakes
25.00
1
805
Diablo ATM Mountain Bike1,200.00
2
806
Ultravision Helmet Mount Mirrors
7.45
3
4
Supplier Name
All Bikes Allways
All Bikes Allways
Bikes-R-Us
Small moter suppliers
All Bikes Allways
ID Supplier Name
All Bikes Allways
All Bikes Allways
Bikes-R-Us
Small moter suppliers
All Bikes Allways
Bikes for Fun