Mapping ER to Relational Model

Download Report

Transcript Mapping ER to Relational Model

The Relational Model
Mapping the ER Model to a
Database Implementation
1
The Relational Model
• Mathematically based
• Can develop theoretical design improvements &
enhancements that result in applications to
many different applications
• Can use exact mathematical notation
• Basic structure is simple, easy to understand
– Separates logical from physical level
– Operations do not require user to know storage
structures used
– Data operations easy to express, using a few
powerful commands
2
Tables
• Relations are represented logically as
tables
– Tables are related to one another
– Table holds information about entities
• Table rows correspond to individual
records
• Table columns correspond to attributes
– A column contains values from one domain
– Domains consist of atomic (single) values
3
Properties of Tables
• Each cell contains at most one value
– It is a single piece of data
• Each column has a distinct name
– This is the name of the attribute it represents
• Values in a column all come from the
same domain
• Each tuple is distinct – no duplicate tuples
4
Sample: ER to Relational Model
Enrolls
STUDENT
Taught by
CLASS
Student
Enroll
Class
stuId
lastName
firstName
major
credits
stuId
S1001
Smith
Tom
History
90
S1001
ART103A
A
S1002
Chin
Ann
Math
36
S1001
HST205A
C
S1005
Lee
Perry
History
3
S1002
ART103A
D
S1010
Burns
Edward
Art
63
S1002
CSC201A
F
S1013
McCarthy
Owen
Math
0
S1002
MTH103C
B
S1015
Jones
Mary
Math
42
S1010
S1020
Rivera
Jane
CSC
15
Faculty
facId
name
department
rank
F101
Adams
Art
Professor
F105
Tanaka
CSC
Instructor
F110
Byrne
Math
Assistant
F115
Smith
History
Associate
F221
Smith
CSC
Professor
classNum
FACULTY
grade
classNum
facId
schedule
room
ART103A
F101
MWF9
H221
CSC201A
F105
TuThF10
M110
CSC203A
F105
MThF12
M110
HST205A
F115
MWF11
H221
ART103A
MTH101B
F110
MTuTh9
H225
S1010
MTH103C
MTH103C
F110
MWF11
H225
S1020
CSC201A
B
S1020
MTH101B
A
Student (stuId, lastName, firstName, major,
credits)
Class (classNum, facId, schedule, room)
Faculty (facId, name, department, rank)
Enroll(stuId,classNum,grade)
5
Representing Relational
Database Schemas
• Can have any number of relation schemas
• Example: University database schema
Student (stuId, lastName, firstName, major, credits)
Class (classNumber, facId, schedule, room)
Faculty (facId, name, department, rank)
Enroll(stuId,classNumber,grade)
• This could also be represented by the
relationships screen of the DBMS.
6
Properties of Relations (Tables)
• Degree: the number of attributes
– Binary, ternary, n-ary
• Connectivity:
– 1:1, 1:N, M:N
• Cardinality: the number of tuples
– Changes as tuples are added or deleted
• Keys
• Constraints
7
Keys
• Relations never have duplicate tuples (rows)
– You can always tell tuples apart / there is always a key
• Superkey: set of attributes that uniquely identifies
tuples
• Candidate key: minimal superkey
– No proper subset of itself is also a superkey
• Primary key (PK): candidate key chosen to
uniquely identify tuples
– You cannot verify a key by looking at an instance – why?
• Foreign key (FK) is an attribute or combination of
attributes of a relation that is the PK of another
relation
8
Selecting the Primary Key
• An ideal primary key is short, numeric, and
seldom changing
• If there is more than one candidate key,
each should be carefully evaluated
• If the entity has no identifier, some
attribute must be selected as the PK
– In some situations, a surrogate key may be
defined
9
Surrogate Keys
• These are unique, DBMS-supplied identifiers used
as the PK of a relation
• The values of a surrogate key have no meaning to
users and are normally hidden on forms and
reports
• The DBMS does not allow the value of a surrogate
key to be changed
• Disadvantages:
– FK’s based on surrogate keys have no meaning to users
– When data shared among different databases contain the
same ID, merging those tables might yield unexpected
results
10
Constraints
• Integrity constraints – to ensure “correctness and internal
consistency”
– Rules or restrictions that apply to all instances of the database
– Enforcing them ensures only legal states of the database are
created
• Types of integrity constraints
– Domain constraint - limits set of values for an attribute
– Entity integrity - no part of a PK can be null
– Referential integrity - each FK value must match the primary
key value of some tuple in its related relation, or be null
• General constraints are the business rules
– These may be expressed as table constraints or assertions
• Participation constraints reflect the extent of entities’
involvement in given relationships
11
The Database Implementation Process
• This is the step in the database design
process that follows the conceptual
design/ERD
– Create tables and columns from entities and
attributes
– Select primary keys
– Represent relationships
– Specify constraints
– Performance tuning
12
Mapping the ERD to a Relational Model
• Entities
– Issues with composite & multi-valued attributes
– Issues with weak entities
• Relationships
– 1:1, 1:N and M:N
– Participation constraints
13
Mapping an Entity to a Table
• Each entity maps onto a table
– Its non-composite, single-valued attributes comprise
the table’s column headings
– For composite attributes, there are 3 possible ways:
• Make the composite into a single attribute
• Create several individual attributes to replace the composite
• Create a new entity
– For multi-valued attributes, we create a new table
• The PK of this new table is the composite of the original
attribute coupled with the PK of the original table
• This new table is created as a weak entity
– Example: multiple email accounts
14
Mapping a Relationship
• Binary Relationships:
– 1:M
• PK of 1-side becomes a FK of the M-side table
– 1:1
• First, make sure they are not the same entity. If not, use
either PK as the FK in the other table
– M:M
• Create a relationship table (bridge, composite) with a
composite PK consisting of the PK’s of the related entities,
along with any relationship attributes
• Ternary or higher degree relationships: construct
relationship table of keys, along with any relationship
attributes
• With all relationships, we must preserve referential
integrity, participation & cardinality constraints
15
1:1 Relationship
EMPLOYEE
1
Has
1
AUTO
16
1:N Relationship
FACULTY
1
N
Teaches
Class
Faculty
facId
name
department
rank
F101
Adams
Art
Professor
F105
F110
Tanaka
Byrne
CSC
Math
CLASS
classNum
facId
schedule
room
ART103A
F101
MWF9
H221
CSC201A
F105
TuThF10
M110
CSC203A
F105
MThF12
M110
HST205A
F115
MWF11
H221
MTH101B
F110
MTuTh9
H225
MTH103C
F110
MWF11
H225
Instructor
Assistant
F115
Smith
History
Associate
F221
Smith
CSC
Professor
17
M:N Relationship
Enrolls
STUDENT
CLASS
Student
Enroll
classNum
Class
stuId
lastName
firstName
major
credits
stuId
grade
S1001
Smith
Tom
History
90
S1001
ART103A
A
S1002
Chin
Ann
Math
36
S1001
HST205A
C
S1005
Lee
Perry
History
3
S1002
ART103A
D
S1010
Burns
Edward
Art
63
S1002
CSC201A
F
S1013
McCarthy
Owen
Math
0
S1002
MTH103C
B
S1015
Jones
Mary
Math
42
S1010
S1020
Rivera
Jane
CSC
15
classNum
facId
schedule
room
ART103A
F101
MWF9
H221
CSC201A
F105
TuThF10
M110
CSC203A
F105
MThF12
M110
HST205A
F115
MWF11
H221
ART103A
MTH101B
F110
MTuTh9
H225
S1010
MTH103C
MTH103C
F110
MWF11
H225
S1020
CSC201A
B
S1020
MTH101B
A
Student (stuId, lastName, firstName, major, credits)
Enroll(stuId,classNum,grade)
Class (classNum, facId, schedule, room)
18
Weak entities
• Weak entities become tables by adding
the PK of the parent (strong) entity
– What is the primary key in such a table?
• What about the total participation (and
existence dependence) constraint?
– What do we need to ensure?
– How do we do this?
19
RDBMS Rules Regarding FK’s
• Choices made when relationships are
established between tables
– Cascade deletes
• If parent instance is deleted, so are all its children
– Restrict deletes
• Cannot delete a parent instance if it has a child
– Set to NULL
• If parent instance is deleted, set the FK of the child to NULL
– Cascade updates
• If parent instance gets new PK, change the FK of all its
children
20
Total participation
• Referential integrity actions need to be
specified to ensure that
– When the parent entity instance is deleted,
the weak entity instance is deleted as well
– Each new weak entity instance must have a
parent instance with which to connect
• Other participation situations also require
careful attention.
– They must be reasoned out.
21
Weak Entity Example
22
Enforcing Minimum Cardinality
• If the minimum cardinality on the child is 1, at least one
child row must be connected to the parent
• A required parent can be specified by making the foreign
key value NOT NULL
• A required child can be represented by creating update
and delete referential integrity actions on the child and
insert referential integrity actions on the parent
• Such referential integrity actions must be declared during
database design and trigger codes must be written
during implementation
23
Subtype Relationship
24
Entity Supertypes and Subtypes
• Generalization hierarchy
– Depicts a relationship between a higherlevel supertype entity and a lower-level
subtype entity
• Supertype entity
– Contains shared attributes
• Subtype entity
– Contains unique attributes
25
Nulls Created with No
Supertypes/Subtypes
26
A Generalization Hierarchy
27
Disjoint Subtypes
• Also known as non-overlapping subtypes
– Subtypes that contain a subset of the
supertype entity set
– Each entity instance (row) of the supertype
can appear in only one of the disjoint
subtypes
• Supertype and its subtype(s) maintain a
1:1 relationship
– So, how is it implemented in an rdbms?
28
EMPLOYEE/PILOT
Supertype/Subtype Relationship
29
A Generalization Hierarchy
with Overlapping Subtypes
30
Recursive Relationships
• A recursive relationship is a relationship among
entities of the same class
• For 1:1 and 1:N recursive relationships, add a
foreign key to the relation that represents the
entity
31
1:1 Recursive Relationships
32
1:N Recursive Relationships
33
Some issues with conceptual
design using the ER model
• Design choices:
–
–
–
–
–
•
Should a concept be modeled as an entity or an attribute?
Should a concept be modeled as an entity or a
relationship?
Identifying relationships: Binary or ternary?
When to allow NULL values?
How should privacy concerns/sensitive data collection be
handled?
Constraints in the ER Model:
–
–
A lot of data semantics can (and should) be captured.
But some constraints cannot be captured in ER diagrams.
34
Entity vs. Attribute
• Should address be an attribute of EMPLOYEE or
an entity (connected to EMPLOYEE by a
relationship)?
• Depends upon the use we want to make of
address data, and the semantics of the data:
– If we have several addresses per employee, address
must be an entity
– If the structure (city, street, etc.) is important, e.g., we
want to retrieve employees in a given city, address can
be modeled as an entity or as several attributes
35
Entity vs. Attribute example
• Works_In does not allow
an employee to work in a
department for two or
more periods.
• Similar to the problem of
wanting to record several
addresses for an
employee: We want to
record several values of
the descriptive attributes
for each instance of this
relationship.
• Accomplished by
introducing new entity set,
Duration.
eid
to
from
name
dname
lot
did
Works_In
Employees
budget
Departments
name
dname
eid
lot
Employees
from
did
Works_In
Duration
budget
Departments
to
36
Entity vs. Relationship
since
name
• First ER diagram OK if a
manager gets a separate
discretionary budget for
each dept.
eid
dbudget
lot
Employee
dname
did
budget
Department
Manages
name
• What if a manager gets a
discretionary budget that
covers all managed depts?
Could
also be a
supertyp
e
eid
lot
dname
since
did
Employee
Manages
budget
Department
Is a
Manager
dbudget
37
Binary vs. Ternary Relationships
name
eid
Employee
What are the
differences
between these
two diagrams?
pname
lot
Covers
Design 1
age
Dependent
Policy
policyid
cost
name
pname
eid
lot
age
Dependent
Employee
Purchaser
Beneficiary
Design 2
Policy
policyid
cost
38
Binary vs. Ternary Relationships
• An example where a ternary relation is
required
– Contracts relates entity sets PART, DEPT and
SUPPLIER, and has attribute qty.
• No combination of binary relationships is an
adequate substitute:
• S “can-supply” P, D “needs” P, and D “deals-with” S
does not imply that D has agreed to buy P from S.
• And, furthermore, how would we record qty?
39
Null values
• A null value is an attribute value that has not been
supplied
• Null values are ambiguous as they can mean
– The value is unknown
– The value is inappropriate
– The value is known to be blank
• Inappropriate nulls can be avoided by
– Defining subtype or category entities
– Forcing attribute values through the use of not null
– Supplying initial values
• Ignore nulls if the ambiguity is not a problem to the users
40
Surrogate Key Example
41