Transcript name

Database Systems
(資料庫系統)
September 24, 2014
Lecture #2
1
Course Administration
• http://www.csie.ntu.edu.tw/~winston/courses/dms/
• HW #1 will be on the course homepage now
– It is due on Oct 8.
• Next week reading:
– R&G Chapters 3 & 4.1~4.2
2
TA Updates
• TAs
– 謝朋儒, Room 505, Monday 1:10~2:10 pm,
email: [email protected]
– 張人尹, Room 506, Tuesday 1:10~2:10 pm,
email: [email protected]
3
Possible Layered Architecture to
DMS
SELECT S.name
FROM Students S
WHERE S.sid = 123456
Applications
Queries (SQL)
Query Optimization
and Execution
Relational Operators
These layers
must consider
concurrency
control and
crash recovery
Files and Access Methods
Buffer Management
Disk Space Management
4
Chapter 2
Introduction to Database Design
5
Scenario
• Say if you are hired by iBeer Retailer as a computer
consultant.
• iBeer wants you to design its database system.
• How to design it?
6
Database Design
• Step 1: Requirements Analysis
– What application (e.g., queries, updates, ..) needs from the database?
– What data to store in the database?
– What operations are most frequent and subject to performance req.
• Step 2: Conceptual Database Design
– Data to be stored and the constraints
– Come up with the design: Entity-Relation (ER) model
– Sketch the design using pictures called entity-relationship diagrams.
• Step 3: Logical Database Design
– Implement the design: relational data model
– Easy to map ER diagrams into the relational data model (CH 3).
7
Requirement Analysis
• Requirement analysis:
– The Beer retailer wants to keep track of
• Beers on shelves
• Beer manufacturers: [name & address]
• Conceptual database design
– ER diagram
name
name
Beers
• Logical database design:
ManfBy
addr
Manfs
– Relational model
Manufacturer’s
names
Manufacturer’s
addresses
Beer names
Manufacturer's
names
台北市南昌路一
段4號
台灣啤酒
台灣菸酒公賣局
青島啤酒
台灣菸酒公賣
局
台灣生啤酒
台灣菸酒公賣局
台灣生啤酒
青島啤酒廠
??
青島啤酒
青島啤酒廠
Beer names
台灣啤酒
8
ER Model: Entity
(123: integer)
ssn
name
Employees
(‘Joe’: string)
(Joe, Alice, ..)
• Proposed by Peter Chen (BS NTU EE ‘68) in 1976.
• Entity: A real-world object distinguishable from other objects (e.g.,
Joe).
• An entity is described by a set of attributes.
– Each attribute has a domain of possible values (.e.g., 20-char. strings)
• Entity Set: a collection of similar entities (rectangle)
• Each entity in an entity set is uniquely identified by a key attribute.
9
ER Model: Relationship
(5/2001)
name
since
did
ssn
(Joe) Employees
dname
Works_In
budget
Departments
(finance dept)
• Relationship: Association among two or more entities
– Joe works in finance department.
– A relationship must be uniquely identified by the participating entities,
without reference to the descriptive attributes. For example the pair
<ssn, did>
• A relationship can have descriptive attributes.
– Joe has worked in finance department since 5/2001.
• Relationship Set: Collection of similar relationships.
10
ER Model: Relationship (An
Instance)
name
since
dname
budget
did
ssn
Employees
Mary
Joe
Alice
Peter
Works_In
3/3/93
2/2/92
3/1/92
2/1/92
1/1/92
Many-to-Many
Departments
Finance
Accounting
Research
Legal
11
Ternary Relationship
name
since
budget
did
ssn
(Joe)
dname
Employees
address
Works_In
Locations
Departments
(finance dept)
capacity
(Taipei)
12
Roles in Relationship
name
ssn
(Roles)
Employees
supervisor
subordinate
Reports_To
13
Key Constraints
• Describe at most once (entity) relationship
– Manages relationship: each department has at most one manager
(okay to have none).
– One department can appear at most once in Manages relationship
set, also called one-to-many relation.
name
dname
since
did
ssn
Employees
Mary
Joe
Alice
Peter
Manages
3/3/93
2/2/92
3/1/92
budget
Departments
Finance
Accounting
Research
Legal
14
More Key Constraints
Women
1-to-1
Men
Married
Give Birth
Babies
1-to-Many
Women
Many-to-Many
Men
Befriends
Women
15
Participation Constraints
• Describe all (entity) participation relationship
– Must every department have a manager?
•
–
If yes, this is a participation constraint
All Departments entities must participate in the Manages relationship
set (total participation).
since
name
ssn
did
lot
Employees
dname
Manages
budget
Departments
Works_In
since
16
Weak Entities
• A weak entity can be identified uniquely only by considering
the key of another (owner) entity.
–
–
–
Pname = partial key (of the weak entity set, i.e., “Dependents”)
Owner entity set and weak entity set must participate in a one-tomany relationship set (one owner, many weak entities).
Weak entity set must have total participation in this identifying
relationship set.
(Alicia)
name
cost
ssn
(Hao)
Employees
Policy
(2)
pname
age
Dependents
17
ISA (`is a’) Hierarchies
name
ssn
Employees
hourly_wages
subclass entities
•
•
hours_worked
Hourly_Emps
ISA
superclass entity
contractid
Contract_Emps
As in C++ and OO languages, attributes are inherited from
superclass.
A ISA B, every A entity is also considered to be a B entity.
•
Why using ISA?
•
•
Add descriptive attributes specific (make sense) to a subclass.
Identify entities that make sense to a relationship (policy).
18
ISA (`is a’) Constraints
name
ssn
Employees
hourly_wages
hours_worked
Hourly_Emps
ISA
contractid
Contract_Emps
• Overlap constraints: Can Joe be an Hourly_Emps as well as a
Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also have to
be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
19
Aggregation
•
•
Create relationship set from
relationship sets.
Aggregation: relationship set
turns into an entity set
–
ssn
name
Employees
Monitors
So that they can participate in
(other) relationships.
since
started_on
pid
pbudget
Projects
until
Sponsors
dname
did
budget
Departments
20
Design Guideline
1. Avoid redundancy.
2. Don’t use an entity set when an attribute
will do.
3. Limit the use of weak entity sets.
21
Avoiding Redundancy
• Redundancy occurs when we say the same thing
in two different ways.
• Redundancy is bad
– wastes space
– encourages inconsistency.
• The two instances of the same fact may become inconsistent
if we change one and forget to change the other instance.
22
Redundancy Example
name
Beers
name
ManfBy
addr
Manfs
manf
This design states the manufacturer of a beer twice:
as an attribute and as a related entity.
23
Fix Redundancy
name
Beers
name
ManfBy
addr
Manfs
This design gives the address of each manufacturer
exactly once.
24
Example: Bad
name
manf
manfAddr
Beers
This design repeats the manufacturer’s address once
for each beer.
Why is it bad?
•Manf updates its address.
•Loses the address if there are temporarily no
beers for a manufacturer.
25
Exercise 2.2 (R-G Book)
A university database contains information about professors
(identified by social security number) and courses
(identified by courseid). Professors teach courses; each of
the following situations concerns the Teaches relationship
set. For each situation, draw an ER diagram that describes
it.
• Professors can teach the same course in several
semesters, and each offering must be recorded.
26
• Professors can teach the same course in several
semesters, and only the most recent such offering needs
to be recorded.
• Every professor must teach some courses
27
• Every professor teaches exactly one course (no more, no
less)
• Every professor teaches exactly one course (no more, no
less), and every course must be taught by some professor
28
Exercise 2.3 (R-G Book)
• Professors have an SSN, a name, an age, a rank, and a research specialty.
• Projects have a project number, a sponsor name (e.g., NSF), a starting date, an
ending date, and a budget.
29
•
•
•
•
Graduate students have an SSN, a name, an age, and a degree program
Each project is managed by exactly one professor (known as PI)
Each project is worked in by one or more professors (known as Co-PIs)
Each project is worked on by one or more graduate students (known as RAs)
30
• When graduate students work on a project, a professor must supervise their work
on the project. Graduate students can work on multiple projects, in which case
they will have a potentially different supervisor for each one
• Departments have a department number, a department name, and a main office.
• Department has a professor (known as Chairman) who runs the department.
31
• Professors work in one or more departments, and for each department that they
work in, a time percentage is associated with their job
• Graduate students have one major department in which they are working on their
degree.
• Each graduate student must have another, more senior graduate student as an
advisor.
32
Summary
• ER model is popular for conceptual design
– Sketch the design of a database informally using pictures
• Basic constructs in ER model:
– entities, relationships, and attributes (of entities and relationships).
• Some additional constructs:
– weak entities, ISA hierarchies, and aggregation.
• Several kinds of integrity constraints:
– key constraints, participation constraints, and overlap/covering
constraints for ISA hierarchies.
• Design guideline in ER model
33
Entity Sets Versus Attributes
•
Modeling a concept with a new entity set
should satisfy at least one of the following
conditions:
–
–
It is more than the name of something; it has at least
one nonkey attribute.
or
It is the “many” in a many-one or many-many
relationship.
34
Example: Okay
name
Beers
name
ManfBy
addr
Manfs
•Manfs deserves to be an entity set because of the
nonkey attribute addr.
•Beers deserves to be an entity set because it is the
“many” of the many-one relationship ManfBy.
35
Example: Beers Entity not Needed
name
Beers
name
ManfBy
addr
Manfs
•Beers can be an attribute rather than an entity.
36
Example: Okay
name
manf
Beers
There is no need to make the manufacturer an entity
set, because we record nothing about manufacturers
besides their name.
37
Example: Bad
name
Beers
name
ManfBy
Manfs
Since the manufacturer is nothing but a name, and is
not at the “many” end of any relationship, it should
not be an entity set.
38
Don’t Overuse Weak Entity Sets
• Beginning database designers often doubt that anything
could be a key by itself.
– They make all entity sets weak, supported by all other entity
sets to which they are linked.
• In reality, we usually create unique ID’s for entity sets.
– Examples include social-security numbers, automobile VIN’s
etc.
39
When Do We Need Weak
Entity Sets?
• The usual reason is that there is no global
authority capable of creating unique ID’s.
• Example: it is unlikely that there could be
an agreement to assign unique player
numbers across all football teams in the
world.
40