Transcript db3

Database -- III
(Database Design)
By
Chandra s. Amaravadi
1
IN THIS PRESENTATION..
Database environment
Database planning
Data modeling
Database design
Database applications
2
DATABASE
DESIGN
3
DATABASE DESIGN
The objective of database design is to develop a set of
well structured tables so that:
Data is in the most efficient form
No uncontrolled redundancies
Queries/reporting facilitated
Database can be easily implemented
Design process is called normalization.
4
AN INEFFICIENT FORM
EMPLOYEE
EID
Name
Title
Dt. promoted
2356
Armstrong
Analyst
4/14/09
3286
Nickerson
Analyst, Sr.
Analyst
5/1/09; 2/5/11
Why is this an inefficient form?
5
THE DESIGN PROCESS
Putting data into its most efficient form is called normalization
Principle is to find an attribute set which is uniquely associated
with the value of an attr. Example:
Prod_ID  stock#
student ID -- gpa
student id, sem -- semester gpa
Called functional dependency
6
FUNCTIONAL DEPENDENCY
Functional dependency:
A relationship between two attributes a & b such that if we know a,
we can uniquely determine b
a --> b
; “a determines b”,
; “b is dependent on a”
Relationship between attributes (L -> R unless specified)
a  b is referred to as a functional dependency diagram
Each value of a is associated with one value of b (FD test)
For a given value of “b” there can be many values of “a”
7
FD DIAGRAM CONVENTION
a
b;
a
a
b;
b;
a
a determines b; b is determined by a;
b is dependent on a (also full F.D.)
for each a, multiple values of b
no functional dependency (generally not shown)
b, c, d;
a determines b, c, d
b, c, d;
a determines b, c, d
or
a
8
FD EXAMPLES
pp#
flt#
student id
player
name of issuing country
name of captain
GPA
team name
pp#
GPA
prod descr.
Student id
visa#’s
student id
prod price
award
price
gpa
descr.
descr.
Note: valid FDs are in top group
9
FUNCTIONAL DEPENDENCY DIAGRAMS
DRAWING FD DIAGRAMS:
The first step in design is to draw an FD diagram
Shows relationships among two or more attr.
List all attributes;
Candidate keys leftmost
for each attribute (other than candidate key) see what determines it:
If data is given, do FD test using data
If data is not given do FD test by imaginary cases
Note: candidate key is a key that can serve as the primary key
10
FD WHEN DATA IS GIVEN
PLOTS
parcelID
owner
zone
co-owners
11-33-984
Garner
residential
Midwest
National
12-42-933
Reid
commercial
Reid
67-83-983
Hengst
industrial
North West
bank & trust
68-94-985
Hengst
residential
Mckinney, Wall
Parcel ID  Owner?
Owner  Parcel ID?
Parcel ID  Zone?
Owner  Zone?
Zone  Co-owner?
11
FD WHEN DATA IS NOT GIVEN
Cust places multiple orders
A flight has multiple pilots
Each pet has a single owner
Relationship between ss#, user name (for an online store)?
User name, ss#, web site, company name, cust. credit card#
12
FD EXERCISES
A person (PID) can occur in one or more photographs
(designated by photoID).
In a manufacturing situation, each assembly consists of
a number of parts each with part#, cost, material and
acceptable tolerance. The assembly itself has a name
(unique), a description, instructions (one large text field) and
the parts of which it is comprised.
A ship has a number of characteristics including a
a unique name, a registry, dead weight tonnage (DWT), gross rated
tonnage (GRT), maximum displacement , deck equipment and
cargo type. Assume multiple cargo types. Further, the deck equipment
has an equipment ID, equipment type, maximum capacity and year installed
as attributes.
13
FD EXERCISES..
Draw a FD diagram for the following attributes:
ST#, C#, C_name, St_name, St_addr, St_ph, #units,
Sect#, GPA, grade, major, St_yr, text, room#, day, time
Assume:
1. A student has only one major
2. Grade refers to the grade a student receives in a course
3. A course can have several sections
4. Each section of a course can have a different text
5. Each section of a course is in a different room
6. day & time refer to the days of week and timing for section
14
MORE FD/ER EXERCISES..
A database design for aircraft maintenance work at a large airport is required.
The airport has a number of terminals (“T#”), each with its own set of hangars.
Hangars are designated by a H#, (H1, H2..) which is unique to a terminal, but
non-unique across the airport. Planes designated by FAA# are stored in
hangars. Each hangar can accommodate multiple planes. It is important to
know for each plane, the fuel type, the year manufactured, manufacturer, the
type of maintenance it is scheduled for (“engine overhaul,” “fuel filter
replacement,” “lubrication of wing equipment,” etc.), the start and end dates of
the maintenance. The manuals for each manufacturer are stored by shelf#. in
the hangar. Draw a) ER diagram, b) FD diagram for the situation above.
15
MORE FD/ER EXERCISES..
Double day inc., a book distributor wants to develop a database of its
publishers, book authors and book stores. A book distributor buys books from
publishers and sells them to book stores. Publishers such as McGraw Hill
and Prentice Hall sell books directly or more frequently rely on distributors.
They have a contact person in sales, with whom the distributor places orders.
Depending on the number of copies ordered [“volume”], publishers give
discounts to distributors, ranging from 25%-50 %. There are several titles per
order and a given title can be ordered multiple number of times. The
distributor needs information on the: Title, ISBN#, Volume Ordered, Qty in
stock, List price and Discount. The distributor needs to have information on
the publisher code, publisher name, address, contact name and contact phone.
Each book can have multiple authors. An author could write several books.
The distributor needs to have information on the author name address and
phone, so that they may be contacted for promotion purposes. For the stores,
the distributor wants information on the contact person, phone#, Store#, name
and location.
16
THE DESIGN PROCESS
17
DATABASE DESIGN..
Design: Process of grouping attributes into tables
Examples of FDs:
SS# ---> name, age, sex etc.
distance,class --> airfare.
ISBN# --> book title, price etc.
Determinant:
the LHS of the FD is known as determinant and is usually the pkey
Design:
Each FD is placed in a separate table with determinant as pkey
e.g. Students( SS#, name, age, sex)
Mistake here?
Fare( distance, class, fare)
Books( ISBN#, book title, price)
18
THE UNION RULE OF FD
If two attributes in two FDs share the same LHS, then it should
be listed as one FD.
Student ID  name
Student ID  major
Then
StudentID  name, major
19
DESIGN CONCEPTS
Primary key:
An attribute whose value is
unique within an entity class (table)
e.g. SS#, Part# etc.
Candidate key:
A key that can serve as
the primary key
Foreign key/
Cross-reference
key:
A key that serves as reference
between two relations
Customers
Orders
Cust#, Name, Address..
Ord#, Ord_dt, .. Cust#
When do you need foreign keys?
20
DESIGN CONCEPTS..
Determinant:
LHS of a functional dependency
Functional
dependency:
A relationship between two or more
attr. such that if we know A, we are able
To uniquely determine b, c..
Union rule:
If two FDs have the same LHS,
then they should be combined.
21
DATABASE APPLICATION
&
MANAGEMENT
22
FOR DISCUSSION
What is a good database application?
What sort of issues arise when using a database
as part of an application?
What sort of management issues arise when
a) developing a database?
b) operating a database? (i.e. other than security)
23
DATABASE WORTHY?
Course schedule?
An individual sales receipt?
Book catalog
Photos of buildings/artifacts at a historical place
24
DATABASE ADMINISTRATION (FYI)
The technical management of database systems
Database Administration Functions
Responsible of all phases of development
Establishing accts/system security, backup
Tuning the database
Establishing & enforcing database standards & guidelines
25
DISCUSSION QUESTIONS
What would be the smallest database that one can create? Explain.
How would you go about initiating a database project? Explain.
Is it possible to retrieve data from five tables simultaneously? Explain.
Under what design condition(s) are all attributes placed in a single table?
Explain.
How would a production DBMS differ from one that is used mostly by
end users? (Note a prodn. DBMS is one that is used as part of an IS, e.g. a
product database in an ordering system).
What sort of problems do you anticipate arising in very large databases?
A sales manager in a retail organization has a large database of sales
information. Who in the organization should analyze this info?
26
THE END!
27