Database Design for 499
Download
Report
Transcript Database Design for 499
The Database Development
Process
Dr. Emad M. Alsukhni
Modern Database Management
7th Edition
Jeffrey A. Hoffer, Mary B. Prescott,
Fred R. McFadden
© 2005 by Prentice Hall
1
Systems Development Life Cycle (cont.)
Project Identification
and Selection
Project Initiation
and Planning
Analysis
Analysis
2- logical database design
• The conceptual data model
must be transformed into
a logical data model, which describes
the data in terms of the data management
technology that will be used to implement
the database
1- Database activity –
Logical Design
conceptual data modeling
• Analyst produces a detailed
Physical Design
data model that identifies all the
organizational data that must be managed
for this information system( every data attribute is Implementation
defined, all categories of data are listed, every business
relationship between data entities is represented and every rule is
specified)
Dr. Emad M. Alsukhni
Maintenance
2
Enterprise Data Model
First step in database development
Specifies scope and general content
Its purpose is to create an overall picture of
organizational data at high level of abstraction,
not the design for a particular database
Entity-relationship diagram
Descriptions of entity types
Relationships between entities
Business rules
Dr. Emad M. Alsukhni
3
Figure 2-1 Segment from enterprise data model (Pine
Valley Furniture Company) [simplified E-R diagram,
repeat of figure 1.3]
Enterprise data model describes
the high-level entities in an
organization and the
relationship between these
entities
Dr. Emad M. Alsukhni
4
Systems Development Life Cycle (cont.)
(figures 2-4, 2-5)
Purpose – programming, testing, training,
installation, documenting
Deliverable – operational programs,
documentation, training materials
Project Identification
and Selection
Project Initiation
and Planning
Analysis
Database activity –
database implementation
Logical Design
• Designer writes, tests, and
installs the programs that process
Physical Design
the database
• The designer will finalize all database
Implementation
Implementation
documentation, train users, put procedures
into place for the ongoing support of information system
Maintenance
(and databse) users and load data from existing information sources
(files, and databases from legacy applications plus data now needed)
Dr. Emad M. Alsukhni
5
Systems Development Life Cycle (cont.)
(figures 2-4, 2-5)
Project Identification
and Selection
Purpose – monitor, repair, enhance
Deliverable – periodic audits
Project Initiation
and Planning
Analysis
Logical Design
Database activity –
database maintenance
Physical Design
• The designer adds, deletes, or
changes characteristics of the structure
Implementation
of a database in order to meet changing
business conditions
• Correct errors in database design
• Improve the processing speed of database applications
Dr. Emad M. Alsukhni
Maintenance
Maintenance
6
Dr. Emad M. Alsukhni
7
Database Schema
External Schema
This is the view (or views) of managers and other
employees who are the database users. Can be
represented as a combination of the enterprise
data model and a collection of detailed user views
user view is a logical description of some portion
of an enterprise database that is required by a
user to perform some task
Example, online order form used by customers of
PVFC
Dr. Emad M. Alsukhni
8
Database Schema
Conceptual Schema
Combines the different external views into a single, coherent
definition of the enterprise’s data
detailed specification of the overall structure of
organizational data that is independent of any database
management technology
Internal Schema: consists of two separate schemas
Logical schema: is the representation of data for a type of
data management technology (e.g., relational)
Physical schema: describes how data are to be represented
and stored in secondary storage using a particular DBMS
(e.g., Oracle)
Dr. Emad M. Alsukhni
9
Different people
have different
views of the
database…these
are the external
schema
The internal
schema is the
underlying
design and
implementation
Dr. Emad M. Alsukhni
10
Strategies for Development
First, you will develop a first-cut conceptual schema based on the
organization’s enterprise data model and a general understanding of
the organization's data requirements on a project
Then you will develop user views to represent data needs for the
various users of this database
In most cases, an analysis of the user views will yield new attributes
and possibly entities and relationships not shown on the conceptual
schema
So, the conceptual schema is augmented with these requirements ,
thus making the conceptual and external schemas consistent
This process of evolving the two schemas continues until the schemas
are fully defined
Then the internal schema (logical and physical models) are developed
Dr. Emad M. Alsukhni
11
CASE
Computer-Aided Software Engineering
(CASE) – software tools providing
automated support for systems
development
Dr. Emad M. Alsukhni
12
Relationship symbols
Entity
symbols
Attribute
symbols
A special entity
that is also a
relationship
Relationship
degrees specify
number of
entity types
involved
Relationship
cardinalities
specify how
many of each
entity type is
allowed
Dr. Emad M. Alsukhni
13
Sample E-R Diagram (Figure 3-1)
Dr. Emad M. Alsukhni
14
What Should an Entity Be?
SHOULD BE:
An object that will have many instances in
the database
An object that will be composed of multiple
attributes
An object that we are trying to model
SHOULD NOT BE:
A user of the database system
An output of the database system (e.g. a
report)
Dr. Emad M. Alsukhni
15
Figure 3-4
Inappropriate entities
System output
System user
Appropriate entities
Dr. Emad M. Alsukhni
16
Attributes
Attribute - property or characteristic of an
entity type
Classifications of attributes:
Required versus Optional Attributes
Simple versus Composite Attribute
Single-Valued versus Multivalued Attribute
Stored versus Derived Attributes
Identifier Attributes
Dr. Emad M. Alsukhni
17
Identifiers (Keys)
Identifier (Key) - An attribute (or
combination of attributes) that uniquely
identifies individual instances of an entity
type
Simple Key versus Composite Key
Candidate Key – an attribute that could be
a key…satisfies the requirements for being
a key
Dr. Emad M. Alsukhni
18
Characteristics of Identifiers
Will not change in value
Will not be null
No intelligent identifiers (e.g. containing
locations or people that might change)
Substitute new, simple keys for long,
composite keys
Dr. Emad M. Alsukhni
19
Figure 3-7 – A composite attribute
An attribute
broken into
component parts
Dr. Emad M. Alsukhni
20
Figure 3-9a – Simple key attribute
The key is underlined
Dr. Emad M. Alsukhni
21
Figure 3-9b – Composite key attribute
The key is composed
of two subparts
Dr. Emad M. Alsukhni
22
Figure 3-8 – Entity with a multivalued attribute (Skill) and
derived attribute (Years_Employed)
What’s wrong with this?
Multivalued:
Derived
from date employed and current date
Dr. Emad M. Alsukhni
an employee can have
more than one skill
23
Figure 3-19 – An attribute that is both multivalued and composite
This is an
example of
time-stamping
Dr. Emad M. Alsukhni
24
More on Relationships
Relationship Types vs. Relationship Instances
Relationships can have attributes
The relationship type is modeled as the diamond
and lines between entity types…the instance is
between specific entity instances
These describe features pertaining to the association
between the entities in the relationship
Two entities can have more than one type of
relationship between them (multiple
relationships)
Associative Entity – combination of
relationship and entity
Dr. Emad M. Alsukhni
25
Cardinality of Relationships
One-to-One
One-to-Many
Each entity in the relationship will have exactly one
related entity
An entity on one side of the relationship can have
many related entities, but an entity on the other side
will have a maximum of one related entity
Many-to-Many
Entities on both sides of the relationship can have
many related entities on the other side
Dr. Emad M. Alsukhni
26
Dr. Emad M. Alsukhni
27
Dr. Emad M. Alsukhni
28
Entities can be related to one another in more than one way
Dr. Emad M. Alsukhni
29
Relation
Definition: A relation is a named, two-dimensional table of
data
Table consists of rows (records), and columns (attribute or
field)
Requirements for a table to qualify as a relation:
It must have a unique name.
Every attribute value must be atomic (not multivalued, not
composite)
Every row must be unique (can’t have two rows with exactly the
same values for all their fields)
Attributes (columns) in tables must have unique names
The order of the columns must be irrelevant
The order of the rows must be irrelevant
NOTE: all relations are in
1st Normal form
Dr. Emad M. Alsukhni
30
Correspondence with E-R Model
Relations (tables) correspond with entity types
and with many-to-many relationship types
Rows correspond with entity instances and with
many-to-many relationship instances
Columns correspond with attributes
NOTE: The word relation (in relational
database) is NOT the same as the word
relationship (in E-R model)
Dr. Emad M. Alsukhni
31
Key Fields
Keys are special fields that serve two main purposes:
Primary keys are unique identifiers of the relation in question.
Examples include employee numbers, social security numbers,
etc. This is how we can guarantee that all rows are unique
Foreign keys are identifiers that enable a dependent relation
(on the many side of a relationship) to refer to its parent relation
(on the one side of the relationship)
Keys can be simple (a single field) or composite (more
than one field)
Keys usually are used as indexes to speed up the
response to user queries (More on this in Ch. 6)
Dr. Emad M. Alsukhni
32
Figure 5-5:
Referential integrity constraints (Pine Valley Furniture)
Referential
integrity
constraints are
drawn via arrows
from dependent to
parent table
Dr. Emad M. Alsukhni
33
Referential
integrity
constraints are
implemented with
foreign key to
primary key
references
Dr. Emad M. Alsukhni
34
Data Normalization
Primarily a tool to validate and improve
a logical design so that it satisfies
certain constraints that avoid
unnecessary duplication of
data
The process of decomposing relations
with anomalies to produce smaller,
well-structured relations
Dr. Emad M. Alsukhni
35
Well-Structured Relations
A relation that contains minimal data redundancy and
allows users to insert, delete, and update rows
without causing data inconsistencies
Goal is to avoid anomalies
Insertion Anomaly – adding new rows forces user to
create duplicate data
Deletion Anomaly – deleting rows may cause a loss of
data that would be needed for other future rows
Modification Anomaly – changing data in a row forces
changes to other rows because of duplication
General rule of thumb: a table should not pertain to
more than one entity type
Dr. Emad M. Alsukhni
36
Example – Figure 5.2b
Question – Is this a relation?
Answer – Yes: unique rows and no multivalued
attributes
Question – What’s the primary key?
Answer – Composite: Emp_ID,
Course_Title
Dr. Emad M. Alsukhni
37
Anomalies in this Table
Insertion – can’t enter a new employee without
having the employee take a class
Deletion – if we remove employee 140, we lose
information about the existence of a Tax Acc class
Modification – giving a salary increase to
employee 100 forces us to update multiple records
Why do these anomalies exist?
Because there are two themes (entity types) into one
relation. This results in duplication, and an
unnecessary dependency between the entities
Dr. Emad M. Alsukhni
38
Figure 5.22 Steps in
normalization
Dr. Emad M. Alsukhni
39
First Normal Form
No multivalued attributes
Every attribute value is atomic
Fig. 5-25 is not in 1st Normal Form
(multivalued attributes) it is not
a relation
Fig. 5-26 is in 1st Normal form
All relations are in 1st Normal
Form
Dr. Emad M. Alsukhni
40
Table with multivalued attributes, not in 1st normal form
Note: this is NOT a relation
Dr. Emad M. Alsukhni
41
Table with no multivalued attributes and unique rows, in 1st
normal form
Note: this is relation, but not a well-structured one
Dr. Emad M. Alsukhni
42
Anomalies in this Table
Insertion – if new product is ordered for order
1007 of existing customer, customer data must be
re-entered, causing duplication
Deletion – if we delete the Dining Table from Order
1006, we lose information concerning this item's
finish and price
Update – changing the price of product ID 4
requires update in several records
Why do these anomalies exist?
Because there are multiple themes (entity types) into
one relation. This results in duplication, and an
unnecessary dependency between the entities
Dr. Emad M. Alsukhni
43
Second Normal Form
1NF PLUS every non-key attribute is
fully functionally dependent on the
ENTIRE primary key
Every non-key attribute must be defined by
the entire key, not by only part of the key
No partial functional dependencies
Dr. Emad M. Alsukhni
44
Order_ID Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID Customer_Name, Customer_Address
Product_ID Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID Order_Quantity
Therefore, NOT in 2nd Normal Form
Dr. Emad M. Alsukhni
45
Getting it into Second Normal Form
Partial Dependencies are removed, but there
are still transitive dependencies
Dr. Emad M. Alsukhni
46
Third Normal Form
2NF PLUS no transitive dependencies
(functional dependencies on non-primary-key
attributes)
Note: this is called transitive, because the
primary key is a determinant for another
attribute, which in turn is a determinant for a
third
Solution: non-key determinant with transitive
dependencies go into a new table; non-key
determinant becomes primary key in the new
table and stays as foreign key in the old table
Dr. Emad M. Alsukhni
47
Getting it into Third Normal Form
Transitive dependencies are removed
Dr. Emad M. Alsukhni
48