Logical Data Model

Download Report

Transcript Logical Data Model

Data Warehouse
Fundamentals
Chapter 6: Relational Data Modeling
-- CONCEPTS, PRINCIPLES & APPROACHES--
Paul Chen
www.cs522.com (Please reference white papers on Data Modeling
at Seattle U teaching materials website)
Topics
1.
2.
3.
4.
5.
6.
7
Levels of Modeling
Relational Data Modeling—What is it? Types of Models
and Pre-Modeling Activities
Understand Terms and Terminology –Tool Demonstration
Conceptual Data Modeling: What, Why, When, Who?
Activity Description
Logical Data Modeling: What, Why, When, Who? Activity
Description
Physical Data Modeling- An Overview
Prototyping and RAD
Databases & Modeling
Databases
&
Modeling
Type of
Database
Relational
Database
Constructs
ERD & EER
Characteristics
Row/
Column
Multi-Dimensional
Database
Dimensional
Modeling
Cube
Distributed
Database
Distributed
Component
Object Model
Client
Object
(DCOM)
Object-Oriented
Database
Class
Diagram
New
Trend
Object
Object = Data + Operations(Services);
Entity = Data only
OLAP
DW
XML
UML
Topic 1: Level of Modeling
Descriptive: The dealer sold 200 cars last month.
Operational
(OLTP)
Primarily Two Dimensional
Database System
Explanatory: For every increase in 1 % in the interest,
auto sales decrease by 5 %.
Traditional DW
(OLAP)
Star Schema Cube
Predictive: Predictions about future buyer behavior.
Data Mining
Cube + sophisticated
analytical
tools
Level of Analytical Processing
Descriptive
SIMPLE QUERIES
& REPORTS
Explanatory
Predictive
“WHAT IF”
PROCESSING
DETERMINE IF
ANY PATTERNS
EXIST BY REVIEWING
DATA RELATIONSHIPS
ANALYZE WHAT
HAS PREVIOUSLY
OCCURRED TO
BRING ABOUT THE
CURRENT STATE
OF THE DATA
Normalized
Tables
Query
Dimensional
Tables
Roll-up; Drill Down
+
Statistical Analysis/Expert
System/
Artificial Intelligence
Classification & Value Prediction
DESCRIPTIVE MODELING
Relational Data Modeling using ER Diagram

Conceptual Data Model (Analysis - Requirements
Gathering; What’s it?)

Logical Data Model (Design-How is it?)

Physical Data Model (Implementation)
EXPLANATORY MODELING



Also called Dimensional Modelling (to be
discussed in chapter 7)
Ways to derive the database component of a data
warehouse
Every dimensional model (DM) is composed of one
table with a composite primary key, called the fact
table, and a set of smaller tables called dimension
tables.
PREDICTIVE MODELING (to be
discussed in chapter 10)

Similar to the human learning experience
– Uses observations to form a model of the
important characteristics of some phenomenon.

Uses generalizations of ‘real world’ and ability to
fit new data into a general framework.

Can analyze a database to determine essential
characteristics (model) about the data set.
Topic 2: Relational Data
Modeling-What’s it?
A data model is a collection of constructs, business
rules and sample data which together supports a
dynamic representation of real world objects and
events.



Constructs: entity-relationship diagrams (conceptual & logical
or functional) and tables
Business rules: constraints such as referential integrity rules
and operators (add, update, delete)
Sample Data for verification and prototyping: Verifying the
accuracy of the model.
Objectives

Testing the real system before building it.

Assisting in understanding an organization’s data
requirements.

Facilitating physical data base design.
Types of Models

Conceptual Data Model (Analysis - Requirements
Gathering; What’s it?)

Logical Data Model (Design-How is it?)

Physical Data Model (Implementation)
Pre-modeling Activities

Data collection phase

Facilitation techniques (for ex. JAD session)

Roles and responsibilities

Tools and repository
Naming standards
Modeling convention (What methods to use)
Data protection/backup and recovery procedures



Data Collection Phase *



Sampling and Investigating Hard Data
The Needs for Sampling:
Containing costs; Speeding up the data
gathering;
Improving effectiveness; Reducing bias
Sampling Design -Four steps:
1. Determine the data to be collected or described
2. Determine the population to be sampled
3. Choose the type of sample
4. Decide on the sample size
* Please review chapter 5
Kinds of Information Sought
in Investigation
Types of hard Data (other than interviewing and
Observation)






Memos
Signs on bulletins boards or in work areas
Corporate Web sites
Manuals
Policy handbooks
Record layout
Retina Scan
“That recent Tom Cruise movie, Minority Report, shows
advertising that targets each individual consumer as
they pass by the signage. That’s the extreme, but I can
see it going that way,” said St. Denis.
Five Steps in Interview
Preparation

Reading background material

Establishing interview objectives

Deciding when to interview

Preparing the interviewee

Deciding on question type and structure
Two Types of Questions –
Open-End Questions vs.
Closed Questions
Closed interview questions
Such as “ How many subordinates do you have?
Benefits:
 Getting to relevant data
 Keeping control over the interview
Drawbacks:
 Failing to obtain rich detail
 Intimidating the interviewee
Three Basic ways of
Structuring Interviews

Pyramid Structure: Starting from closed questions,
then gradually expand into open territory.

Funnel Structure: The reverse of pyramid structure
approach.

Diamond-Shaped: A combination of the two above
structures.
JAD (Joint Application
Development)

JAD sessions (also called facilitated session) are used to gather
information and feedback and confirm the results of
requirements gathering.

JAD sessions replace the traditional way of conducting a
series of interviews on a one-to-one basis with the users.
Advantages: Achieving consensus during the session when
multiple sources of information exist, raising and
addressing issues or assigning them for resolution, and
immediately confirming information.
Topic 3: Understand Terms and
Terminology

Independent entity

Dependent entity

Associative Entity

Identifying relationship

Non-identifying relationship
Understand Terms and
Terminology
Identifier
An attribute distinctly identifies each occurrence of
an entity.
For ex., bank account Id. , and student Id.
 Association (relationships)
An association is a relationship between two or
more entities.
Employee
works for company
Part
has
item

Understand Terms and
Terminology
Cardinality(the form of relationship)
Associations occur in there forms:
one-to-one; one-to-many; many-to-many
 Tables
A table is a two-dimensional representation of data
consisting of columns and rows.
 Primary Key
Used to identify entities.
Unique identification for a row in a table.
Allow no nulls and no duplicates.
May be system assigned.

Understand Terms and
Terminology


Foreign key
A foreign key is one or more data elements whose value is
based on the primary identifier of another entity, thus
allowing the system to ‘join’ and get related information from
other entities. The ‘joining’ of different entities in this manner
eliminates the need of data repetition and redundancy.
Normalization
A technique to make sure that the data in a logical model is
defined once and only once. Normalization helps minimum
data redundancy, and minimize update abnormalities.
Topic 4: Conceptual Data
Modeling: What?
What is it?
1.
It is a conceptual representation of data without
concern for its logical (functional) or physical
aspects.
2.
It is a set of high-level business data models
which provides a framework for the data
modeling activities at the next level.
Conceptual Data Modeling:
When?
When should it be done?
1.
In support of the data requirements of a process
model under development at the corresponding
level.
or
2.
Outside the system application lifecycle on a
department, division, or company wide basis.
Conceptual Data Modeling:
Who?
Who should do it?
1.
The group responsible for assuring that data
structure reflects business policies and rules.
2.
It should be a joint effort between the owners and
custodians of the data, the users of the data, and
the analysts.
Conceptual Data Modeling:
Why?

Documents the type of data (information) which
must be represented in a system independent of
specific application, organizations, or technology.

Maximizes data sharing; minimizing redundancy.

Provides foundations for physical database design.

Describes the unique business enterprise
specifically.
Conceptual Data Modeling:
Why?

Outside of application life cycle on a company-wide basis.

Data modeling expresses inherent associations which are the
most part, independent of anyone one application.

Data entities change very little even through the way they are
used can change for each application.

A complete maintained conceptual data model should shorten
the requirements definition phases of system development
life cycle.
Data Modeling: Approach

Data partitioning
Use a top-down approach to define the data
requirements of a system. The purpose is to divide
and conquer (from subject to entity), and to evolve
from the conceptual level to logical level until
physical database is derived.

Standard deliverables
For each of the levels, there is a set of standard
deliverables that must be produced. The
documentation items must be well defined so that
the data at each level is well understood.
Data Partitioning Via
Modeling
(How)
(What, Why, Who, Where)
Conceptual Level
Subjects
Entities
Technical considerations
Relationships
Logical Level
Physical
Level
Data Elements
Frequencies
Data Definition Language -DDL(create, Alter, drop tables)
Data Manipulation Language (select, insert, delete, update)
Conceptual Data Modeling -Activity Description
1. Define the system boundary – Data Context Diagram.
2. Partition a subject into entities.
3. Discover entities -super-type and subtype; part and
whole.
4. Define associations between entities.
5. Define major attributes.
6. Define unique identifier for the entity.
7. Assign cardinalities and set up relationships between
the entities.
8. Validate the model with the users.
1. Define the System Boundary by
Subject Context Diagram

A subject (a group of entities with strong affinity) is a
class of data objects representing the mission and
resources of the organization.

“Subjects” provides mechanisms for controlling how
much of a model a reader (user, analyst, manager) is able
to consider and comprehend at a time. For a small system,
go directly to define entities.
For ex: Library (subject) decomposed into book, member,
and account. (entities).
ATM ERD –Subject Context
Level
Customer
uses
ATM
Has
Owns
Bank
Consortium
Consists
of
Affiliated
Bank
Account
Holds
Data (Subject) Context
Diagram

A Data context diagram is a special case of ERD in which a
single diagram represents the problem domain in terms of
data requirements.
For example: The ATM diagram illustrates and highlights
Several important characteristics of the system:


The people and organization with which the
system communicates.
It documents the significant connections (relationships)
between the data objects within as well as outside the
problem domain.
2. Partition a Subject into
Entities
Criteria for partitioning a subject into entities
are:
1. The entities included in a subject all tend to describe the
subject and have a strong affinity with the subject.
2.
The entities of a subject should be of equal importance, as
measured by the range, complexities or importance of their
data.
3.
Each entity should belong to only one subject.
3. Discover Entities

Identifying data objects via business event analysis
An individual stimulus from one data object to another is an
event. Events are discovered by investigating the external
influences that act upon a system, and the data transformation
that occurs within a system that converts inputs into outputs.
Thus, source and target data objects that interact with an event
can be identified. For example, an event “customer buys a
product” as depicted below is initiated by the data object
“customer” who requires a response from the data object
“sale”, “product”, and “payment”.
Customer
Sale
Customer buys
A Product
Payment
Product
Types of Entities
To find potential entities (entities are nouns), look for:
1.
2.
3.
Objects can be generalized or specialized
The entities of a subject should be of equal
importance, as measured by the range,
complexities or importance of their data.
Each entity should belong to only one subject
Subtype and Super-type
Faculty
Full Time
Faculty
Part Time
Faculty
Part (day)
Time
Faculty
Part (Night)Time
Faculty
Generalization &
Specialization
Generalization
Commercial
747
Specialization
Aircraft
Military
777
B52
B-1B
4. Define Associations
Between Entities
Use a verb to describe associations between two or
more entities that the user wants to keep track of. Each
relationship must be specific as possible so that its
meaning is clear.
An individual owns a building.
Owns: possession, rental; or management?
5. Define Major Attributes
All the attributes of an entity must have meaning for each
and every one of the occurrence of the entity. Only
elementary data are included in the model. Attributes
resulting from process algorithms should not be included
in the model.
For ex. Entity ‘individual’ has attributes such as name,
address; sex (male or female); no. of dependents, etc. But
Derived attributes (such as percentage) are not
attributes.
6. Define Unique Identifier for
the Entity
This is an attribute that unambiguously identifies each
occurrence of each entity. Some entities do no have their
own identifier. These entities are qualified as dependent
or week entities. The identifier of the entity to which the
dependent entities are associated with must be used to
uniquely identify their occurrences.
For ex., a ‘child’ entity must have his or her parent
identifier to be uniquely identified.
7. Assign Cardinalities & Set Up
Relationships Between the
Entities
Cardinality is the minimum and maximum number of
times an occurrence of an entity occurs in relationship
to another entity.
The minimum number: 0 or 1
The maximum number: 1 or M
There are three types of associations:
One-to-one; one-to-many; many to many.
Types of Relationships By Degree (#
of Attributes the Relation Contains)
The relationships (representing business rules)
could be either as binary, recursive, or ternary.
Binary
Doctor
Patient
Part
Order
Ternary
Part/Order
Recursive
Organization
Recursive Association
A recursive association is one in which there is a relationship between
An entity and itself.
Information Dev
Accounting
Payable
Engineering
Receivable
Facility
Product
Nuclear
Coal
8. Validate the Model with the
Users

The approach requires that the users be involved at
the outset of the data modeling activity until the end
of its implementation. They work side-by-side with
system developers, providing input and validating
the accuracy of the data requirements. This will
ensure that the delivered end-product meets the
users’ need.
Validate the Model with the Users
–
To identify and document the integrity
constraints given in the user’s view of the
enterprise. This includes identifying:
»
Required data
»
Referential integrity
»
Attribute domain constraints
»
Enterprise constraints
»
Entity integrity
Validate the Model With
These Check Points






Attribute allocation
Rules followed
Cardinality necessity
Relationship necessity
Achievement of organization goals
Contributions to expected benefits
Topic 5: Logical Data Modeling:
What, Why, When, Who? And
Activity Description
What is it?



It is a representation of data required to support the
complete business needs for a particular business
area, system or project.
It is a set of data models that provides a framework
for the physical database construction activities.
It is a graphical representation of data objects that
shows the relationship between the tables, views
and functional core services used by modules in the
application system.
Logical Data Modeling:
What, Why, When, Who?
Why do it?




Document the type of data which must be
represented in a system with regard to specific
system applications, organizations, or technologies.
Assist in the orderly creation of a physical database
design.
Specifically describe the unique business
enterprise.
Accelerate and clarify communications between the
functional analysis and DBA’s.
Logical Data Modeling:
What, Why, When, Who?
When should it be done?



Part of the system application lifecycle.
In parallel with process modeling activities.
Upon the completion of the conceptual data model
to produce a first-cut database design that includes
definitions of tables, columns, and constraints.
Logical Data Modeling:
What, Why, When, Who?
Who should do it?

The group responsible for ensuring that data
structure reflects business data requirements.

It should be a joint effort between the functional
analysts and data administrators.
Logical Data Modeling:
What, Why, When, Who?
Benefits

Provide a definition of the data architecture of how
the target system will be implemented.

Model parts of the database schema that show how
data structures are related to the processes.

Provide program designers with the detail for the
part of the database design that their modules use.
Logical Data Modeling:
Activity Description
1.
2.
Define Data Architectural Standards
Position the conceptual data modeling and
revise the definitions of the entities.
3. Define integrity rules for entities and
relationships and Apply normalization rules to
each entity.
4. Complete and standardize the data elements.
5. Package the model for physical data modeling and
system construction.
6. Evaluate quality of data for conversion.
7. Validate and Verify the Data Model
1. Define Data Architectural
Standards
•
•
•
Data Access/retrieval guidelines
Naming convention; SQL coding standards;
Data integrity (package; trigger;commit; rollback)
Error handling; record locking rule; update
collision.
Data Security
Data access rule; data separation rule;
Data recovery/backup
Data base refresh/performance tuning
2. Position the Conceptual Data
Modeling and Revise the
Definitions of the Entities

Position the conceptual data modeling and revise
the definitions of the entities. By taking
architectural standards into consideration, the
complete CDM is reexamined. As a result, new
entities and relationships my be discovered.

Partition data into entities at the table level.
Continued

Map Local Conceptual Data Model to Local Logical
Data Model
 To refine the local conceptual data model to
remove undesirable features and to map this
model to a local logical data model. This
involves:
 (1)
 (2)
 (3)
 (4)
 (5)
 (6)
 (7)
Remove M:N relationships.
Remove complex relationships.
Remove recursive relationships.
Remove relationships with attributes.
Remove multi-valued attributes.
Re-examine 1:1 relationships.
Remove redundant relationships.
Removing M:N Relationship
Removing Complex
Relationship
Removing Recursive
Relationship
Removing Relationship with
Attribute
Removing Multi-valued
Attribute
Re-examine 1:1 relationships.
Remove redundant
relationships
3. Define Integrity Rules and
Apply Normalization Rules

Integrity rules for entities indicate the context in which an
entity occurrence may be created, modified, or deleted.
They also ensure that the entity is consistent with other
entities. This is accomplished by placing referential attributes
in each appropriate entity on the model.
For example, a Client (entity) holds an Account (entity). A
client cannot be deleted if at least one of his accounts has a
balance greater than 0.

Apply normalization rules to each entity.
Formalizing a One-to-one
Relationship with Referential
Attribute
Husband
*Husband name
Other attributes
Wife
Married
to
Referential Attribute
*Wife name
Other attributes
Husband name
Formalizing a One-to-Many
Relationship with Referential
Attribute
Dog
* Dog Id
Other attributes
Dog Owner Id
Dog Owner
(1:M)
(1:1)
Referential Attribute
* Dog Owner
Dog Owner Id
Other attributes
Formalizing a Many-to-Many
Relationship with Referential
Attribute
Part
Order
*Part Id
Other attributes
*Order No
Other attributes
Referential Attributes
Order/Part
*Order No
*Part Id
Other attributes
An associative entity may
Participate in relationship
With other entity.
A Many-to-Many Relationship
A many-to-many relationships will result in the
creation of a new entity.
Order
Order #
Part
Part #
1:M
1:M
Part/Order
Part #/Order #
Referential Integrity
Three options:



Restrict: A primary key can not be deleted if there
are any dependent foreign key rows.
Cascade: Deleting a primary key row causes the
deletion of all dependent foreign key rows.
Set Null: Deleting a primary key row causes all
dependent foreign keys values to be set null.
Apply Normalization Rules
A technique to make sure the data in a logical data
models is defined once and only once. Normalization
helps minimum data redundancy, and minimize
update abnormalities. Three forms:



First Normal Form
Second Normal Form
Third Normal Form
Normalization

First Normal Form: Relationships between primary
key and each attribute must be one-to-one; ie.,
remove repeating group.

Second Normal Form: All non-key elements are
dependent upon the entire primary key rather than
any part thereof.

Third Normal Form: Elimination of the
dependence of non-key field upon any other field
excepts the primary keys.
PK: Primary Key
FK: Foreign Key
NN: No Null
ND: No duplicate
Order
Part
Relationship
Order/Part
OrderNo (PK)
part-no
Qty
PK
PK
NN
partname
ND
+
FK
FK
1
1
123
Nut
3
5
123
Bolt
First Normal Form
Item Table
Item No
Qty-Store-1
Qty-Store-2
Qty-Store-3
PK
101
3000
4000
5000
The above is an violation of first normal form
because there exists a repeated group.
Rule Number 1

For each occurrence of an entity, there is only one
and only one value for each its attributes. Attributes
with repeating values form at least one new entity.

N other words, relationship between primary key
and each attribute must be one-to-one.
Possible Solution
Store
Store
ID
PK
S1
S2
Store/Item
Store
ID
PK
ItemNo
Qty
Sold
+
FK
FK
S1
101
3000
S2
102
4000
Second Normal Form
Student/Course
Course
No
Student
No
PK
FK
ST01
ST02
Grade
Teacher
code
Course
Name
+
FK
FK
100
200
3.0
4.0
T2
T1
Math
CS
Lee
Doe
Both course name and student name should be removed because
They are not related to the entire student/course primary key.
Possible Solution
Student No
Student
Student
Name
Course No
Student/Course
Course
Name
Rule Number 2

Each attribute must be related to the entire primary
key.
Second Normal Process
Order
Order
No
PK
1
3
Part
Part
Name
PartNo
Order-Dt
Pt-price
PK
1/2/01
1
Nut
1.5
1/3/01
5
Bolts
2.0
Order/Part
Orde
r
No PK
11
3
Partno
QTY
+
1
123
5
123
How about
Putting PartName
In Order/part
Table?
Third Normal Form
COURSE
Course
Id
Course
Name
Dept -Id
Teacher
Code
Dept
Name
Teacher
Name
PK
MH400
Math
A1
CS401
DB
CS
T1
T2
Math
DOE
CS
Lee
The relationship between any two non-primary key components
must not be one-to-one. What’s wrong with the above?
Rule Number 3

The relationship between any two non-primary key
components must not be one-t-one; ie., remove
tables within tables.
The Normal Process
Order
Customer
Cust-Id
Cust-Name
PK
1
Lee
3
Sato
Order
ID
PK
1
5
Order
Cust-Id
DT
FK
1/2/ 01
1
1/5/21
3
It would be a violation of third normal form to place custname in the order table.
Why
Reasons:
1.
2.
3.
4.
One-to-one relationship between two nonprimary key columns (Cus-Id and Cust-name).
Redundancy
An update anomaly (when a customer name was
changed)
Worse yet when a new name was added (the
name could not be stored until the customer
placed at least one order)
Identify Integrity Constraints
–To
identify and document the integrity constraints
given in the user’s view of the enterprise. This
includes identifying:
»Required
data
»Referential
»Attribute
domain constraints
»Enterprise
»Entity
integrity
constraints
integrity
4. Complete and Standardize the
Data Elements
As a result of the modeling process via the preceding
procedure, an information model will emerge.
The information model can be used as input (for ex., via
Erwin Tool) to generate a data definition language (DDL)
which in turn is used as input for physical data model.
The information model (also called logical data model)
fulfills the data requirements of the system.
Complete and Standardize the
Data Elements (Continued)
For each entity, identify the associated list of
attributes. For each element, specify the
following:
»
Permitted value
»
Coding and editing rules
»
Dimensions
»
Length
»
Value
»
Frequency
5. Package the model for
physical data modeling and
system construction.
To do this, one must have:
1.
2.
3.
A normalized entity relationship diagram.
A description of table and column definitions.
A description of data architecture standards.
6. Evaluate Quality of Data
For Conversion
If the data modeling is part of re-engineering efforts,
we must also document:



Condition of the data of the existing system
Impacts on the new and enhanced system.
Conversion rules
7. Validate and Verify the
Data Model

Validation (dynamic):
Prototyping is used to validate and refine the
model.

Verification (static): Inspection or walk-through
Inspections for entity necessity, relationship
necessity, and attribute allocation.
7 Validate and Verify the
Data Model (continued)

Validate Model against User Transactions
To ensure that the logical data model supports
the transactions that are required by the user
view. (Prototyping is a good tool)

Draw Entity-Relationship Diagram
To draw an Entity-Relationship (ER) diagram
that is a logical representation of the data given
in the user’s view of the enterprise.
Validate Model Against
User Transaction

Example transactions
(a) Insert details for new members of staff.
(b) Delete details of a member of staff, given the staff
number.
Topic 6: Physical Data
Modeling-An Overview
Step 1
Translate global logical data model for target DBMS
 Step 2
Design physical representation
 Step 3
Design security mechanisms

Step 1: Translate global logical
data model for target DBMS

To produce a basic working relational database
schema from the global logical data model

Design base relations for target DBMS
 To decide how to represent the base relations we have identified
in the global logical data model in the target DBMS.

Design enterprise constraints for target DBMS
 To design the enterprise constraints for the target DBMS.
Step 2 : Design physical
representation
To determine the file organizations and access methods that
will be used to store the base relations; that is, the way in
which relations and tuples will be held on secondary storage.





2.1 Analyze transactions
2.2 Choose file organizations
2.3 Choose secondary indexes
2.4 Consider the introduction of controlled redundancy
2.4 Estimate disk space requirements
Step 2 : Design physical
representation (Continued)

2.1 Analyze transactions
 To understand the functionality of the
transactions that will run on the database and to
analyze the important transactions.

2.2 Choose file organizations
 To determine an efficient file organization for
each base relation.
Typical Disk Configuration
Analyze Transactions

For each Transaction associated with the
components of the data model (usually
predefined queries including view, trigger,
procedure, function and package), it needs
to be broken down into further smaller
units of work:
Transactions Analysis
(continued)

A. Transformation Rules: Describe the rules
(R,U,I, D) or algorithms used to transform data
received into data generated.

B. Edit and Error Rules: Define the rules
validating data received and the method of
processing erroneous data.

C. Sequence Analysis: Describe under what
conditions this transaction is performed and what
rules determine which transaction will be
performed next.
Cross-referencing
Transactions and Relations
Transactions Analysis
(continued)

D. Audit Rules: Describe the rules required to
audit the activity performed within this transaction.

E. Security Rules: Define the security required to
invoke the transaction or various facets of the
transaction.
Transactions Analysis
(continued)

F. Frequency of execution: Define the number of
times this transaction is performed in a fixed period
of time.

G. Type of transaction mode: Describe whether
the transaction is batch, on demand, or interactive.
Example - Sample
Transactions
(A)
(B)
(C)
(D)
Insert details for a new member of staff, given
the branch address.
List rental properties handled by each staff
member at a given branch address.
Assign a rental property to a member of staff,
checking that a staff member does not manage
more than 10 properties already.
List rental properties handled by each branch
office.
ER Model for Sample Transactions
showing Expected Occurrences
Analysis of Selected
Transaction C
Step 2 Design Physical
Representation (continued)

2.3 Choose secondary indexes
 To determine whether adding secondary indexes
will improve the performance of the system.

2.4 Consider the introduction of controlled
redundancy
 To determine whether introducing redundancy
in a controlled manner by relaxing the
normalization rules will improve the
performance of the system.
Step 2.3 Choose secondary
indexes
Data File: The file contains the logical record.
Index File: The file contains the index file.

The values in the index file are ordered per the
indexing field which is usually based on a single
attribute.
Indexes



Primary index: The indexing field is guaranteed to
have a unique value.
Secondary Index: An index that is defined on a nonordering field of of the data.
Clustering index: If the index field is not a key field
of the file, so that there can be more than one record
corresponding to a value of the indexing field.
Step 2.4 Consider the
introduction of controlled
redundancy




Simplified Relation with Derived Attribute
Duplicating Attribute
Setting up Lookup Table
Duplicating Foreign Key
Step 2 Design Physical
Representation (Continued)

2.5 Estimate disk space requirements
 To estimate the amount of disk space that will be
required by the database.
Step 3 Design Security
Mechanisms

3.1 Design user views
 To design the user views that were identified in Step 1 of
the conceptual database design methodology.

3.2 Design access rules
 To design the access rules to the base relations and user
views.
Use Hotel Case for illustration
Guest
Hotel_no
Guest_no
Date_from
Date_to
Room_no
Guest_no
Guest_name
Guest_address
Hotel
Hotel_No
Hotel_name
City
Registration
Room
1:1
1:M
Identifying Relationship
Room_no
Hotel_no
Type
Price
Dependent Entity (Attribute Entity)
Data Partitioning –using
Hotel as a case study
(How)
(What, Why, Who, Where)
Conceptual Level
Subjects (Hotel)
Entities
Technical considerations
Relationships
Logical Level
Data Elements
Frequencies
Data Definition Language -DDL(create, Alter, drop tables)
Data Manipulation Language (select, insert, delete, update)
Conceptual Data Modeling
(Breaking the Subject Hotel into several
entities.
Guest
Guest_no
1:M
Hotel
Hotel_No
Books
1:M
Room
1:1
Has
1:M
Identifying Relationship
Room_no
Dependent Entity (Attribute Entity)
Logical Data Modeling
Booking
Guest
Hotel_no
Guest_no
Date_from
Date_to
Room_no
Guest_no
Guest_name
Guest_address
Hotel
Hotel_No
Hotel_name
City
Room
1:1
1:M
Identifying Relationship
Room_no
Hotel_no
Type
Price
Dependent Entity (Attribute Entity)
Resolving Referential
Attributes & Normalization
Item (such as TV, Bed)
Item No
Qty_Hotel
_no-1
Qty_ Hotel
_no -2
Qty_ Hotel
_no -3
PK
101
6
9
14
The above is an violation of first normal form because there
exists a repeated group.
Relationships between primary key and each attribute must
be one-to-one.
Possible Solution
Hotel
Hotel
ID
Hotel/Item
Hotel
name
PK
PK
H1
H2
Hotel
ID
Min-nan
Xiamen
ItemNo
+
FK
FK
H1
101(TV)
H2
Qty
102
6
5
Second Normal Form
Room/Hotel
Room
No
101
Hotel
No
4
Type
double
Price
Hotel name
100
Hotel Name should be removed because
it is not related to the entire room/hotel primary key.
Xiamen
Third Normal Form

The relationship between any two nonprimary key components must not be one-tone; ie., remove tables within tables.
Third Normal Form
City
City
ID
Hotel/City
City
name
C2
Hotel name
City
Id
PK
PK
C1
Hotel
ID
Las
Vegas
Seattle
H0
Circus
C1
H1
Flemingo
C1
H2
Holiday
C2
What happens if a new City name is added to the Hotel/City Table?
What’s the Referential Integrity
Hotel vs Room Tables?

Restrict

Cascade

Set Null
Physical Data Modeling
• Data Definition Language -DDL(create, Alter, drop
tables)

Data Manipulation Language (select, insert, delete,
update)
Data Manipulation Language



SELECT DISTINCT COUNT(Guest_No)
FROM Booking
WHERE Date_From > 08/01/2000 AND <
08/31/2000);
Data Definition Language -DDL




CREATE TABLE hotel (
hotel_no
char(18) NOT NULL,
hotel_name
char(18) NULL
)
Note: Primary key is not null.
Topic 7: Prototyping
Purposes:
To validate and refine the model of a
system.
Characteristics: Prototyping is a discipline of
interactive experimentation to stimulate user
feedback.
Approach: A prototype is a materialization of
modeling process by building rapidly and simulating
the essential aspects of the system.
Software Modeling/Prototyping With
Built-in Testing Validation/Project Evaluation &
Control
Existing System
Target System
Physical Model
Physical Model
Built-In Testing
Validation
Prototyping
Project Evaluation/Control
Logical Model
Logical Model
Approach: Structured or Spiral or Iterative Approach
Rapid Application Development
CASE tools; Deliverable Templates
Types of Prototyping

Nonworking Scale Prototyping (Mock-up)

Straw man (exploratory) prototyping (scale model
approach)

First Full-Scale Prototyping (Real world model)
Types of Prototyping

Evaluative (mock-up) prototyping (Blue print
approach)
Use: Visualization; demonstration; inspection
Advantages. Reduce risk; low cost; resolve
uncertainty
Early; fast; flexible
How: Focuses on a relatively small number of
questions to avoid becoming too complex.
Generally thought of a throw-away.
Types of Prototyping

Straw man (exploratory) prototyping (scale model
approach)
Use: Experiment; validating and refining the
conceptual as well as logical data model.
Advantages. Used to improve design; discover
things about a proposed system that would
otherwise not be revealed.
How: Evaluate the impact on work flows; validate
ease of use.
Types of Prototyping

Evolution prototyping (real world model)
Use: Production use.
Advantages. Part of production exercise.
How: Explore functionality of subsystem and
system; probe technical feasibility the performance
and the integrity of the system.
Stages of Prototyping

Planning

Initial analysis and design

Construction

Tryout

Evaluation

Disposal
Dimensions of Prototyping
Relationships of any prototyping to its eventual
system are characterized along four dimensions:

Focus – for example, a prototype may focus on the
functionality, user interface, system integration,
reliability, and performance.

Scope– Is a measure of how much of the eventual
system the prototype represents.
Dimensions of Prototyping

Depth – is a measure of how deeply it represents
the behavior of the eventual system. For ex., a
shallow prototype of a message system might
display ‘canned’ messages, where a deeper
prototype might actually perform communications
to provide a more realistic surrogate for the
eventual system.

Scale– Is a measure of how its size or performance
compares with that of the eventual system.
Use Hotel case for illustration
Guest
Hotel_no
Guest_no
Date_from
Date_to
Room_no
Guest_no
Guest_name
Guest_address
Hotel
Hotel_No
Hotel_name
City
Room
1:1
1:M
Identifying Relationship
Room_no
Hotel_no
Type
Price
Dependent Entity (Attribute Entity)
Guest Registration
Guest Registration
Prototyping for Hotel Case



Straw man (exploratory) prototyping (scale model
approach)-use Microsoft Access to build a form to
simulate the registration of guests.
Straw man (exploratory) prototyping (scale model
approach)- build a small set of tables to accept the
data.
Evolution prototyping (real world model) –actually
building a mini-system with real tables to simulate
the system.
Rapid Application
Development (RAD)

RAD, or rapid application development, is
an object-oriented approach to systems
development that includes a method of
development as well as software tools
RAD Phases

There are three broad phases to RAD:
 Requirements planning
 RAD design workshop
 Implementation
Requirements Planning Phase
Users and analysts meet to identify
objectives of the application or system
 Oriented toward solving business problems

RAD Design Workshop
 Design
and refine phase
 Use group decision support systems to help
users agree on designs
 Programmers and analysts can build and show
visual representations of the designs and
workflow to users
 Users respond to actual working prototypes
 Analysts refine designed modules based on user
responses
Implementation Phase


As the systems are built and refined, the new
systems or partial systems are tested and introduced
to the organization
When creating new systems, there is no need to run
old systems in parallel
RAD and the SDLC



RAD tools are used to generate screens and exhibit
the overall flow of the application
Users approve the design and sign off on the visual
model
Implementation is less stressful since users helped
to design the business aspects of the system
When to Use RAD

RAD is used when
 The team includes programmers and analysts
who are experienced with it
 There are pressing reasons for speeding up
application development
 The project involves a novel ecommerce
application and needs quick results
 Users are sophisticated and highly engaged with
the goals of the company
Using RAD Within the SDLC


RAD is very powerful when used within the SDLC
It can be used as a tool to update, improve, or
innovate selected portions of the system
Disadvantages of RAD




May try and hurry the project too much
Loosely documented
May not address pressing business problems
Potentially steep learning curve for programmers
inexperienced with RAD tools
Final Words

Transform data into information by
understanding the process

Transform information into decisions with
knowledge
 Transform
actions
decisions into results with