Transcript Lec. notes

ISOM
MIS3150 Data and Information
Management
Lecture 2 - Data and Process Modeling
Arijit Sengupta
Structure of this semester
ISOM
MIS3150
1. Design
0. Intro
Database
Fundamentals
Conceptual
Modeling
Relational
Model
2. Querying
Query
Languages
Advanced
SQL
3. Advanced
Topics
Transaction
Management
4. Applications
Java DB
Applications –
JDBC
Data
Mining
Normalization
Newbie
Users
Designers
Developers
Professionals
Today’s Buzzwords
ISOM
•
•
•
•
•
•
•
Data Modeling
Process Modeling
Data Flow Diagrams
Entity-Relationship Models
Cardinality and Participation Constraints
Weak Entities
Generalization Hierarchies
So, where are we?
ISOM
Proposal Analysis
Requirements
Normalization
Modeling
Schema design
Design
Tables
Indexes
Queries
Optimization
Implementation
Testing
Installation
Objectives of this lecture
ISOM
• Describe the process inherent in a system
• Present a system process in a concise
diagrammatic form
• Describe the system data in terms of
conceptual objects and relationships
between them
• Translate such conceptual descriptions into
actual tables
Benefits of Conceptual Design
ISOM
• Projects without a strong conceptual design
are more likely to fail
• Design is one of the most important aspects
of project and business process quality
management standards:
 ISO 9000
 CMM
• Designs are typically network structured, not
flat like databases
• Literature in Relational Model shows Benefits
of Conceptual Design in user performance
Database Modeling
ISOM
• Process Models




Overview of process components
Inputs and outputs of different processes
Data sources and destinations
Mode of data flow between processes
• Data Models



Model only the data, no process
Different components of the data
Relationships between primary data
components
Models, method, and media
ISOM
• A model
 describes business or organization
 separates operation from technology
• Good modeling requires good methodologies
 encompass data, process, decisions
 richly expressive and provide for levels of analysis
 simple representation
• Modeling medium
 both formal and visual
Data Flow medium
ISOM
• Notation:
 Source: box
 Process (transform): box with rounded corners
 File (store): box open on right
 Destination: box
 Flow: arrow
• Structure:
 “Explosion” of processes (recursion on structure)
Data Flow Diagrams
ISOM
DFD rules
ISOM
• Start with a very basic overview of complete
process, showing only the most important
processes, sources, destinations, and files
• Recursively “explode” each of the processes
(note: processes only!):
 preserve inputs and outputs
 preserve file accesses
 new processes, files and sources/destinations can
be created, but cannot be used from previous
levels if not directly used in the previous level
Overview of Data Modeling
ISOM
• Conceptual design: (ER Model is used at this
stage.)





What are the entities and relationships in the
enterprise?
What information about these entities and
relationships should we store in the database?
What are the integrity constraints or business rules
that hold?
A database `schema’ in the ER Model can be
represented pictorially (ER diagrams).
Can map an ER diagram into a relational schema.
ER Model Basics
ISOM
• Entity: Real-world object distinguishable from
other objects. An entity is described (in DB)
using a set of attributes.
• Entity Set: A collection of similar entities. E.g.,
all employees.



All entities in an entity set have the same set of
attributes. (Until we consider ISA hierarchies,
anyway!)
Each entity set has a key.
Each attribute has a domain.
ssn
name
Employees
dob
Alternative Entity Representations
ISOM
Employee
-----------------------SSN
Name
Dob
Employee
Employee
SSN
Name
Dob
SSN
Name
Dob
ER Model Basics (Contd.)
ISOM
• Relationship: Association among two or more entities. E.g.,
Attishoo works in Pharmacy department.
• Relationship Set: Collection of similar relationships.

An n-ary relationship set R relates n entity sets E1 ... En; each
relationship in R involves entities e1 E1, ..., en En
• Same entity set could participate in different relationship sets,
or in different “roles” in same set.
name
ssn
salary
since
name
ssn
dname
salary
Employees
did
Works_In
budget
Departments
Employees
supervisor
subordinate
Reports_To
Model this
ISOM
An auto repair shop provides services to vehicles brought in
by customers. A customer may own multiple vehicles that
they bring in for service. Each service request is assigned to
a technician. A service consists of different jobs that are
assigned fees. A service may need parts as well. The
customer is given an invoice with details on all the fees and
parts costs.
1.What should be modeled?
2.Which items should be modeled as entities?
3.Which items should be modeled as relationships?
4.Which items should be modeled as attributes?
A thumb rule to modeling
ISOM
• Major nouns become entities
• Minor nouns become attributes
• Verbs connecting major nouns
become relationships
Major nouns in our passage?
ISOM
Minor nouns in our passage?
ISOM
Verbs in our passage?
ISOM
ER model for our exercise
ISOM
Business Rules
ISOM
• A department must have one and
only one manager
• A manager may manage multiple
departments
• An employee works in only one
department
• A department (of course) has many
employees
Participation Constraints
ISOM
• Does every department have a manager?

If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
• Every did value in Departments table must appear in a
row of the Manages table (with a non-null ssn value!)
since
name
ssn
Employee
dname
salary
0,M
did
Manages
budget
1,1
Department
1,1
Works_In
since
1,M
Structural Constraints
ISOM
• Participation
 Do all entity instances participate in at least
one relationship instance?
• Cardinality
 How many relationship instances can an
entity instance participate in?
(min,max)
(min,max)
Participation
Cardinality
0 -- Partial
1 -- one
1 -- Total (Mandatory) M -- more than one
Understanding P/C constraints
ISOM
0:M
Employee
1:1
manages
works_in
1:1
Department
1:M
John
Mary
Susan
Accounting
Sales
Jack
Peter
Sally
Development
Many-Many relationships
ISOM
Student
0:M
takes
Course
0:M
John
Mary
Susan
MIS415
MIS215
Jack
Peter
Sally
MIS345
MIS490
Alternative Approaches
ISOM
• Arity approach
• Crow’s foot approach (as in book)
• Minmax approach
• For this class, use ONLY the
Participation-Cardinality
approach – this is what will be
used in assignments and exams
Back to our Auto Service Example
ISOM
• What are the participation/cardinality
constraints of the relationships?
• Owns • Assigned to • Consists of • Needs part –
• ?
Weak Entities
ISOM
• A weak entity can be identified uniquely only by
considering the primary key of another (owner) entity.


Owner entity set and weak entity set must participate in a oneto-many relationship set (one owner, many weak entities).
Weak entity set must have total participation in this identifying
relationship set.
name
salary
ssn
Employees
0:M
cost
Policy
pname
1:1
age
Dependents
Point to ponder
ISOM
• Is there a weak entity in the auto
service example?
ISA (`is a’) Hierarchies
ISOM
name
ssn
Employees
hourly_wages
in C++, or other
PLs, attributes are
inherited.
If we declare A ISA B,
every A entity is also
considered to be a B
contractid
entity.
lot
hours_worked
Hourly_Emps
As
Contract_Emps
• Overlap constraints: Can Joe be an Hourly_Emps as well as a
Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also have
to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
• Reasons for using ISA:
 To add descriptive attributes specific to a subclass.
 To identify entitities that participate in a relationship.
Stop and think
ISOM
• Is there an IS-A hierarchy in the auto
service example?
• What would it do to the design?
Conceptual Design Using the
ER Model
ISOM
• Design choices:



Should a concept be modeled as an entity or an attribute?
Should a concept be modeled as an entity or a relationship?
Identifying relationships: Binary or ternary? Aggregation?
• Constraints in the ER Model:


A lot of data semantics can (and should) be captured.
But some constraints cannot be captured in ER diagrams.
Entity vs. Attribute
ISOM
• Should address be an attribute of Employees or an
entity (connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
 If we have several addresses per employee,
address must be an entity (since attributes cannot
be set-valued).
 If the structure (city, street, etc.) is important, e.g.,
we want to retrieve employees in a given city,
address must be modeled as an entity (since
attribute values are atomic).
Converting model to design
ISOM
• Many-to-many relationships
 Each entity becomes a table
 The relationship becomes a table
 PKs of entities becomes FKs in the
relationship
StudentID
Name
Class
Major
Student
 Student(
 Course(
 Takes(
0:M
0:M
takes
Course
semester
)
)
)
Courseno
Coursename
Credits
Model to design (contd.)
ISOM
• 1-Many relationships
 Entities become tables
 Copy PK of multi-participant to single
participant
 Copy attributes of relationship to single
participant (why?)
ComputerID
Make
Model
Year
Computer
1:M
includes
installdate
0:1
Part
Partno
Type
Make
Model to design (contd.)
ISOM
• 1-1 relationships
 Entities can be merged, or
 copy PK of any entity to the other
• Generalization
 Copy PK of parent entity to child entity as FK,
as well as PK
• Weak entities
 Copy PK of controlling entity to weak entity
as FK as well as part of PK
Lets convert our autoservice
ISOM
Summary of Conceptual Design
ISOM
• Conceptual design follows requirements analysis,

Yields a high-level description of data to be stored
• ER model popular for conceptual design

Constructs are expressive, close to the way people think
about their applications.
• Basic constructs: entities, relationships, and
attributes (of entities and relationships).
• Some additional constructs: weak entities, ISA
hierarchies, and aggregation.
• Note: There are many variations on ER model.
Summary of ER (Contd.)
ISOM
• Several kinds of integrity constraints can be
expressed in the ER model: key constraints,
participation constraints, and
overlap/covering constraints for ISA
hierarchies. Some foreign key constraints are
also implicit in the definition of a relationship
set.


Some constraints (notably, functional
dependencies) cannot be expressed in the ER
model.
Constraints play an important role in determining
the best database design for an enterprise.
Summary of ER (Contd.)
ISOM
• ER design is subjective. There are often many ways
to model a given scenario! Analyzing alternatives can
be tricky, especially for a large enterprise. Common
choices include:
• Entity vs. attribute, entity vs. relationship, binary or nary relationship, whether or not to use ISA
hierarchies
• Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are
especially useful.
Class Exercise
ISOM
Design an ER Model for a hospital system, with the following case
description. Add other assumptions as needed.
The hospital database stores data about patients, their admission
and discharge from hospital’s departments and their treatments.
For each patient, we know the name, address, sex, social security
number. For each department we know the department’s name, its
location, the name of the doctor who heads it, the number of beds
available, and the number of beds occupied. A doctor may work in
several departments, but may only be the head in one department.
Each patient goes through multiple treatments during
hospitalization; for each treatment we store its name, duration and
the possible reactions to it that the patient may have. A treatment
may have one or more follow-up treatments.
Items to ponder:
What other constraints can we apply on this model?