Transcript Pclec05

Some Aspects of Database Design
DBDesign 5 / 1
5. Aims of Database Design
Reduce data redundancy.
Provide stable data structures that can be readily changed
with changing user requirements.
Allow users to make ad hoc requests for data.
Maintain complex relationships between data elements.
Support a large variety of decision needs.
DBDesign 5 / 2
A Practical Development Scenario
In 198n , a Company decided to develop and introduce a
Payroll system using database technology. It looked this this
Payroll System
Payroll
Data
DBDesign 5 / 3
A Practical Development Scenario
The Company grew in numbers and expertise and
decided to introduce a Personnel System. The ‘new’
design was this
Payroll System
System
Payroll
Payroll
Data
Personnel System
Personnel
Data
DBDesign 5 / 4
A Practical Development Scenario
In the next few years, these components were added
Payroll System
System
Payroll
Payroll
Data
Job History
Employee
Tracing Data
Personnel System
Personnel
Data
Labor Distribution
Labor
Analysis Data
DBDesign 5 / 5
A Practical Development Scenario
Notice the complex nature of related data
The design of the databases must be such
that data can be accessed :
Correctly
Completely
Clearly
Without Anomalies
Access = Insert, update, modify, delete and retrieve
DBDesign 5 / 6
Design Criteria
- data availability
- data reliability
- data currency
- data consistency
- data flexibility
- data efficiency
Verify these criteria are satisfied via technical review
DBDesign 5 / 7
End Product of DataBase Design
A database which will:
Accurately reflect the ‘real world’ data in all
required aspects
Be responsive to Management Information demands
Reflect Business Rules and Controls
Be capable of modification to meet changes
in Management needs - as in the previous example
Be an asset to the Organisation/Enterprise
DBDesign 5 / 8
Business Functions and Processes
• Business Functions :
Broad groups of closely related activities and decisions which
contribute to a product or service like cycle. (e.g.. planning,
materials management, production planning, quality
assurance).
• Business Processes : Decision related activities which occur
within a function. They are related to management of people,
money, material and information.
• Materials Management (Business Function)could be
subdivided into: requirements planning, purchasing, good
received, material accounting, stockkeeping
Business Processes should reflect related activity groupings
DBDesign 5 / 9
Business Activities
• Business Activities : Specific operations or transactions
required to carry out a process
Some guidelines:
An activity should produce some clearly defined (identifiable)
result - a product, a decision, a plan ......
An activity has clear boundaries - a clear beginning and end.
Activities do not overlap.
An activity is carried out as a unit, by a single agent or a
team
Once initiated, an activity proceeds independently of and
from other activities.
DBDesign 5 / 10
Business Rules and their Impact on Data
Business Rules are the defining force between application
behaviour and database content
Conditions/rules/behaviour/standards/requirements govern
what data
what domains
what processes
what relationships
what variations
may or can occur
Business rules provide a formal foundation for understanding
how a business operates
- with interesting implications for the database design
DBDesign 5 / 11
Business Rules
An example - an active employee must have a position - no
problems with this in modelling and understanding
The ‘position’ is optional (or non-allowable) for an employee
with a different status - terminated, retired, on leave
So is the relation between Employee and Position optional or
mandatory ? Or when does the relationship alter from
mandatory to optional ? Is there some ‘intelligence’ built into
the status attribute ?
DBDesign 5 / 12
Business Rules
Another one:
Automatic re-ordering in a retail business :
Rules exist to determine what levels of stock trigger off
automatic re-ordering for stock.
Is this a common rule for all stock levels ?
Are there ‘different’ rules for different items ?
Can the rules alter depending on
season
production or supply variables
competition
new lines
phasing out ‘old’ items ?
DBDesign 5 / 13
Business Rules
Some more thoughts about Business Rules:
1. They should be declarative
2. There must be only 1 interpretation
3. They must be atomic - it must be a single complete thought
4. Consistent - A set of rules must not be contradictory
5. Non-redundant - The rules must not contain rules stating the
same information
6. They must be stated in business terms
7. The rules must be stated by business people. They own the
rule and are the source of modification and deletion.
A reference: The Business Rule Book: Classifying, Defining and Modelling
Rules (Database research Group Inc.,1994)
DBDesign 5 / 14
Business Rules
In a project such as the one you are developing, the
Business Rules may seem trivial - balance ^< 0, names must
not be null, automatic date compliance ….
However in a business environment, the influence of Rules
has an impact on the data and processing of data in the
database from Conceptual stage, to implementation, to
operational use, and to termination or integration into a wider
database.
DBDesign 5 / 15
Business Rules
There are many Business Rules, and the database
designers (and the users) need to know much about them
Who is responsible for them ?
What caused the rules to be created ?
Why were they put in place ?
Have the rules changed over time ?
What is the inference of this change (s) on ‘legacy’ data ?
What business processes do they control ?
Where have they been (or will be) implemented in the
current application
DBDesign 5 / 16
Business Rules
Business rules need to be managed - and a Business Rule
database provides the means of doing this
The business rule statements can be captures and cross
referenced to the business environment
Additional detail on the Rule can be captured :
Business topic,
People, Departments … associated with the rule,
Which information systems and databases implement the
rule
DBDesign 5 / 17
Business Rules
Categories of Business Rule details
1. Basic : the actual business rule statement, its name and
any aliases, associated terms, rule management such as
versions, related change efforts, milestone dates during the
business rule’s life cycle
2. Business Information : Relevance with the business
(areas of application), primary focus and purpose, basis of
the rule. Also provides links the business rile to associated
reference material (policy manual, system documentation,
legislation - e.g. the ‘new’ privacy legislation).
DBDesign 5 / 18
Business Rules
3. Categories. The means by which the rule is classified enforcement level, type of rule, system aspect
4. Context Information : Cross references into business
systems, business workflows and the systems and
databases within the applications which support, or are
influenced by, the rule
5. Relationships with other Rules : Cross references 2
business rules - one rule may replace, conflict with, or is a
subset of another rule (wait for the ‘issue date’ of an
Insurance policy).
DBDesign 5 / 19
Business Rules
6. Parties to Rules : Identifies all parties involved with the
business rule and the roles (authorising agent, business
sponsor, enforcer, implementer, maintainer)
7. Evaluation : Use of metrics to quantify specific
characteristics about the rule, (volatility, confidence level,
sensitivity, effectiveness).
DBDesign 5 / 20
And What About the Data ?
DBDesign 5 / 21
Models
A Model is : an abstract representation a subject which looks
and behaves like all or part of the original
The model may be physical (a mock up of a container, space
shuttle, car ….)
A drawing, blueprint
It may be conceptual - mathematical model of weather
forecasting, behaviour of vehicles at speed, financial
projections
Modelling is the process of creating an abstract representation
of a subject.
DBDesign 5 / 22
Models
Models are used to
Studied more cheaply than the full scale process or subject
Time freeze activities of process - and rerun
Allow manipulation, modification, alteration of the subject
without interfering with the original subject
Some interesting examples :
turnstiles at the M.C.G and Docklands
the Tennis Stadium alterations (new roof, courts)
the City Link tunnel system
DBDesign 5 / 23
Process Modelling
Process modelling is directed at
formalising the construction of models
formalising the analysis and design of application models
The results are:
1. End-user needs are understood and documented
2. Critical issues are recognised
3. Development staff have relevant input to support the
successful completion of analysis, design and programming
steps
DBDesign 5 / 24
Data Modelling
Process Modelling reflects ‘Business Functions’ and their
Interdependencies
It does not clearly communicate the DATA which organisations
need and use
This is a database design requirement
4 reasons:
1. Processes are dynamic, data is typically static
2. Processes are more volatile than data
3. Database designers need different information than
application programmers
4. Data requires a data-oriented method of documentation
DBDesign 5 / 25
Process Modelling / Data Modelling
Process Characteristics
Dynamic (movement)
Data Characteristics
Static
Frequent definition changes
Definition changes rare
Principal use is in programming
Principal use is in database design
Documentation methods show
algorithms, movement of data,
flow of controls
Documentation methods show data
definitions and data
relationships
DBDesign 5 / 26
Data Modelling
The objectives of a Developers model are
1. To accurately and fully show facts about information and to
communicate and verify these facts with the user(s)
2. Separation of logical from physical leads to the recognition of
the data definitions an organisation requires. These are
isolated from changes in storage technology, and access
methods
3. Integration across applications (life cycle, creates, updates,
how and when data is used)
DBDesign 5 / 27
Advantages
1. Duplication of data is easier to detect
2. By separating data and use, data modelling reduces the
chance of small changes to data or processes causing major
alteration to applications and databases
3. Minimises the risk of Inconsistent processing - for example
the updating of a customer’s payment history without the
flow onto the customer’s credit status
Data modelling provides a secure basis for developers to
fully understand the dependencies of data
DBDesign 5 / 28
Terminology
A model : an abstract representation of a subject which looks or
behaves, in one or more ways, like all or part of the original
A data model : represents the definition, characterisation, and
relationships of data in a given environment
A logical data model : a model of the information used in an
organisation from the end-user perspective, without regard
for its functional or physical aspects
A database design model : a model configured to reflect data
usage in a particular physical environment
DBDesign 5 / 29
Terminology
Logical design : represents all of the phases which identify the
subject from the user perspective (planning , requirements
definition, analysis)
Physical design : represents the phases where the user’s view
of the application is converted into technical design
specifications
End users : commission the building of, and then use, the
information system
Technique : is a series of steps applied to a subject to alter its
representation. (e.g. data modelling, process modelling,
prototyping)
DBDesign 5 / 30
Terminology
Tool : a physical or conceptual construct which assists in the
application of techniques (CASE products, flow-charting
templates, DBMS, Powerpoint ….)
Methodology : is the approach made to apply one or more
techniques (e.g. Chen Entity-Relationship modelling)
There are others :
Nijssen’s Information Analysis Method (NIAM)
Semantic Object Modelling (SOM)
There is an article on these methodologies.
DBDesign 5 / 31
Terminology
The Entity-Relationship model reflects the real world which
consists of collection of basic objects.
These objects are called entities. This is an object which
exists in the real world and can be distinguished from other
objects ( such as bank accounts, persons, loans).
Entities are described by a set of attributes - such as an
account could be described by attributes such as account
number and balance.
Another entity (customer) could be described by the
attributes customer_name, address, customer_identity_
number
DBDesign 5 / 32
Terminology
A Relationship is an association among several entities (or in
a special case among attributes in a single entity (such as
the Telephone directory).
The set of all entities of the same type is known as an entity
set (what else ?)
and the set of all relationships of the same type is known as
a relationship set.
DBDesign 5 / 33
More Terminology
The Relational Model uses a collection of tables to represent
» Data
» The Relationships between the sets of data
Each table has at least 1 column (normally more than 1)
and each column has a unique name in its table
The relational model is an example of a record-based model.
The database is structured in fixed format records, generally
with many types
Each record defines a fixed number of attributes (fields). The
columns correspond to the attributes of the record type
DBDesign 5 / 34
We’re getting there
The diagram shows the relationship of a set of customers
and a set of accounts
customer
account
Although it’s a bit early, the customer set reflects the attributes
Customer_name, customer_address, and the customer_ID
The account set reflects (or could reflect) the attributes
Account_number, balance
DBDesign 5 / 35
Modelling
Here is some supporting data :
The customer table
customer_id customer_name
address
1831567
Johnson, R
19 Green Street Toorak
1723746
Norman, G
13 The Avenue Newport
1189345
Shomari, V
234 West Street Melton
The account table
Account number Balance
73145
900
45893
1200
36587
70
What is missing ?
We can’t tell which customer
owns which account
DBDesign 5 / 36
Modelling
• We need another table - depositor. This table allows us to
associate customers, accounts and account numbers.
The Depositor table
customer_id account_number
1831567
73145
1723746
45893
1189345
36587
This arrangement allows for more than one customer ‘owning’
an account
And yes, you’re correct - it’s not the only way of modelling the
sets
DBDesign 5 / 37
More Terms
In real life, modelling requires much analysis.
In the example shown, the address would be broken into
subparts - probably number, street/road, suburb, post code
The attribute name could be structured as first_name, initial,
last_name, and there would most likely be another attribute
title (Ms. Mrs. Dr. Mr. …..)
- these are known as composite attributes
DBDesign 5 / 38
Business Entities
Are persons, objects or events about which Information is,
or will be, recorded in the Information Data Base
Many of these Entities can be identified with Business
Activities (e.g. suppliers, purchase orders, customer)
And, we gain an insight into Business Rules, which
determine how data exists or co-exists with other data
or events.
DBDesign 5 / 39
Critical Success Factors
* Key factors which must be performed well to ensure the
success of an organisation
* Also known as Critical Performance Items
production failure rate < 0.01% of total production units
production cost increases <= c.p.i. increases
customer service complaints < 1% of all customer transactions
absenteeism < 1% of staff in any 24 hour period
product quality => advertised standards (water,power)
no more than 1% of trains > 3 mins late at destination
DBDesign 5 / 40
Database Design
- the process of developing database structures from
user requirements for data
Structured Methodology - a number of ordered formal
processes with known inputs and expected outputs
Objectives
1. derive relationships
2. evolve to meet user requirements
3. user requests are met within reasonable time limits
DBDesign 5 / 41
Database Design Context
System Analysis
Abstract LOGICAL from PHYSICAL
Real world data (data modelling)
Real world functions & behaviour (procedure modelling)
System Design
Transform new LOGICAL into PHYSICAL model
Design database, programs, screens etc.
Implement
Build new PHYSICAL model and install
DBDesign 5 / 42
The Stages of Database Design
Requirements Analysis
Conceptual Design
Physical Design
Implementation Design
DBDesign 5 / 43
Conceptual Design
- Create an enterprise model
- Normalise user views
- Integrate user views
- Create a database
– Review
DBDesign 5 / 44
Modelling User Requirements
DATA
Entity Relationship Modelling (Semantic)
Normalisation (Minimality)
PROCESS
Data Flow Analysis
Procedure Formation
Functional Analysis
DBDesign 5 / 45
Importance of Data
The structure of data in an information system is
fundamental to the effectiveness and scope of the system.
Data structures are more stable than the ways in which data
may be used within an organisation.
Data centered system development methods provide more
integrated information systems more quickly and efficiently
than process centered techniques.
DBDesign 5 / 46
Well Structured Data
ADVANTAGES
Better integration of applications through shared access to
common data.
More adaptable systems because files and databases will
follow the natural structures inherent in data rather than
application oriented relationships.
Processing should be simpler if data is held in its simplest
least redundant form.
Uniform application of Business Rules
DBDesign 5 / 47
Data Focussed Development
1. Develop broad data model and broad function model.
2. Partition data model by function.
3. Develop detailed data model by each functional partition.
4. Develop the detailed procedure models by functional
partition.
DBDesign 5 / 48
Entity Relationship Modelling
Represents major data objects and the relationships
between them.
Widely used commercially in Australia.
Originated by Chen in 1976 and adapted in the 'Information
Engineering' approach.
An Entity model is composed of textual descriptions of
objects and a graphical representation called an Entity
Relationship Diagram
DBDesign 5 / 49
Entity Relationship Modelling
Entity types form relationship types with one another
and each entity type may be described by one or
more attribute types.
Customer
ENTITY TYPE
Places
(Placed By)
Sales Order
RELATIONSHIP TYPE ATTRIBUTE TYPES
CUSTOMER
Any person or organisation who purchases goods
Customer Number, Customer Name, Customer Phone
DBDesign 5 / 50
Entity Type
In practice Entity Instance is often shortened to 'Instance'
and Entity Type is shortened to 'Entity'.
Entity names should be brief but unambiguous i.e. clearly
reflect the meaning .
Entities are often recognised as nouns or noun phrases.
Entities can model any object set of interest i.e. physical,
conceptual, events etc.
DBDesign 5 / 51
Relationship Types
A named association between two entity types.
Entity type
CUSTOMER
Places
Relationship Type
Entity Type
(Placed By)
SALES ORDER
DBDesign 5 / 52
Relationship Instances
A relationship instance is an association between two
entity instances.
CUSTOMER
SALES ORDER
Smith
Places
OrderNo 651
Jones
Places
OrderNo 375
A relationship type is a classification of relationship
instances.
Places
(Placed By)
CUSTOMER
SALES ORDER
DBDesign 5 / 53
Degree of Relationship
Binary
Binary
Employee
Order
1
1
is paid
contain
1
m
Salary
Part
1
Unary
Employee
m
Ternary
m
Computer
m
is a
leader/
member
Employee
runs on
p
n
n
is a
leader/
member
Op. System
Software
DBDesign 5 / 54
Connectivity of Relationship
One to One
One to Many
Many to Many
Employee
Customer
Supplier
Leads
Places
Supplies
(Lead by)
(Placed by)
(Supplied by)
Project
Sales Order
Item
Also known as Maximal Cardinalities
Relationships represent the 'Business Rules' of the
organisation.
DBDesign 5 / 55
Nature of Relationships
The entity participation in the relationship may be
mandatory or optional. (Minimal Cardinalities)
CUSTOMER
SALES ORDER
Places
(Placed By)
A Customer may place many Sales Orders.
A Sales Order must be placed by one Customer.
DBDesign 5 / 56
Logical Data Modelling
Modelling symbols - McFadden, Hoffer and Prescott (P 87)
Mandatory One
Mandatory Many - at least One
Optional One - No less than One
Optional Many - May be none
There is NO standard notation for E-R Modelling
DBDesign 5 / 57
Attribute Types
A data item or element that describes one entity type.
STUDENT RESULT
Studentno Unit Code
Year Semester
Grade
S1
CSE4230
2000
1
N
S1
CSE4230
2000
2
HD
PRIMARY KEY or IDENTIFIER
Some or all of the attributes describing an entity type
serve to uniquely identify each entity instance.
DBDesign 5 / 58
Value Sets or Domains
Each attribute type describes one entity type and is
based on an underlying domain or value set.
EMPLOYEE
Empno Emp_Gender
E1
Female
Emp_Name
Emp_Hire_Date
Jones
20/6/1997
DEPENDENT
Empno Depend_Name Depend_Gender
E1
Katherine
Female
Birthdate
23/6/1994
DBDesign 5 / 59
Synonyms or Aliases
Be careful to differentiate between things and their names.
Item = Part = Product ???
Synonym or Alias
The same object called different names by different people.
This problem concerns entity types, relationship types and
attribute types.
DBDesign 5 / 60
Homonyms
Different things called the same name by different people ??
Employee Start Date
In the Personnel area this may mean the date an employee
started with the company.
In a given department this may mean the date an employee
started with the department.
This problem also concerns entity types, relationships types
and attribute types.
DBDesign 5 / 61
Identifying Entity Types
Entity type names are generally nouns or noun phrases.
Can the entity type be fully defined in one or
two sentences ?
What are some typical attributes describing the entity type ?
Which attributes uniquely identify each entity instance ?
DBDesign 5 / 62
What are the Entities Here ?
ABC COMPANY SALES ORDER
Order No ____________
Order Date __ / __ / __
Customer Name
Customer Address______________________________
ItemNo Description
Qty-ord
Unit-Price
Ext-Price
DBDesign 5 / 63
Building an E-R Model
– List the major entity types in the system and note the
relationship types between them.
– Represent the entity types by named rectangles.
– Identify relationship types between entity types including
relationship degree and nature.
DBDesign 5 / 64
E-R Modelling
Database
Initial Study
Database Life Cycle
Processes and
database transactions
Data Analysis
User views and
Business Rules
Initial E-R Model
Verification
Attributes
Normalisation
Final
E-R Model
DBDesign 5 / 65
Conceptual Design Tools
Tools Available
to Designer
Entity-Relationship
Diagram
Normalisation
Data dictionary
Sources of Information
for the Designer
Conceptual
Model
Definition
and
Validation
Business rules
and data constraints
Data flow diagrams
Process functional
descriptions (user views)
(output generated by the
system analysis and design
phases)
DBDesign 5 / 66
Modelling Techniques
These are some of the methods used in data modelling
Network Model (Codasyl)
Relational Model
Entity-Set model
Entity-Relationship (based on Chen - many variations)
Smith’s Abstraction
Semantic Data Model
IDEFI
DBDesign 5 / 67
Modelling Techniques
Functional Data Model - (Dapter)
Data Designer (Automated Design Tool)
Design Manager (Automated support for describing data
models -- leads to database design ……….
And the ‘latest’ is the Unified Modelling Language (UML)
which is very much favoured with Object Oriented approach
to software development.
Terms such as Class, Object, Statechart, Activity, Sequence,
Collaboration, Use-case, Component and Deployment are
included in its use
DBDesign 5 / 68
Who is studying whom/what ?
DBDesign 5 / 75