Database Design

Introduction

The chapter will address the following questions:





What are the similarities and differences between conventional
files and modern, relational databases?
What are fields, records, files, and databases? What are some
examples of each?
What is a modern data architecture that includes files, operational
databases, data warehouses, personal databases, and work group
databases?
What are the similarities and differences between the roles of
systems analyst, data administrator, and database administrators as
they relate to databases?
What is the architecture of a database management system?
How does a relational database implement entities, attributes, and
relationships from a logical data model?
How do you normalize a logical data model to remove impurities
that can make a database unstable, inflexible, and non-scalable?
How do you transform a logical data model into a physical,
relational database schema?
How do you generate SQL code to create the database structures in
a schema?
Conventional Files Versus the Database

Introduction

All information systems create, read, update, and delete data. This
data is stored in files and databases.
▪ Files are collections of similar records.
▪ Databases are collections of interrelated files.
• The key word is interrelated.
• The records in each file must allow for relationships (think of them
as ‘pointers’) to the records in other files.


In the file environment, data storage is built around the
applications that will use the files.
In the database environment, applications will be built around the
integrated database.

[Figure: Conventional files versus the database. Information systems that each use their own files, compared with information systems that share a database of consolidated and integrated data from files.]

The Pros and Cons of Conventional Files


Pros:
▪ Conventional files are relatively easy to design and implement
because they are normally based on a single application or
information system.
▪ Historically, another advantage of conventional files has been
processing speed.
Cons:
▪ Duplication of data items in multiple files is normally cited as
the principal disadvantage of file-based systems.
▪ A significant disadvantage of files is their inflexibility and non-scalability.

As legacy file-based systems and applications become candidates
for reengineering, the trend is overwhelmingly in favor of
replacing file-based systems and applications with database
systems and applications.

The Pros and Cons of Database

Pros:
▪ The principal advantage of a database is the ability to share the
same data across multiple applications and systems.
▪ Database technology offers the advantage of storing data in
flexible formats.
▪ Databases allow the use of the data in ways not originally
specified by the end-users; this is known as data independence.
▪ The database scope can even be extended without impacting
existing programs that use it.
• New fields and record types can be added to the database without
affecting current programs.
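
The point about extending the database without affecting current programs can be sketched as follows. This is a minimal illustration that assumes SQLite (through Python's standard sqlite3 module) as the DBMS; the table and column names are borrowed from the schema figure later in the chapter, and the function customer_names is a hypothetical stand-in for an existing program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_no TEXT PRIMARY KEY, customer_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES ('10112', 'Luck Star')")


def customer_names(connection):
    """A stand-in for an existing program: it names only the fields it needs."""
    rows = connection.execute(
        "SELECT customer_name FROM customer ORDER BY customer_no"
    )
    return [name for (name,) in rows]


before = customer_names(conn)

# Extend the database scope: add a new field and a new record type (table).
conn.execute("ALTER TABLE customer ADD COLUMN customer_rating TEXT")
conn.execute(
    "CREATE TABLE product (product_no TEXT PRIMARY KEY, product_name TEXT)"
)

# The existing program runs unchanged and returns the same result.
after = customer_names(conn)
print(before, after)
```

The program keeps working because it asks for fields by name rather than depending on the physical layout of the record, which is the practical meaning of data independence here.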

Cons:
▪ Database technology is more complex than file technology.
• Special software, called a database management system (DBMS),
is required.
▪ A DBMS is still somewhat slower than file technology.
▪ Database technology requires a significant investment.
• The cost of developing databases is higher because analysts and
programmers must learn how to use the DBMS.
▪ In order to achieve the benefits of database technology, analysts
and database specialists must adhere to rigorous design
principles.
▪ Another potential problem with the database approach is the
increased vulnerability inherent in the use of shared data.


Database Design in Perspective




To fully exploit the advantages of database technology, a database
must be carefully designed.
The end product is called a database schema, a technical blueprint
of the database.
Database design translates the data models that were developed for
the system users during the definition phase into data structures
supported by the chosen database technology.
Subsequent to database design, system builders will construct
those data structures using the language and tools of the chosen
database technology.

[Figure: Information systems framework, focus on system data. System owners (scope; survey phase) see business subjects, for example: customers order zero, one, or more products, and products may be ordered by zero, one, or more customers. System users (requirements; study and definition phases) see data requirements, captured as data models of entities such as PRODUCT, CUSTOMER, and ORDER. System designers (specification; design phase) see the database schema, with typed and indexed fields such as customer_no [Alpha(10)] INDEX and balance_due [Real(5,2)]. System builders (components; implementation phase) see database programs, such as CREATE TABLE CUSTOMER and CREATE INDEX statements. The remaining columns of the framework focus on system processes, interfaces, and geography. Existing databases, applications, interfaces, networks, and technology sit beneath the framework, and the phases are labeled with the FAST methodology.]

Database Concepts for the Systems Analyst

Fields

Fields are common to both files and databases.
▪ A field is the implementation of a data attribute.
• Fields are the smallest unit of meaningful data to be stored in a file
or database.

There are four types of fields that can be stored: primary keys,
secondary keys, foreign keys, and descriptive fields.
▪ Primary keys are fields whose values identify one and only
one record in a file.
▪ Secondary keys are alternate identifiers for a database.
• A single file in a database may have only one primary key, but it
may have several secondary keys.
▪ Foreign keys are pointers to the records of a different file in a
database.
• Foreign keys are how the database ‘links’ the records of one type
to those of another type.
▪ Descriptive fields are any other fields that store business data.
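
The four field types can be sketched in a small schema. This is an illustration only, assuming SQLite through Python's sqlite3 module; the email column is a hypothetical secondary key (declared UNIQUE so that it acts as an alternate identifier), and customer_order is a hypothetical table name chosen because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE customer (
    customer_no   TEXT PRIMARY KEY,   -- primary key: identifies exactly one record
    email         TEXT UNIQUE,        -- secondary key (hypothetical alternate identifier)
    customer_name TEXT                -- descriptive field
);
CREATE TABLE customer_order (
    order_no    TEXT PRIMARY KEY,
    customer_no TEXT NOT NULL
                REFERENCES customer (customer_no)  -- foreign key: points to a customer
);
""")
conn.execute("INSERT INTO customer VALUES ('10112', 'luckstar@example.com', 'Luck Star')")
conn.execute("INSERT INTO customer_order VALUES ('A633', '10112')")


def try_insert(sql):
    """Return True if the DBMS accepts the statement, False if a key rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Each of these violates one kind of key and is rejected by the DBMS.
duplicate_primary = try_insert(
    "INSERT INTO customer VALUES ('10112', 'other@example.com', 'Other')")
duplicate_secondary = try_insert(
    "INSERT INTO customer VALUES ('10113', 'luckstar@example.com', 'Pemrose')")
dangling_foreign = try_insert(
    "INSERT INTO customer_order VALUES ('A634', '99999')")
print(duplicate_primary, duplicate_secondary, dangling_foreign)
```

The descriptive field customer_name carries no constraint of its own, which is what separates it from the three kinds of key.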

Records



Fields are organized into records.
Like fields, records are common to both files and databases.
▪ A record is a collection of fields arranged in a predefined
format.
During systems design, records will be classified as either fixed-length
or variable-length records.
▪ Most database systems impose a fixed-length record
structure, meaning that each record instance has the same
fields, same number of fields, and same logical size.
▪ Variable-length record structures allow different records in
the same file to have different lengths.
• Database systems typically disallow (or, at least, discourage)
variable-length records.

When a computer program ‘reads’ a record from a database, it
actually retrieves a group or block of records at a time.
▪ This approach minimizes the number of actual disk accesses.
▪ A blocking factor is the number of logical records included in
a single read or write operation (from the computer’s
perspective). A block is sometimes called a physical record.
▪ Today, the blocking factor is usually determined and optimized
by the chosen database technology, but a qualified database
expert may be allowed to fine tune that blocking factor for
performance.
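
The arithmetic behind a blocking factor can be sketched as follows. The block and record sizes are hypothetical, and the sketch assumes fixed-length records that are never split across blocks; a real DBMS may organize its pages differently.

```python
import math


def blocking_factor(block_size_bytes: int, record_size_bytes: int) -> int:
    """Whole logical records that fit in one block (physical record)."""
    return block_size_bytes // record_size_bytes


def block_reads(record_count: int, factor: int) -> int:
    """Disk accesses needed to read every record once, one block at a time."""
    return math.ceil(record_count / factor)


factor = blocking_factor(4096, 200)          # hypothetical 4096-byte block, 200-byte record
reads_blocked = block_reads(10_000, factor)  # reading 10,000 records block by block
reads_unblocked = block_reads(10_000, 1)     # one record per disk access, for comparison
print(factor, reads_blocked, reads_unblocked)
```

With these assumed sizes, each disk access delivers 20 records, so the file is read in 500 accesses instead of 10,000.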

Files and Tables



Similar records are organized into groups called files.
▪ A file is the set of all occurrences of a given record structure.
In database systems, a file corresponds to a set of similar records,
usually called a table.
▪ A table is the relational database equivalent of a file.
Some of the types of files and tables include:
▪ Master files or tables contain records that are relatively
permanent.
• Once a record has been added to a master file, it remains in the
system indefinitely.
• The values of fields for the record will change over its lifetime, but
the individual records are retained indefinitely.
▪ Transaction files or tables contain records that describe
business events.
• The data describing these events normally has a limited useful
lifetime.
• In information systems, transaction records are frequently retained
on-line for some period of time.
• Subsequent to their useful lifetime, they are archived off-line.
▪ Document files and tables contain stored copies of historical
data for easy retrieval and review without the overhead of regenerating the document.
▪ Archival files and tables contain master and transaction file
records that have been deleted from on-line storage.
• Records are rarely deleted; they are merely moved from on-line
storage to off-line storage.
• Archival requirements are dictated by government regulation and
the need for subsequent audit or analysis.
▪ Table look-up files contain relatively static data that can be
shared by applications to maintain consistency and improve
performance.
▪ Audit files are special records of updates to other files,
especially master and transaction files.
• They are used in conjunction with archive files to recover ‘lost’
data.
• Audit trails are typically built into better database technologies.

Databases



Databases provide for the technical implementation of entities and
relationships.
The history of information systems has led to one inescapable
conclusion:
▪ Data is a resource that must be controlled and managed!
Out of necessity, database technology was created so an
organization could maintain and use its data as an integrated whole
instead of as separate data files.

Data Architecture:
▪ A business’s data architecture comprises the files and
databases that store all of the organization’s data, the file and
database technology used to store the data, and the organization
structure set up to manage the data resource.
▪ Operational databases have been developed to support day-to-day
operations and business transaction processing for major
information systems.
▪ Many information systems shops hesitate to give end-users
access to operational databases, because the volume of
unscheduled reports and queries could overload the computers
and hamper business operations.
• To remedy that problem, data warehouses were developed.
– Data warehouses store data that is extracted from the
production databases and conventional files.
– Fourth-generation programming languages, query tools, and
decision support tools are then used to generate reports and
analyses from these data warehouses.
▪ Personal computer and local network database technology has
rapidly matured to allow end-users to develop personal and
departmental databases.
• These databases may contain unique data, or they may import data
from conventional files, operational databases, and/or data
warehouses.
▪ To manage the enterprise-wide data resource, a staff of database
specialists may be organized around the following
administrators:
• A data administrator is responsible for the data planning,
definition, architecture, and management.
• One or more database administrators are responsible for the
database technology, database design and construction,
security, backup and recovery, and performance tuning.

[Figure: A modern data architecture. Legacy file-based information systems, information systems built in-house or purchased with their operational databases and files, a data warehouse reached through end-user tools and applications, and personal and work-group databases, each shown with its users and programmers.]

Database Architecture:
▪ Database architecture refers to the database technology
including the database engine, database management utilities,
database CASE tools for analysis and design, and database
application development tools.
▪ The control center of a database architecture is its database
management system.
• A database management system (DBMS) is specialized
computer software available from computer vendors that is used to
create, access, control, and manage the database. The core of the
DBMS is often called its database engine. The engine responds to
specific commands to create database structures, and then to
create, read, update, and delete records in the database.
▪ A systems analyst, or database analyst, designs the structure of
the data in terms of record types, fields contained in those
record types, and relationships that exist between record types.
▪ These structures are defined to the database management
system using its data definition language.
• Data definition language (or DDL) is used by the DBMS to
physically establish those record types, fields, and structural
relationships. Additionally, the DDL defines views of the database.
Views restrict the portion of a database that may be used or
accessed by different users and programs. DDLs record the
definitions in a permanent data repository.
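
The role of a DDL can be sketched with SQL run through Python's sqlite3 module. This is an illustration under assumptions: the CUSTOMER fields follow the schema figure earlier in the chapter, the view name customer_directory and the rating value are hypothetical, and sqlite_master is SQLite's own catalog, standing in here for the permanent data repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- DDL: establish a record type and its fields.
CREATE TABLE customer (
    customer_no     TEXT PRIMARY KEY,
    customer_name   TEXT NOT NULL,
    customer_rating TEXT NOT NULL,
    balance_due     REAL
);
-- DDL: define a view that exposes only part of the record.
CREATE VIEW customer_directory AS
    SELECT customer_no, customer_name FROM customer;
""")
conn.execute("INSERT INTO customer VALUES ('10112', 'Luck Star', 'A', 1455.77)")

# A program restricted to the view sees only the columns the view exposes.
cursor = conn.execute("SELECT * FROM customer_directory")
view_columns = [column[0] for column in cursor.description]

# The DBMS records the definitions in its catalog.
catalog = dict(conn.execute("SELECT name, type FROM sqlite_master"))
print(view_columns, catalog["customer"], catalog["customer_directory"])
```

A user or program granted access only to customer_directory never sees customer_rating or balance_due, which is how a view restricts the usable portion of the database.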

[Figure: Database management system architecture. Systems analysts and/or database designers work through the data definition language (DDL); programmers work through the data manipulation language (DML), optionally via a host-based or internal transaction processing (TP) monitor; end users work through a proprietary data manipulation language and/or report writers. The DBMS manages the stored data and the metadata.]

▪ Some data dictionaries include formal, elaborate software that
helps database specialists track metadata (the data about the
data), such as record and field definitions, synonyms, data
relationships, validation rules, help messages, and so forth.
▪ The database management system also provides a data
manipulation language to access and use the database in
applications.
• A data manipulation language (or DML) is used to create, read,
update, and delete records in the database, and to navigate between
different records and types of records. The DBMS and DML hide
the details concerning how records are organized and allocated to
the disk.
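
The four DML operations can be sketched as follows, again assuming SQLite through Python's sqlite3 module. The product numbers and quantities come from the Products table shown later in the chapter; the table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    "product_no TEXT PRIMARY KEY, description TEXT, quantity_in_stock INTEGER)"
)

# Create: add records.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("77B12", "Widget", 8000), ("77F02", "Gadget", 2)],
)

# Read: fetch one field of one record by its primary key.
in_stock = conn.execute(
    "SELECT quantity_in_stock FROM product WHERE product_no = ?", ("77F02",)
).fetchone()[0]

# Update: change a field in place.
conn.execute(
    "UPDATE product SET quantity_in_stock = quantity_in_stock - 1 "
    "WHERE product_no = ?",
    ("77F02",),
)

# Delete: remove a record.
conn.execute("DELETE FROM product WHERE product_no = ?", ("77B12",))

remaining = conn.execute(
    "SELECT product_no, quantity_in_stock FROM product"
).fetchall()
print(in_stock, remaining)
```

Nothing in the program says where or how the records sit on disk; the DBMS and the DML hide those details.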
▪ Many DBMSs don’t require the use of a DDL to construct the
database, or a DML to access the database.
• They provide their own tools and commands to perform those
tasks. This is especially true of PC-based DBMSs.
▪ Many DBMSs also include proprietary report writing and
inquiry tools to allow users to access and format data without
directly using the DML.
▪ Some DBMSs include a transaction processing monitor (or
TP monitor) that manages on-line accesses to the database, and
ensures that transactions that impact multiple tables are fully
processed as a single unit.
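
Processing a multi-table transaction as a single unit can be sketched as follows. SQLite has no separate TP monitor, so this sketch assumes its built-in transactions (used here through Python's sqlite3 module) as a stand-in for the same all-or-nothing guarantee. The function place_order and the table layouts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_no        TEXT PRIMARY KEY,
    quantity_in_stock INTEGER CHECK (quantity_in_stock >= 0)
);
CREATE TABLE ordered_product (
    order_no TEXT, product_no TEXT, quantity_ordered INTEGER
);
INSERT INTO product VALUES ('77F02', 2);
""")


def place_order(order_no, product_no, quantity):
    """Record the order line and reduce stock together: both happen or neither."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "INSERT INTO ordered_product VALUES (?, ?, ?)",
                (order_no, product_no, quantity),
            )
            conn.execute(
                "UPDATE product SET quantity_in_stock = quantity_in_stock - ? "
                "WHERE product_no = ?",
                (quantity, product_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = place_order("A633", "77F02", 1)  # enough stock: both tables change
rejected = place_order("A634", "77F02", 5)  # stock would go negative: nothing changes

order_lines = conn.execute("SELECT COUNT(*) FROM ordered_product").fetchone()[0]
stock = conn.execute(
    "SELECT quantity_in_stock FROM product WHERE product_no = '77F02'"
).fetchone()[0]
print(accepted, rejected, order_lines, stock)
```

In the rejected order, the insert into ordered_product had already run when the stock update failed; the rollback removes it, so the two tables never disagree.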

Relational Database Management Systems:
▪ There are several types of database management systems and
they can be classified according to the way they structure
records.
▪ Early database management systems organized records in
hierarchies or networks implemented with indexes and linked
lists.
▪ Relational databases implement data in a series of tables that
are ‘related’ to one another via foreign keys.
• Files are seen as simple two-dimensional tables, also known as
relations.
• The rows are records.
• The columns correspond to fields.

[Figure: Data model with the entities Customer, Order, Ordered Product, and Product, connected by the relationships ‘places’, ‘sells’, and ‘sold on’.]

[Figure: The same data model implemented as relational tables.]

Customers Table
Customer Number   Customer Name       Customer Balance   …
10112             Luck Star           1455.77
10113             Pemrose             12.14
10114             Hartman             0.00
10117             K-Jack Industries   -20.00

Orders Table
Order Number   Customer Number (foreign key)   …
A633           10112
A634           10114
A635           10112

Ordered Products Table
Order Number (foreign key)   Product Number (foreign key)   Quantity Ordered   …
A633                         77F02                          1
A633                         77B12                          500
A634                         77B13                          100
A634                         77F01                          5
A635                         77B12                          300
A635                         77B15                          15

Products Table
Product Number   Product Description   Quantity in Stock   …
77B12            Widget                8000
77B13            Widget                0
77B15            Widget                52
77F01            Gadget                20
77F02            Gadget                2

▪ In most relational databases, both the DDL and the DML are provided by
SQL (which stands for Structured Query Language).
• SQL supports not only queries, but complete database creation and
maintenance.
• A fundamental characteristic of relational SQL is that commands
return ‘a set’ of records, not necessarily just a single record (as in
non-relational database and file technology).
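
The set-oriented nature of SQL can be sketched with the Customers and Orders data from the tables figure. This assumes SQLite through Python's sqlite3 module; the lower-case table and column names are illustrative renderings of the figure's headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_number TEXT PRIMARY KEY,
    customer_name   TEXT,
    balance         REAL
);
CREATE TABLE orders (
    order_number    TEXT PRIMARY KEY,
    customer_number TEXT REFERENCES customers (customer_number)
);
INSERT INTO customers VALUES
    ('10112', 'Luck Star', 1455.77),
    ('10113', 'Pemrose', 12.14),
    ('10114', 'Hartman', 0.00),
    ('10117', 'K-Jack Industries', -20.00);
INSERT INTO orders VALUES
    ('A633', '10112'), ('A634', '10114'), ('A635', '10112');
""")

# One command returns a set of records, joined through the foreign key.
rows = conn.execute("""
    SELECT o.order_number, c.customer_name
    FROM orders AS o
    JOIN customers AS c ON c.customer_number = o.customer_number
    ORDER BY o.order_number
""").fetchall()
print(rows)
```

The query names no record position and no access path; it describes the set wanted, and the DBMS follows the foreign key to assemble it.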
▪ High-end relational databases also extend the SQL language to
support triggers and stored procedures.
• Triggers are programs embedded within a table that are
automatically invoked by updates to another table.
• Stored procedures are programs embedded within a table that can
be called from an application program.
▪ Both triggers and stored procedures are reusable because they
are stored with the tables themselves.
• This eliminates the need for application programmers to create the
equivalent logic within each application that uses the tables.
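
A trigger can be sketched as follows, assuming SQLite through Python's sqlite3 module. SQLite supports triggers but not stored procedures, so only the former is shown. The trigger name reduce_stock is hypothetical, and the data follows the Products and Ordered Products tables shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_number    TEXT PRIMARY KEY,
    quantity_in_stock INTEGER
);
CREATE TABLE ordered_products (
    order_number TEXT, product_number TEXT, quantity_ordered INTEGER
);

-- Stored with the tables: runs automatically whenever an order line is added.
CREATE TRIGGER reduce_stock AFTER INSERT ON ordered_products
BEGIN
    UPDATE products
    SET quantity_in_stock = quantity_in_stock - NEW.quantity_ordered
    WHERE product_number = NEW.product_number;
END;

INSERT INTO products VALUES ('77B15', 52);
""")

# The application only records the order line; the trigger adjusts the stock.
conn.execute("INSERT INTO ordered_products VALUES ('A635', '77B15', 15)")

stock = conn.execute(
    "SELECT quantity_in_stock FROM products WHERE product_number = '77B15'"
).fetchone()[0]
print(stock)
```

Because the logic lives in the database, every application that inserts an order line gets the stock adjustment without repeating the code.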
Data Analysis for Database Design

What is a Good Data Model?



A good data model is simple.
▪ As a general rule, the data attributes that describe an entity
should describe only that entity.
A good data model is essentially non-redundant.
▪ This means that each data attribute, other than foreign keys,
describes at most one entity.
A good data model should be flexible and adaptable to future
needs.
▪ We should make the data models as application-independent as
possible to encourage database structures that can be extended
or modified without impact to current programs.

Data Analysis

The technique used to improve a data model in preparation for
database design is called data analysis.
▪ Data analysis is a process that prepares a data model for
implementation as a simple, non-redundant, flexible, and
adaptable database. The specific technique is called
normalization.
• Normalization is a technique that organizes data attributes such
that they are grouped together to form stable, flexible, and adaptive
entities.

Normalization is a three-step technique that places the data model
into first normal form, second normal form, and third normal form.
▪ An entity is in first normal form (1NF) if there are no
attributes that can have more than one value for a single
instance of the entity.
▪ An entity is in second normal form (2NF) if it is already in
1NF, and if the values of all non-primary key attributes are
dependent on the full primary key, not just part of it.
▪ An entity is in third normal form (3NF) if it is already in 2NF,
and if the values of its non-primary key attributes are not
dependent on any other non-primary key attributes.
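
The first of the three steps can be sketched in plain Python. This is an illustration under assumptions: the order below is a hypothetical unnormalized record whose ordered products form a repeating group, and the function to_first_normal_form is not part of any tool named in this chapter.

```python
# An unnormalized order: one instance holds many values of the same attributes.
unnormalized_order = {
    "order_number": "A633",
    "customer_number": "10112",
    "ordered_products": [
        {"product_number": "77F02", "quantity_ordered": 1},
        {"product_number": "77B12", "quantity_ordered": 500},
    ],
}


def to_first_normal_form(order):
    """Move the repeating group into its own rows, keyed by order and product."""
    order_row = {
        name: value for name, value in order.items() if name != "ordered_products"
    }
    ordered_product_rows = [
        {"order_number": order["order_number"], **line}
        for line in order["ordered_products"]
    ]
    return order_row, ordered_product_rows


order_row, ordered_product_rows = to_first_normal_form(unnormalized_order)
print(order_row)
print(ordered_product_rows)
```

After the split, every attribute holds a single value per instance, and the new entity carries a concatenated key (order number plus product number), which is what the 2NF step examines next.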

Normalization Example

First Normal Form:
▪ The first step in data analysis is to place each entity into 1NF.

[Figure: Unnormalized logical data model. Relationship labels in the diagram: sold, placed, enrolls in, sponsors, generates, features, is a. Repeating attributes are written as min { Attribute } max.]

PRODUCT
  Key data: Product-Number (PK1); Universal-Product-Code (PK2)
  Non-key data: Quantity-in-Stock; Product-Type; Suggested-Retail-Price; Club-Default-Unit-Price; Current-Special-Unit-Price; Current-Month-Units-Sold; Current-Year-Units-Sold; Total-Lifetime-Units-Sold

MEMBER ORDER
  Key data: Order-Number (PK)
  Non-key data: Order-Creation-Date; Order-Automatic-Fill-Date; Member-Number (FK1); Member-Name; Member-Address; Shipping-Address; Shipping-Instructions; Club-Name (FK2); Promotion-Number (FK2); 0 { Ordered-Product-Description } n; 0 { Ordered-Product-Title } n; 1 { Quantity-Ordered } n; 1 { Purchased-Unit-Price } n; 1 { Extended-Price } n; Order-Sub-Total-Cost; Order-Sales-Tax; Ship-Via-Method; Shipping-Charge; Order-Status; Prepaid-Amount; Method-of-Payment

MEMBER
  Key data: Member-Number (PK1)
  Non-key data: Member-Name; Member-Status; Member-Street-Address; Member-Daytime-Phone-Number; Date-of-Last-Order; Member-Balance-Due; Member-Bonus-Balance-Available; Member-Credit-Card-Information; 1 { Club-Name } n; 1 { Agreement-Number } n; 1 { Taste-Code } n; 1 { Media-Preference } n; 1 { Date-Enrolled } n; 1 { Expiration-Date } n; 1 { Number-of-Credits-Required } n; 1 { Number-of-Credits-Earned } n

CLUB
  Key data: Club-Name (PK)
  Non-key data: Club-Description; Club-Charter-Date; 1 { Agreement-Number } n; 1 { Agreement-Active-Date } n; 1 { Agreement-Expiration-Date } n; 1 { Obligation-Period } n; 1 { Required-Number-of-Credits } n; 1 { Bonus-Credits-After-Obligation } n

MERCHANDISE
  Key data: Product-Number (PK1); Universal-Product-Code (PK1)
  Non-key data: Merchandise-Name; Merchandise-Description; Merchandise-Size; Merchandise-Color; Unit-of-Measure

TITLE
  Key data: Product-Number (PK1); Universal-Product-Code (PK2)
  Non-key data: Title-of-Work; Title-Cover; Catalog-Description; Copyright-Date; Entertainment-Category; Credit-Value

AUDIO TITLE
  Key data: Product-Number (PK1); Universal-Product-Code (PK1)
  Non-key data: Artist; Audio-Category; Audio-Sub-Category; Number-of-Units-in-Package; Audio-Media-Code; Content-Advisory-Code

VIDEO TITLE
  Key data: Product-Number (PK1); Universal-Product-Code (PK1)
  Non-key data: Producer; Director; Video-Category; Video-Sub-Category; Closed-Captioned; Language; Running-Time; Video-Media-Type; Video-Encoding; Screen-Aspect; MPA-Rating-Code

GAME TITLE
  Key data: Product-Number (PK1); Universal-Product-Code (PK1)
  Non-key data: Manufacturer; Game-Category; Game-Sub-Category; Game-Platform; Game-Media-Type; Number-of-Players; Parent-Advisory-Code

PROMOTION
  Key data: Club-Name (PK1); Promotion-Number (PK1)
  Non-key data: Product-Number (FK1); Promotion-Release-Date; Promotion-Status; Promotion-Type; Automatic-Fill-Delay

[Figure: Placing MEMBER ORDER into 1NF. Relationship labels in the diagram: sells, sold as.]

MEMBER ORDER (unnormalized)
  Key data: Order-Number (PK)
  Non-key data: Order-Creation-Date; Order-Automatic-Fill-Date; Member-Number (FK1); Member-Name; Member-Address; Shipping-Address; Shipping-Instructions; Club-Name (FK2); Promotion-Number (FK2); 0 { Ordered-Product-Description } n; 0 { Ordered-Product-Title } n; 1 { Quantity-Ordered } n; 1 { Purchased-Unit-Price } n; 1 { Extended-Price } n; Order-Sub-Total-Cost; Order-Sales-Tax; Ship-Via-Method; Shipping-Charge; Order-Status; Prepaid-Amount; Method-of-Payment

CORRECTION

MEMBER ORDER (1NF)
  Key data: Order-Number (PK)
  Non-key data: Order-Creation-Date; Order-Automatic-Fill-Date; Member-Number (FK1); Member-Name; Member-Address; Shipping-Address; Shipping-Instructions; Club-Name (FK2); Order-Sub-Total-Cost; Order-Sales-Tax; Ship-Via-Method; Shipping-Charge; Order-Status; Prepaid-Amount

MEMBER ORDERED PRODUCT (1NF)
  Key data: Member-Number (PK1) (FK); Product-Number (PK1) (FK)
  Non-key data: Ordered-Product-Description; Ordered-Product-Title; Quantity-Ordered; Purchased-Unit-Price; Extended-Price

PRODUCT (1NF)
  Key data: Product-Number (PK1); Universal-Product-Code (PK2)
  Non-key data: Quantity-in-Stock; Product-Type; Suggested-Retail-Price; Club-Default-Unit-Price; Current-Special-Unit-Price; Current-Month-Units-Sold; Current-Year-Units-Sold; Total-Lifetime-Units-Sold

[Figure: Placing CLUB into 1NF. Relationship label in the diagram: establishes.]

CLUB (unnormalized)
  Key data: Club-Name (PK)
  Non-key data: Club-Description; Club-Charter-Date; 1 { Agreement-Number } n; 1 { Agreement-Active-Date } n; 1 { Agreement-Expiration-Date } n; 1 { Obligation-Period } n; 1 { Required-Number-of-Credits } n; 1 { Bonus-Credits-After-Obligation } n

CORRECTION

CLUB (1NF)
  Key data: Club-Name (PK)
  Non-key data: Club-Description; Club-Charter-Date

AGREEMENT (1NF)
  Key data: Club-Name (PK1) (FK); Agreement-Number (PK1)
  Non-key data: Agreement-Active-Date; Agreement-Expiration-Date; Obligation-Period; Required-Number-of-Credits; Bonus-Credits-After-Obligation

[Figure: Placing MEMBER into 1NF. Relationship labels in the diagram: enrolls in, binds, establishes, sponsors.]

MEMBER (unnormalized)
  Key data: Member-Number (PK1)
  Non-key data: Member-Name; Member-Status; Member-Address; Member-Daytime-Phone-Number; Date-of-Last-Order; Member-Balance-Due; Member-Bonus-Balance-Available; Member-Credit-Card-Information; 1 { Club-Name } n; 1 { Agreement-Number } n; 1 { Taste-Code } n; 1 { Media-Preference } n; 1 { Date-Enrolled } n; 1 { Expiration-Date } n; 1 { Number-of-Credits-Required } n; 1 { Number-of-Credits-Earned } n

CORRECTION

MEMBER (1NF)
  Key data: Member-Number (PK1)
  Non-key data: Member-Name; Member-Status; Member-Street-Address; Member-Daytime-Phone-Number; Date-of-Last-Order; Member-Balance-Due; Member-Bonus-Balance-Available; Member-Credit-Card-Information

CLUB MEMBERSHIP (1NF)
  Key data: Member-Number (PK1) (FK); Club-Name (PK1) (FK); Agreement-Number (PK1) (FK)
  Non-key data: Taste-Code; Media-Preference; Date-Enrolled; Expiration-Date; Number-of-Credits-Required; Number-of-Credits-Earned

AGREEMENT (1NF)
  Key data: Club-Name (PK1) (FK); Agreement-Number (PK1)
  Non-key data: Agreement-Active-Date; Agreement-Expiration-Date; Obligation-Period; Required-Number-of-Credits; Bonus-Credits-After-Obligation

CLUB (1NF)
  Key data: Club-Name (PK)
  Non-key data: Club-Description; Club-Charter-Date

Second Normal Form:
▪ The next step of data analysis is to place the entities into 2NF.
• It is assumed that you have already placed all entities into 1NF.
• 2NF looks for an anomaly called a partial dependency, meaning an
attribute(s) whose value is determined by only part of the primary
key.
• Entities that have a single attribute primary key are already in 2NF.
• Only those entities that have a concatenated key need to be
checked.
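
The search for partial dependencies can be sketched in plain Python. This is an illustration under assumptions: the sample rows combine the Ordered Products and Products data from the tables figure, the function names are hypothetical, and sample data can only suggest a dependency, since the real rule comes from the business.

```python
def determines(rows, determinant, attribute):
    """True if, in these rows, each determinant value maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


def partial_dependencies(rows, primary_key, attributes):
    """Non-key attributes that one part of a concatenated key determines alone."""
    if len(primary_key) < 2:
        return []  # a single-attribute key cannot have a partial dependency
    return [
        (part, attribute)
        for attribute in attributes
        for part in primary_key
        if determines(rows, [part], attribute)
    ]


ordered_products = [
    {"order_number": "A633", "product_number": "77F02",
     "product_description": "Gadget", "quantity_ordered": 1},
    {"order_number": "A633", "product_number": "77B12",
     "product_description": "Widget", "quantity_ordered": 500},
    {"order_number": "A634", "product_number": "77B13",
     "product_description": "Widget", "quantity_ordered": 100},
    {"order_number": "A635", "product_number": "77B12",
     "product_description": "Widget", "quantity_ordered": 300},
]

found = partial_dependencies(
    ordered_products,
    primary_key=["order_number", "product_number"],
    attributes=["product_description", "quantity_ordered"],
)
print(found)
```

The description depends on the product number alone, so it belongs with the product entity; the quantity depends on the full key and stays where it is.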

[Figure: Placing MEMBER ORDERED PRODUCT into 2NF. Relationship labels in the diagram: sold as, is a.]

MEMBER ORDERED PRODUCT (1NF)
  Key data: Member-Number (PK1) (FK); Product-Number (PK1) (FK)
  Non-key data: Ordered-Product-Description; Ordered-Product-Title; Quantity-Ordered; Purchased-Unit-Price; Extended-Price

CORRECTION

MEMBER ORDERED PRODUCT (2NF)
  Key data: Member-Number (PK1) (FK); Product-Number (PK1) (FK)
  Non-key data: Quantity-Ordered; Purchased-Unit-Price; Extended-Price

PRODUCT (2NF)
  Key data: Product-Number (PK1); Universal-Product-Code (PK2)
  Non-key data: Quantity-in-Stock; Product-Type; Suggested-Retail-Price; Club-Default-Unit-Price; Current-Special-Unit-Price; Current-Month-Units-Sold; Current-Year-Units-Sold; Total-Lifetime-Units-Sold

MERCHANDISE (2NF)
  Key data: Product-Number (PK1); Universal-Product-Code (PK1)
  Non-key data: Merchandise-Name; Merchandise-Description; Merchandise-Size; Merchandise-Color; Unit-of-Measure

TITLE (2NF)
  Key data: Product-Number (PK1); Universal-Product-Code (PK2)
  Non-key data: Title-of-Work; Title-Cover; Catalog-Description; Copyright-Date; Entertainment-Category; Credit-Value

Third Normal Form:
▪ Entities are assumed to be in 2NF before beginning 3NF
analysis.
▪ Third normal form analysis looks for two types of problems,
derived data and transitive dependencies.
• In both cases, the fundamental error is that non key attributes are
dependent on other non key attributes.
• Derived attributes are those whose values can either be calculated
from other attributes, or derived through logic from the values of
other attributes.
• A transitive dependency exists when a non-key attribute is
dependent on another non-key attribute (other than by derivation).
– This error usually indicates that an undiscovered entity is still
embedded within the problem entity.
• Transitive analysis is only performed on those entities that do not
have a concatenated key.

“An entity is said to be in third normal form if every non-primary-key attribute is dependent on the primary key, the
whole primary key, and nothing but the primary key.”
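The derived-data rule can be sketched with Python's built-in sqlite3 module standing in for the DBMS. The table and column names are adapted from the MEMBER ORDERED PRODUCT entity, and the sample row is invented for illustration. Extended-Price is treated as derived data: it is not stored, and is calculated from Quantity-Ordered and Purchased-Unit-Price whenever it is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member_ordered_product (
        member_number        INTEGER NOT NULL,
        product_number       INTEGER NOT NULL,
        quantity_ordered     INTEGER NOT NULL,
        purchased_unit_price REAL    NOT NULL,
        PRIMARY KEY (member_number, product_number)
    )
""")
# Invented sample row: 3 units at a unit price of 10.0.
conn.execute("INSERT INTO member_ordered_product VALUES (1001, 42, 3, 10.0)")

def extended_prices(connection):
    """Return (product_number, extended_price) rows, deriving the price at query time."""
    return connection.execute("""
        SELECT product_number,
               quantity_ordered * purchased_unit_price AS extended_price
        FROM member_ordered_product
        ORDER BY product_number
    """).fetchall()

print(extended_prices(conn))
```

Because the value is never stored, it cannot drift out of step with the quantity or unit price it depends on.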
MEMBER ORDERED PRODUCT (2NF)
---------------Key Data-----------------
Member-Number (PK1) (FK)
Product-Number (PK1) (FK)
-------------Non-Key Data------------
Quantity-Ordered
Purchased-Unit-Price
Extended-Price
CORRECTION
MEMBER ORDERED PRODUCT (3NF)
---------------Key Data-----------------
Member-Number (PK1) (FK)
Product-Number (PK1) (FK)
-------------Non-Key Data------------
Quantity-Ordered
Purchased-Unit-Price
MEMBER (3NF)
---------------------Key Data---------------------
Member-Number (PK1)
------------------Non-Key Data------------------
Member-Name
Member-Status
Member-Street-Address
Member-Daytime-Phone-Number
Date-of-Last-Order
Member-Balance-Due
Member-Bonus-Balance-Available
Member-Credit-Card-Information
placed
MEMBER ORDER (2NF)
------------------Key Data--------------------
Order-Number (PK)
----------------Non-Key Data----------------
Order-Creation-Date
Order-Automatic-Fill-Date
Member-Number (FK1)
Member-Name
Member-Address
Shipping-Address
Shipping-Instructions
Club-Name (FK2)
Order-Sub-Total-Cost
Order-Sales-Tax
Ship-Via-Method
Shipping-Charge
Order-Status
Prepaid-Amount
CORRECTION
MEMBER ORDER (3NF)
------------------Key Data--------------------
Order-Number (PK)
----------------Non-Key Data----------------
Order-Creation-Date
Order-Automatic-Fill-Date
Member-Number (FK1)
Shipping-Address
Shipping-Instructions
Club-Name (FK2)
Order-Sub-Total-Cost
Order-Sales-Tax
Ship-Via-Method
Shipping-Charge
Order-Status
Prepaid-Amount
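A minimal sketch of why the transitive dependency matters, again using sqlite3 as a stand-in DBMS. The tables are cut-down versions of MEMBER and MEMBER ORDER, and the member name and order values are invented. The member's name is held only in MEMBER; an order reaches it through the Member-Number foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (
        member_number INTEGER PRIMARY KEY,
        member_name   TEXT NOT NULL
    );
    CREATE TABLE member_order (
        order_number  INTEGER PRIMARY KEY,
        member_number INTEGER NOT NULL REFERENCES member (member_number),
        order_status  TEXT NOT NULL
    );
    INSERT INTO member VALUES (1001, 'Pat Example');
    INSERT INTO member_order VALUES (5001, 1001, 'OPEN');
    INSERT INTO member_order VALUES (5002, 1001, 'SHIPPED');
""")

def orders_with_member_name(connection):
    """Look up each order's member name through the foreign key."""
    return connection.execute("""
        SELECT o.order_number, m.member_name
        FROM member_order AS o
        JOIN member AS m ON m.member_number = o.member_number
        ORDER BY o.order_number
    """).fetchall()

# A name change touches one row in MEMBER, and every order sees it.
conn.execute("UPDATE member SET member_name = 'Pat Renamed' WHERE member_number = 1001")
print(orders_with_member_name(conn))
```

Had Member-Name been copied into every order record, the same change would have required an update to each of that member's orders.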
Data Analysis for Database Design

Normalization Example

Simplification by Inspection:
 When several analysts work on a common application, it is not
unusual to create problems that won’t be taken care of by
normalization.
• These problems are best solved through simplification by
inspection, a process wherein a data entity in 3NF is further
simplified by such efforts as addressing subtle data redundancy.
Data Analysis for Database Design

Normalization Example

CASE Support for Normalization:
 Most CASE tools can only normalize to first normal form.
• They accomplish this in one of two ways.
– They look for many-to-many relationships and resolve those
relationships into associative entities.
– They look for attributes specifically described as having
multiple values for a single entity instance.

It is exceedingly difficult for a CASE tool to identify second
and third normal form errors.
• That would require the CASE tool to have the intelligence to
recognize partial and transitive dependencies.
File Design

Introduction




Most fundamental entities from the data model would be designed
as master or transaction records.
 Master files typically contain fixed-length records.
Associative entities from the data model are typically joined into
the transaction records to form variable-length records (based on
the one-to-many relationships).
Other types of files (not represented in the data model) are added
as necessary.
Two important considerations of file design are file access and
file organization.
 The systems analyst usually studies how each program will
access the records in the file (sequentially or randomly), and
then selects an appropriate file organization.
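The difference between sequential and random access to fixed-length records can be sketched as follows. The record layout (a 4-byte product number, a 20-byte name, and a 4-byte quantity) and the sample data are assumptions made for illustration, and an in-memory byte stream stands in for a file on disk.

```python
import io
import struct

# Hypothetical fixed-length master record: product number, name, quantity in stock.
RECORD = struct.Struct("<i20si")   # 4 + 20 + 4 = 28 bytes per record

def unpack(chunk):
    """Decode one record, stripping the NUL padding from the name field."""
    number, raw_name, quantity = RECORD.unpack(chunk)
    return number, raw_name.rstrip(b"\x00").decode("ascii"), quantity

def write_records(stream, rows):
    for number, name, quantity in rows:
        stream.write(RECORD.pack(number, name.encode("ascii"), quantity))

def read_sequential(stream):
    """Sequential access: read every record from the start of the file."""
    stream.seek(0)
    rows = []
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        rows.append(unpack(chunk))
    return rows

def read_random(stream, index):
    """Random access: jump straight to record `index` by computing its offset."""
    stream.seek(index * RECORD.size)
    return unpack(stream.read(RECORD.size))

master = io.BytesIO()   # stands in for a master file on disk
write_records(master, [(101, "Compact Disc", 12), (102, "Cassette", 4), (103, "Poster", 30)])
third = read_random(master, 2)
```

Random access by offset works only because every record has the same length; variable-length transaction records would need an index or a sequential scan instead.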
Database Design

Introduction


The design of any database will usually involve the DBA and
database staff.
 They will handle the technical details and cross-application
issues.
It is useful for the systems analyst to understand the basic design
principles for relational databases.
Database Design

Goals and Prerequisites to Database Design

The goals of database design are as follows:
 A database should provide for the efficient storage, update, and
retrieval of data.
 A database should be reliable – the stored data should have high
integrity to promote user trust in that data.
 A database should be adaptable and scalable to new and
unforeseen requirements and applications.
Database Design

Goals and Prerequisites to Database Design


The data model may have to be divided into multiple data models
to reflect database distribution and database replication decisions.
 Data distribution refers to the placement of specific tables,
records, and/or fields in different physical databases.
 Data replication refers to the duplication of specific tables,
records, and/or fields to multiple physical databases.
Each sub-model or view should reflect the data to be stored on a
single server.
Database Design

The Database Schema

The design of a database is depicted as a special model called a
database schema.
 A database schema is the physical model or blueprint for a
database. It represents the technical implementation of the
logical data model.
 A relational database schema defines the database structure in
terms of tables, keys, indexes, and integrity rules.
 A database schema specifies details based on the capabilities,
terminology, and constraints of the chosen database
management system.
Database Design

The Database Schema

Rules and guidelines for transforming the logical data model into a
physical relational database schema:
1 Each fundamental, associative, and weak entity is implemented
as a separate table.
• The primary key is identified as such and implemented as an index
into the table.
• Each secondary key is implemented as its own index into the table.
• Each foreign key will be implemented as such.
• Attributes will be implemented with fields.
– These fields correspond to columns in the table.
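Rule 1 can be sketched in SQL data definition language, run here through Python's sqlite3 module as a stand-in for the chosen DBMS. The tables are cut-down versions of MEMBER and MEMBER ORDER; treating Member-Name as a secondary key, and the index name `idx_member_name`, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each entity becomes its own table; each attribute becomes a column.
    CREATE TABLE member (
        member_number INTEGER PRIMARY KEY,   -- primary key, identified as such
        member_name   TEXT NOT NULL
    );
    CREATE TABLE member_order (
        order_number        INTEGER PRIMARY KEY,
        order_creation_date TEXT NOT NULL,
        member_number       INTEGER NOT NULL,
        -- The foreign key is implemented as such.
        FOREIGN KEY (member_number) REFERENCES member (member_number)
    );
    -- A secondary key gets its own index into the table.
    CREATE INDEX idx_member_name ON member (member_name);
""")

def index_names(connection, table):
    """List the explicitly created indexes that SQLite reports for a table."""
    rows = connection.execute(f"PRAGMA index_list({table})").fetchall()
    return sorted(row[1] for row in rows)

foreign_keys = conn.execute("PRAGMA foreign_key_list(member_order)").fetchall()
print(index_names(conn, "member"), foreign_keys)
```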
Database Design

The Database Schema

Rules and guidelines for transforming the logical data model into a
physical relational database schema: (continued)
• The following technical details must usually be specified for each
attribute.
– Data type. Each DBMS supports different data types, and terms for
those data types.
– Size of the Field. Different DBMSs express precision of real numbers
differently.
– NULL or NOT NULL. Must the field have a value before the record
can be committed to storage?
– Domains. Many DBMSs can automatically edit data to ensure that
fields contain legal data.
– Default. Many DBMSs allow a default value to be automatically set in
the event that a user or programmer submits a record without a value.
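The per-attribute details above might be specified as follows. This is a sketch in SQLite's dialect via Python's sqlite3 module; the column choices, the list of legal status values, and the defaults are invented for illustration. Field sizes are omitted because SQLite does not enforce declared lengths, whereas other DBMSs do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member_order (
        order_number          INTEGER PRIMARY KEY,               -- data type
        order_status          TEXT NOT NULL DEFAULT 'OPEN'       -- NOT NULL, default
                              CHECK (order_status IN ('OPEN', 'SHIPPED', 'CANCELLED')),
        shipping_charge       NUMERIC NOT NULL DEFAULT 0
                              CHECK (shipping_charge >= 0),      -- domain rule
        shipping_instructions TEXT                               -- NULL allowed
    )
""")

def try_insert(connection, sql, params=()):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        connection.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Defaults fill in the fields that the insert leaves out.
try_insert(conn, "INSERT INTO member_order (order_number) VALUES (1)")
# Domain rule: 'LOST' is not a legal status, so the row is rejected.
rejected = not try_insert(
    conn, "INSERT INTO member_order (order_number, order_status) VALUES (2, 'LOST')")
```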
Database Design

The Database Schema

Rules and guidelines for transforming the logical data model into a
physical relational database schema: (continued)
2 Supertype/subtype entities present additional options as
follows:
• Most CASE tools do not currently support object-like constructs
such as supertypes and subtypes.
• Most CASE tools default to creating a separate table for each
entity supertype and subtype.
• If the subtypes are of similar size and data content, a database
administrator may elect to collapse the subtypes into the supertype
to create a single table.
3 Evaluate and specify referential integrity constraints.
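The two supertype/subtype options in rule 2 can be compared side by side. This sketch uses sqlite3 with cut-down PRODUCT and TITLE tables; the `product_collapsed` table, its `product_type` discriminator column, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option A: a separate table for the supertype and each subtype,
    -- sharing the same primary key.
    CREATE TABLE product (
        product_number    INTEGER PRIMARY KEY,
        quantity_in_stock INTEGER NOT NULL
    );
    CREATE TABLE title (
        product_number INTEGER PRIMARY KEY REFERENCES product (product_number),
        title_of_work  TEXT NOT NULL
    );
    -- Option B: subtypes collapsed into the supertype. Columns that belong
    -- to only one subtype must then allow NULL.
    CREATE TABLE product_collapsed (
        product_number    INTEGER PRIMARY KEY,
        product_type      TEXT NOT NULL CHECK (product_type IN ('TITLE', 'MERCHANDISE')),
        quantity_in_stock INTEGER NOT NULL,
        title_of_work     TEXT,
        merchandise_name  TEXT
    );
    INSERT INTO product VALUES (42, 7);
    INSERT INTO title VALUES (42, 'Example Album');
    INSERT INTO product_collapsed VALUES (42, 'TITLE', 7, 'Example Album', NULL);
""")

# Option A needs a join to assemble a complete title record.
option_a = conn.execute("""
    SELECT p.product_number, t.title_of_work, p.quantity_in_stock
    FROM product AS p
    JOIN title AS t ON t.product_number = p.product_number
""").fetchall()

# Option B reads one table and filters on the discriminator.
option_b = conn.execute("""
    SELECT product_number, title_of_work, quantity_in_stock
    FROM product_collapsed
    WHERE product_type = 'TITLE'
""").fetchall()
```

Both queries return the same record; the trade-off is an extra join in option A against NULL-filled columns in option B.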
Database Design

Data and Referential Integrity


There are at least three types of data integrity that must be
designed into any database: key integrity, domain integrity, and
referential integrity.
Key Integrity:
 Every table should have a primary key (which may be
concatenated).
• The primary key must be controlled such that no two records in the
table have the same primary key value.
• The primary key for a record must never be allowed to have a
NULL value.
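Key integrity for a concatenated primary key can be demonstrated as follows, with sqlite3 as a stand-in DBMS and invented sample rows. NOT NULL is declared explicitly on each key column because SQLite, unlike most DBMSs, does not imply it for every primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member_ordered_product (
        member_number    INTEGER NOT NULL,
        product_number   INTEGER NOT NULL,
        quantity_ordered INTEGER NOT NULL,
        PRIMARY KEY (member_number, product_number)   -- concatenated key
    )
""")

def accepts(connection, row):
    """Return True if the row is stored, False if a key rule rejects it."""
    try:
        connection.execute("INSERT INTO member_ordered_product VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    accepts(conn, (1001, 42, 1)),   # first row with this key: accepted
    accepts(conn, (1001, 43, 2)),   # differs in one part of the key: accepted
    accepts(conn, (1001, 42, 5)),   # duplicate key value: rejected
    accepts(conn, (None, 42, 1)),   # NULL in part of the key: rejected
]
```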
Database Design

Data and Referential Integrity


Domain Integrity:
 Appropriate controls must be designed to ensure that no field
takes on a value that is outside of the range of legal values.
Referential Integrity:
 A referential integrity error exists when a foreign key value in
one table has no matching primary key value in the related
table.
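A referential integrity check might look like this in practice. The sketch uses sqlite3 with cut-down MEMBER and MEMBER ORDER tables and invented key values. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMSs typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE member (member_number INTEGER PRIMARY KEY);
    CREATE TABLE member_order (
        order_number  INTEGER PRIMARY KEY,
        member_number INTEGER NOT NULL REFERENCES member (member_number)
    );
    INSERT INTO member VALUES (1001);
""")

def order_accepted(connection, order_number, member_number):
    """Return True if the order is stored, False if its foreign key has no match."""
    try:
        connection.execute(
            "INSERT INTO member_order VALUES (?, ?)", (order_number, member_number))
        return True
    except sqlite3.IntegrityError:
        return False

ok = order_accepted(conn, 5001, 1001)       # member 1001 exists
orphan = order_accepted(conn, 5002, 9999)   # no member 9999: rejected
```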
Database Design

Data and Referential Integrity

Referential Integrity:
 Referential integrity is specified in the form of deletion rules as
follows:
• No restriction.
– Any record in the table may be deleted without regard to any
records in any other tables.
• Delete:Cascade.
– A deletion of a record in the table must be automatically
followed by the deletion of matching records in a related table.
• Delete:Restrict.
– A deletion of a record in the table must be disallowed until any
matching records are deleted from a related table.
Database Design

Data and Referential Integrity

Referential Integrity:
 Referential integrity is specified in the form of deletion rules as
follows: (continued)
• Delete:Set Null.
– A deletion of a record in the table must be automatically
followed by setting any matching keys in a related table to the
value NULL.
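The Cascade, Restrict, and Set Null deletion rules map onto the SQL `ON DELETE` clause. The sketch below, using sqlite3 and invented data, deletes a member who has one order and reports the result under each rule. The "no restriction" case corresponds to declaring no foreign key constraint at all and is not shown.

```python
import sqlite3

def delete_member_with_rule(rule):
    """Delete a member that has one order; report what happens under `rule`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # required for SQLite to enforce rules
    conn.executescript(f"""
        CREATE TABLE member (member_number INTEGER PRIMARY KEY);
        CREATE TABLE member_order (
            order_number  INTEGER PRIMARY KEY,
            member_number INTEGER REFERENCES member (member_number) ON DELETE {rule}
        );
        INSERT INTO member VALUES (1001);
        INSERT INTO member_order VALUES (5001, 1001);
    """)
    try:
        conn.execute("DELETE FROM member WHERE member_number = 1001")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False   # the DBMS disallowed the deletion
    orders = conn.execute(
        "SELECT order_number, member_number FROM member_order").fetchall()
    conn.close()
    return deleted, orders

outcomes = {rule: delete_member_with_rule(rule)
            for rule in ("CASCADE", "RESTRICT", "SET NULL")}
for rule, outcome in outcomes.items():
    print(rule, outcome)
```

Under Cascade the order disappears with the member, under Restrict the deletion is refused and the order remains, and under Set Null the order survives with its member number set to NULL.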
Database Design

Roles



Some database shops insist that no two fields have exactly the
same name.
 This presents an obvious problem with foreign keys, which normally
carry the same name as the primary key they reference.
A role name is an alternate name for a foreign key that clearly
distinguishes the purpose that the foreign key serves in the table.
The decision to require role names or not is usually established by
the data or database administrator.
Database Design

Database Prototypes



Prototyping is not an alternative to carefully thought out database
schemas.
On the other hand, once the schema is completed, a prototype
database can usually be generated very quickly.
Most modern DBMSs include powerful, menu-driven database
generators that automatically create DDL (data definition language)
code and generate a prototype database from that code.
 A database can then be loaded with test data that will prove
useful for prototyping and testing outputs, inputs, screens, and
other systems components.
Database Design

Database Capacity Planning


A database is stored on disk.
 The database administrator will want an estimate of disk
capacity for the new database to ensure that sufficient disk
space is available.
Database capacity can be estimated with simple arithmetic as
follows.
1 For each table, sum the field sizes.
• This is the record size for the table.
2 For each table, multiply the record size by the number of
entity instances to be included in the table.
• This is the table size.
Database Design

Database Capacity Planning

Database capacity can be estimated with simple arithmetic as
follows. (continued)
3 Sum the table sizes.
• This is the database size.
4 Optionally, add a slack capacity buffer (e.g., 10%) to account
for unanticipated factors or inaccurate estimates above.
• This is the anticipated database capacity.
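The four capacity-planning steps can be sketched as a short function. The field sizes and record counts below are hypothetical figures chosen only to make the arithmetic easy to follow, and the estimate ignores index and DBMS storage overhead, as the steps themselves do.

```python
def database_capacity(tables, slack=0.10):
    """Estimate capacity in bytes.

    `tables` maps a table name to (list of field sizes in bytes, record count).
    """
    table_sizes = {}
    for name, (field_sizes, record_count) in tables.items():
        record_size = sum(field_sizes)                   # step 1: record size
        table_sizes[name] = record_size * record_count   # step 2: table size
    database_size = sum(table_sizes.values())            # step 3: database size
    capacity = database_size * (1 + slack)               # step 4: add slack buffer
    return table_sizes, database_size, capacity

# Hypothetical field sizes and volumes, for illustration only.
example = {
    "member": ([4, 30, 1, 40], 10_000),        # 75-byte records
    "member_order": ([4, 8, 4, 40], 50_000),   # 56-byte records
}
table_sizes, database_size, capacity = database_capacity(example)
print(table_sizes, database_size, round(capacity))
```

With these figures the two tables come to 750,000 and 2,800,000 bytes, for a database size of 3,550,000 bytes before the 10% buffer is added.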
Database Design

Database Structure Generation

CASE tools are frequently capable of generating SQL code for the
database directly from a CASE-based database schema.
 This code can be exported to the DBMS for compilation.
 Even a small database model can require 50 pages or more of
SQL data definition language code to create the tables, indexes,
keys, fields, and triggers.
 Clearly, a CASE tool’s ability to automatically generate
syntactically correct code is an enormous productivity
advantage.
 Furthermore, it almost always proves easier to modify the
database schema and re-generate the code, than to maintain the
code directly.
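To illustrate the idea of generating SQL from a schema, here is a toy generator. It is not how any particular CASE tool works: the `SCHEMA` dictionary is a hypothetical stand-in for a CASE repository, and the output covers only tables, columns, and keys, not indexes or triggers.

```python
import sqlite3

# Hypothetical schema description. Each column is (name, data type, nullable).
SCHEMA = {
    "member": {
        "columns": [("member_number", "INTEGER", False),
                    ("member_name", "TEXT", False)],
        "primary_key": ["member_number"],
        "foreign_keys": [],
    },
    "member_order": {
        "columns": [("order_number", "INTEGER", False),
                    ("member_number", "INTEGER", False),
                    ("shipping_instructions", "TEXT", True)],
        "primary_key": ["order_number"],
        "foreign_keys": [("member_number", "member", "member_number")],
    },
}

def generate_ddl(schema):
    """Return one CREATE TABLE statement per table in the schema description."""
    statements = []
    for table, spec in schema.items():
        lines = [f"    {name} {sql_type}{'' if nullable else ' NOT NULL'}"
                 for name, sql_type, nullable in spec["columns"]]
        lines.append(f"    PRIMARY KEY ({', '.join(spec['primary_key'])})")
        for column, ref_table, ref_column in spec["foreign_keys"]:
            lines.append(
                f"    FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column})")
        statements.append(f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);")
    return statements

ddl = generate_ddl(SCHEMA)
print("\n\n".join(ddl))

# Export the generated code to a DBMS (SQLite here) to confirm it is accepted.
conn = sqlite3.connect(":memory:")
conn.executescript("\n".join(ddl))
```

Changing the schema means editing `SCHEMA` and regenerating, which mirrors the point above about modifying the schema rather than maintaining the generated code by hand.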
The Next Generation of Database Design

Introduction


Relational database technology is widely deployed and used in
contemporary information system shops.
One new technology is slowly emerging that could ultimately
change the landscape dramatically – object database management
systems.
 The heir apparent to relational DBMSs, object database
management systems store true objects, that is, encapsulated
data and all of the processes that can act on that data.
 Because relational database management systems are so widely
used, we don’t expect this change to happen quickly.
• It is expected that relational DBMS vendors will either build object
technology into their existing relational DBMSs, or create new
object DBMSs and provide for the transition between relational
and object models.
Summary







Introduction
Conventional Files Versus the Database
Database Concepts for the Systems Analyst
Data Analysis for Database Design
File Design
Database Design
The Next Generation of Database Design