Principles of Database Design

Download Report

Transcript Principles of Database Design

Principles of
Database Design,
Conclusions
MBAA 609
R. Nakatsu
Principles of Database Design
Normalization is a process that assigns
attributes (fields) to tables such that data
redundancies are eliminated or reduced.
Four Rules of Thumb:
1. Single Themes: Break up a large table into
separate themes, or subjects.
2. Field Uniqueness: Each field in a table
should represent a unique type of information
(e.g., break up complex fields, and eliminate
repeating groups).
Principles of Normalization,
Continued
3. Primary Keys: Each table must have a
unique identifier, or primary key, that is made
up of one or more fields in the table.
4. Field Independence: You must be able to
make a change to the data in any field (other
than a field in the primary key) without
affecting the data in any other field.
Rule 4: Field Independence
Employee (Employee Number, Last Name,
First Name, Job Class, Hourly Rate)
Employee Number
Last Name
First Name
Job Class
Hourly Rate
11
Smith
John
Mechanic
20
12
Jones
Susan
Technician
18
13
McKay
Bob
Mechanic
20
14
Owens
Paula
Clerk
15
15
Chang
Steve
Mechanic
20
16
Sarandon
Sarah
Mechanic
20
In this example, Hourly Rate is dependent on
Job Class.
Solution: Create Two Tables
Employee (Employee Number, Last Name,
First Name, Job Class ID)
Employee Number
Last Name
First Name
Job Class ID
11
Smith
John
2
12
Jones
Susan
3
13
McKay
Bob
2
14
Owens
Paula
1
15
Chang
Steve
2
16
Sarandon
Sarah
2
Job Class ID is the link to the Job Class table.
Job Class (Job Class ID, Job Class,
Hourly Rate)
Job Class ID
1
2
3
Job Class
Clerk
Mechanic
Technician
Hourly Rate
15
20
18
There are no more field dependencies!
Integrity of a Database
It is important that the data in a database is
correct and consistent.
Data integrity problems can occur due to:
 incorrect data entry
 data redundancy
 poor database design
Ways to Maintain Data Integrity







Entity Integrity (primary key must be unique
and must not be null)
Referential Integrity (see previous slide)
Field types (e.g., text, numeric, date, yes/no)
Input masks (e.g., ISBN number, dates)
Field validation rules (What are some
examples?)
Good design of input screens (user interfaces)
Check digits
Check Digit Example
 Add a check digit to validate a clerk’s data
entry.
 The check digit is determined by some
mathematical algorithm.
Example:
Multiply the first digit by 2; multiply the second
digit by 3; multiply the third by 4; add the
results; divide by 10 and take the remainder.
127 yields (1*2 + 2*3 + 7*4) / 10 = 36 / 10 for a
check digit of 6.
Concurrency Control
Concurrency control is the management of
concurrent transaction execution.
Why is it important?
The simultaneous execution of transactions over
a shared database may create several dataintegrity and consistency problems.
Lost Updates:
An Example
Time
1
2
3
4
5
6
Transaction
Step
Stored Value
T1
Read QOH
50
T2
Read QOH
50
T1
QOH = 50 + 20
T2
QOH = 50 - 30
T1
Write QOH
70
T2
Write QOH
20
Note that the first transaction has not been
permanently recorded when the second
transaction is executed.
Concurrency Control
with Locking
A lock guarantees exclusive use of a data item
to a current transaction.
In the previous example, transaction T2 will not
have access to the data item that is currently
used by transaction T1.
The lock is released when the transaction T1 is
completed.
Data Warehouse
stores data that have been extracted from the
various operational, external, and other
databases of an organization
Data warehouses...


support OLAP (online analytical processing)
support data mining (data in a data warehouse are analyzed
to reveal hidden patterns and trends in historical business
activity)
Data Warehouse
Recap









Why do we need a database?
What is a database?
What are the main functions of a DBMS?
Entity-Relationship Modeling
Linking two tables: 1:1, 1:M, and M:N
Four principles of database design
Data Integrity
Concurrency Control
Data Warehouses