Relational Database Model

Download Report

Transcript Relational Database Model

Slides 4
The Relational Database Model:
The Relational Database Model
• Based on the theory of relational math (set theory)
• It is an “automatic transmission” database (with
embedded relationships between tables) which
replaces the “standard transmission” database
(which employs flat-file techniques with explicit
pointers between files and records)
• Flat-files (collections of similar records) are being
replaced by collections of interrelated files
• Allows data to be broken down into logical,
smaller, more manageable units - simplifies the
organization of complex sets of data
Why A Relational Model?
• Duplicate data reduced - less input, maintenance,
storage, and improved data integrity
• Data independence: Data can be thought of as being
stored in tables regardless of how physically stored.
• Application independence: Databases defined
independently from the systems and programs that will
use them - allows users to create ad hoc queries, rather
than only receive pre-specified reports
• A change in the database does not require rewriting all
the application program codes. Ability to share same data
across multiple applications and systems.
• It has the ability to maintain several tables of related
information that can be accessed by several different
users in many different ways - a single query can retrieve
Some Definitions...
Data: Raw facts about the organization and its business
transactions that are of interest to the end user
Database: A computer structure that houses a collection of
data
Relational database: Stores information about instances of
entities (a specific sales event, salesperson), attributes of
those entities (invoice no., salesperson ID) , and the
relationships among these entities (each sale can only have
one salesperson) - perceived by user to be a collection of
two-dimensional tables
RDBMS: Software that manages a relational database,
controls access, and allows users to retrieve requested data
through a standard data-access language, SQL.
Entity-type: Something of significance about which you
want to store data in a database, e.g., customers,
employees, suppliers, inventory items (note: this is a
data modeling term – an entity becomes a table in a
RDBMS)
Table: An entity-type (e.g., customer) and its attributes
Attribute: A property or characteristic of an entity. A
column in a relational database table, e.g., customer
name, reference #, address, zip ((note: this is a data
modeling term – an attribute becomes a column in a
RDBMS
Row (tuple, record): A record of data in a database table a single occurrence or entity instance
Value: Data in a “cell” – the intersection between row and
Types of Attributes
Key (identifier in data modeling): Attribute, or combination of
attributes, that determines the values of other attributes in each row
Composite Key: Multiple-attribute keys; may be further subdivided,
e.g., phone may be area code and number - can be a primary key
Candidate Key (CK): Attribute (or a minimum combination of
attributes) that uniquely identifies each row in a given table - there
can be more than one CK (employee entity type: SSN; assigned
ID#)
Primary Key (PK) ( a unique identifier in data modeling): A CK
selected to uniquely identify all other attributes in a given row;
cannot be Null
Foreign Key (FK): ( a relationship in data modeling): Attribute
(combination of attributes) whose value(s) must match the Primary
Key in another table in the same database, or whose value(s) must
be Null
Non-key Attribute: Attribute that is not part of a key
Attributes With A Null Value
• Null Value: An unknown attribute value (e.g.,
salesperson not yet allocated to a customer) - it is
not a zero. It is an optional attribute.
• Inclusion of nulls in a table is important - they
provide a consistent way to distinguish between
valid data such as a 0 and missing data, e.g., an
account payable with 0 is good to see; one with an
unknown balance can indicate a significant problem
• In most cases, nulls appear as blanks on a query’s
result table on a screen
Relationships
• Data modeling term that indicates an association
between tables: How the things of significance are
related (A FK must match to an existing PK, or
else be NULL)
• This controlled redundancy allows linking of
tables (hence “relational”)
• Entity-Relationship Diagram (ERD): A data
model (at the conceptual level) that shows the
relationships enforcing business rules between
entities (tables) in a database environment (Fig.
5.4)
Business Rules
• Narrative descriptions of policies,
procedures, or principles in an organization
• Examples:
– A pilot cannot be on duty for more than 10
hours in a 24-hour period
– A professor must teach at least three classes in a
semester
– A class may not have fewer than 10 enrollments
Normalization
• Process of taking a “raw” database and breaking it
into logical units called tables, by following
theoretical rules called normal forms
• The intent is to create a degree of controlled
redundancy that allows two or more tables to be
joined, by matching a FK in one table to a PK in
another table
• Referential integrity (constraint created upon table
creation) is enforced when every non-null FK
value must match an existing PK value (if there is
a FK, there has to be a PK for that FK in another
table)
• Normalization has six nested normal forms
• Generally a well-formed business database will be
normalized through 3rd normal form (3NF)
Benefits of Normalization
•
•
•
•
•
•
Greater overall database organization
Minimize data redundancies
Data consistency within the database
A more flexible database design
Data can be used more productively
A better handle on database security
Disadvantage of Normalization
• Reduced database performance because database
must locate requested tables and join data - requires
additional processing logic
Normal Forms
—Normalization through a series of
stages called NORMAL FORMS
–Each NF depends on normalization
steps taken in the previous NF
• First Normal Form - 1NF
• Second Normal Form - 2NF
• Third Normal Form - 3NF
1NF
First normal form rules:
•All key attributes must be defined
•There must be no repeating groups (values), i.e.,
each row/column intersection can have only one
value
•All attributes must be functionally dependent on
the PK, or part of the PK - e.g., SSN determines
DOB, but DOB cannot determine SSN
Hint: Put all attributes in a two-dimensional flat
table, with no repeating values
General Journal Entry:
Traditional View - Unnormalized
Table 1
Traditional General Journal Entries
Tran_#
Date
Title
Acct_#
Debit
1
1/3/96
Cash
11001
100.00
Receivable
11003
200.00
Sales
41001
Credit
Explain
300.00
credit sale with down payment
2
1/3/96
COGS
Inventory
51001
11004
150.00
150.00
sale of inventory
Assume that the transaction # will reset to 1 at the beginning of the next fiscal year
GJ: First Normal Form
Table 2
General Journal Entries: 1NF
Tran_#
Date
Title
Acct_#
Debit
Credit
Explain
1
1/3/96
Cash
11001
100.00
0.00
credit sale
1
1/3/96
Receivable
11003
200.00
0.00
credit sale
1
1/3/96
Sales
41001
0.00
300.00
credit sale
2
1/3/96
COGS
51001
150.00
0.00
sale of
inventory
2
1/3/96
Inventory
11004
0.00
150.00
sale of
inventory
2NF
– Second Normal Form Rules:
• Table is in 1NF; and
• Table includes no partial dependencies;
that is, no attribute is dependent on only
portion of the primary key – must be
dependent on entire PK
Hint: Examine non-key attributes to determine
whether any are dependent on only portion of a
composite PK - this would violate 2NF
If a table only has one attribute as a PK, then it
is in 2NF.
Chart of Accounts Table
Table C
General Journal Entries: 2NF, 3NF
Acct_#
Title
11001
Cash
11003
Receivable
41001
Sales
51001
COGS
11004
Inventory
Transaction Listing Table
Table A
General Journal Entries: 2NF, 3NF
Tran_#
Date
Explain
1
1/3/96
credit sale with down payment
2
1/3/96
sale of inventory
Transaction Detail Table
(Base Table)
Table B
General Journal Entries: 2NF
Tran_#
Date
Acct_#
Debit
Credit
1
1/3/96
11001
100.00
0.00
1
1/3/96
11003
200.00
0.00
1
1/3/96
41001
0.00
300.00
2
1/3/96
51001
150.00
0.00
2
1/3/96
11004
0.00
150.00
3NF
–Third Normal Form Rules:
• Table is in 2NF and
• There are no transitive
dependencies
Hint: You will violate 3NF if you can deduce the
value of a non-key attribute by knowing the
value of another non-key attribute
Normalized
Transaction Detail (Base) Table
Table D
General Journal Entries: 3NF
Tran_#
Date
Acct_#
Amount
1
1/3/96
11001
$100.00
1
1/3/96
11003
$200.00
1
1/3/96
41001
($300.00)
2
1/3/96
51001
$150.00
2
1/3/96
11004
($150.00)