Relational Database Design - .::Welcome To MES KEVEEYAM

Download Report

Transcript Relational Database Design - .::Welcome To MES KEVEEYAM

Relational Database Design
Bill Woolfolk
Public Health Sciences
University of Virginia
[email protected]
Objectives
Understand definition of modern
relational database
 Understand and be able to apply a
practical method for designing databases
 Recognize and avoid common pitfalls of
database design

What’s a database?

A collection of logically-related information
stored in a consistent fashion
◦
◦
◦
◦

Phone book
Bank records (checking statements, etc)
Library card catalog
Soccer team roster
The storage format typically appears to users as
some kind of tabular list (table, spreadsheet)
What Does a Database Do?
Stores information in a highly organized manner
 Manipulates information in various ways, some
of which are not available in other applications
or are easier to accomplish with a database
 Models some real world process or activity
through electronic means

◦ Often called modeling a business process
◦ Often replicates the process only in appearance or
end result
Databases and the Systems which
manage them
Modern electronic databases are created
and managed through means of RDBMS:
Relational DataBase Management Systems
 An individual data storage structure
created with an RDBMS is typically called
a “database”
 A database and its attendant views,
reports, and procedures is called an
“application”

Database Applications
Database (the actual DB with its
attendant storage structure)
 SQL Engine - interprets between the
database and the interface/application
 Interface or application – the part the
user gets to see and use

Relational Database
Management Systems

Low-end, proprietary, specific purpose
◦ Email: Outlook, Eudora, Mulberry
◦ Bibliographic: Ref. Mgr., EndNote, ProCite

Mid-level
◦ Microsoft Access, Lotus Approach, Borland’s Paradox
◦ More or less total control of design allows custom
builds

High-end
◦ Oracle, Microsoft SQL Server, Sybase, IBM DB2
◦ Professional level DBs: Banks, e-commerce, secure
◦ Amazon.com, Ebay.com,Yahoo.com
Problems with Bad Design
Early computers were slow and had
limited storage capacity
 Redundant or repeating data slowed
operations and took up too much
precious storage space
 Poor design increased chance of data
errors, lost or orphaned information

Benefits of Good Design
Computers today are faster and possess much
larger storage devices
 Rigid structure of modern relational databases
helped codify problems and solutions
 Design problems are still possible, because the
DBMS software won’t protect you from poor
practices
 Good design still increases efficiency of data
processes, reduces waste of storage, and helps
eliminate data entry errors

The Design Process
1)
2)
3)
4)
5)
6)
7)
8)
Identify the purpose of the database
Review existing data
Make a preliminary list of fields
Make a preliminary list of tables and enter
fields
Identify the key fields
Draft the table relationships
Enter sample data and normalize the
data/tables
Review and finalize the design
Database Modeling
Refers to various, more-or-less formal
methods for designing a database
 Some provide precision steps and tools

◦ Ex.: Entity-Relationship (E-R) Modeling
 Widely used, especially by high-end database
designers who can’t afford to miss things
 Fairly complex process
 Extremely precise
1. Identify purpose of the DB
Clients can tell you what information they
want but have no idea what data they need.
“We need to keep track of inventory”
 “We need an order entry system”
 “I need monthly sales reports”
 “We need to provide our product catalog on the
Web”

Be sure to Limit the Scope of the database.
2. Review Existing Data

Electronic
◦ Legacy database(s)
◦ Spreadsheets
◦ Web forms

Manual
◦ Paper forms
◦ Receipts and other printed output
3. Make Preliminary Field List

Make sure fields exist to support needs
◦ Ex. if client wants monthly sales reports, you need a
date field for orders.
◦ Ex. To group employees by division, you need a
division identifier

Make sure values are atomic
◦ Ex. First and Last names stored separately
◦ Ex. Addresses broken down to Street, City, State, etc.

Do not store values that can be calculated from
other values
◦ Ex. “Age” can be calculated from “Date of Birth”
4. Make Preliminary Tables
(and insert the fields into them)
Each table holds info about one subject
 Don’t worry about the quantity of tables
 Look for logical groupings of information
 Use a consistent naming convention

Naming Conventions

Rules of thumb
◦
◦
◦
◦
◦
◦
◦
◦
Table names must be unique in DB; should be plural
Field names must be unique in the table(s)
Clearly identify table subject or field data
Be as brief as possible
Avoid abbreviations and acronyms
Use less than 30 characters,
Use letters, numbers, underscores (_)
Do not use spaces or other special characters
Naming Conventions (cont’d)

Leszynski Naming Convention (LNC)
◦ Example: tblEmployees, qryPartNum
◦ tbl, qry = tag
◦ Employees, PartNum = basename

LNC at Microsoft Developers Network
5. Identify the Key Fields

Primary Key(s)
◦
◦
◦
◦

Can never be Null; must hold unique values
Automatically indexed in most RDBMSs
Values rarely (if ever) change
Try to include as few fields as possible
Multi-field Primary Key
◦ Combination of two or more fields that uniquely
identify an individual record

Candidate Key
◦ Field or fields that qualify as a primary key
◦ Important in Third and Boyce-Codd Normal Forms
6. Identify Table Relationships
 Based
on business rules being
modeled
 Examples:
◦ “each customer can place many orders”
◦ “all employees belong to a department”
◦ “each TA is assigned to one course”
Relationship Terminology

Relationship Type
◦ One-to-one: expressed as 1:1
◦ One-to-Many: expressed as 1:N or 1:M or 1:∞
◦ Many-to-Many: expressed as N:N or M:M

Primary or Parent Table
◦ Table on the left side of 1:N relationship

Related or Child Table
◦ Table on the right side of 1:N relationship

Relational Schema
◦ Diagram of table relationships in database
Relationship Terminology (cont’d)

Join
◦ Definition of how related records are returned

Join Line
◦ Visual relationship indicators in schema

Key fields
◦ Primary Key: the linking field on the one side of a 1:N
relationship
◦ Foreign Key: the primary key from one table that is
added to another table so the records can be related
◦ Non-Key Fields: any field that is not part of a primary
key, multi-field primary key, or foreign key
One-to-One (1:1)
Each record in Table A relates to one, and
only one, record in Table B, and vice versa.
 Either table can be considered the
Primary, or Parent Table
 Can usually be combined into one table,
although may not be most efficient design

One-to-Many (1:N)
Each record in Table A may relate to zero, one
or many records in Table B, but each record in
Table B relates to only one record in Table A.
 The potential relationship is what’s important:
there might be no related records, or only one,
but there could be many.
 The table on the One (or left) side of a 1:N
relationship is considered the Primary Table.

Many-to-Many (N:N)
A record in Table A can relate to many records
in Table B, and a record in Table B can relate to
many records in Table A.
 Most RDBMSs do not support N:N
relationships, requiring the use of a linking (or
intersection or bridge) table that breaks the
N:N relationship down into two 1:N
relationships with the linking table being on the
Many side of both new relationships.

Relational Schema
Table 1
Field1_1
Field1_2
Field1_3
Field1_4
1
Table 2
N Field2_1
Field1_1
Field2_2
Field2_3
14-3 DATABASE ARCHITECTURE
The American National Standards Institute/Standards
Planning and Requirements Committee (ANSI/SPARC)
has established a three-level architecture for a DBMS:
internal, conceptual and external (Figure 14.2).
14.26
Figure 14.2 Database architecture
14.27
Internal level
The internal level determines where data is actually stored on
the storage devices. This level deals with low-level access
methods and how bytes are transferred to and from storage
devices. In other words, the internal level interacts directly
with the hardware.
Conceptual level
The conceptual level defines the logical view of the data.
The data model is defined on this level, and the main
functions of the DBMS, such as queries, are also on this
level. The DBMS changes the internal view of data to the
external view that users need to see. The conceptual level is
an intermediary and frees users from dealing with the
internal level.
14.28
External level
The external level interacts directly with the user (end users
or application programs). It changes the data coming from
the conceptual level to a format and view that is familiar to
the users.
14.29
14-4 DATABASE MODELS
A database model defines the logical design of data. The
model also describes the relationships between different
parts of the data. In the history of database design, three
models have been in use: the hierarchical model, the
network model and the relational model.
14.30