Chapter 3 Effects of IT on Strategy and Competition
Download
Report
Transcript Chapter 3 Effects of IT on Strategy and Competition
Chapter 1
Overview of Database Concepts
Jason C.H. Chen, Ph.D.
Professor of MIS
School of Business, Gonzaga University
Spokane, WA 99258 USA
[email protected]
Dr. Chen, Business Database Systems
Objectives
• Define database terms
• Identify the purpose of a database management
system (DBMS)
• Explain database design using entity-relationship
models and normalization
• Explain the purpose of a Structured Query
Language (SQL)
• Understand how this textbook’s topics are
sequenced and how the two sample databases are
used
Dr. Chen, Business Database Systems
2
What is Information ?
DATA
INFORMATION
Information is refined data.
Dr. Chen, Business Database Systems
Data vs. Information
• Users really want is
– Information
What users can learn from the data
how to satisfy their best customers
how to allocate their resources most efficiently,
how to minimize losses
Dr. Chen, Business Database Systems
4
Objectives of the MIS
Deliver the right information
to the right people,
at the right time,
with the right form.
Ultimately, MIS should
improve the workers’
productivity.
Dr. Chen, Business Database Systems
who has what
information about
whom and when,
where, and how will
all be decided in the
process of building
an information
system.
5
A Relationship Between the Organization and
the DataBase
Organizational
IS Resources:
1. Hardware
2. Software
3. Data
4. Procedures
5. People
DataBase
Management
Systems
Old
State
Activities
Picture
Or
Model
Events
New
State
Transactions
(Add, modify, delete, …)
D.B.
Correspond with the
processing of the events
Information must be:
1. Time: Timely,
2. Content: Accurate, relevant, and verifiable.
3. Form: Presented in a useable form.
User
Dr. Chen, Business Database Systems
TM 0-6
Questions
• Is MS/Access or Oracle (SQL/Server or
IBM DB2) a data base?
• Why don’t we just create a “huge” data file
containing all fields (columns)?
– Have you ever created a database (with many
applications) with only one “huge” data file
containing all fields (columns)?
Dr. Chen, Business Database Systems
7
Traditional File Processing Approach
• Traditional File Processing
– An application uses one specialized file. For example,
purchase order processing application uses data about
supplier and products, while an order-taking application uses
data about customer, products and orders.
• Problems:
– Data dependence: data structures are tightly coupled with
applications. In other words, we prefer data independence.
– Data redundancy: same data are repeatedly saved for
different applications.
– Other problems
Dr. Chen, Business Database Systems
8
Types of Data Processing
• Two types of data processing
– File-based (traditional) data processing
• applications developed by Java
– Data-based data processing
• applications developed by Oracle or MS/Access)
Dr. Chen, Business Database Systems
9
Three file processing systems at a Furniture Company
Duplicate
Data
Dr. Chen, Business Database Systems
What is the main
problem in the
company’s processing
systems?
Disadvantages of File Processing
• Program-Data Dependence
– All programs maintain metadata for each file they use
• Data Redundancy (Duplication of data)
– Different systems/programs have separate copies of the same
data
• Limited Data Sharing
– No centralized control of data
• Lengthy Development Times
– Programmers must design their own file formats
• Excessive Program Maintenance
– 80% of of information systems budget
Dr. Chen, Business Database Systems
11
Database Management System
Application
#1
Application
#2
Application
#3
DBMS
Database
containing
centralized
shared data
DBMS manages data
resources like an operating
system manages hardware
resources
What are the advantages of employing
Data-based Data Processing?
Dr. Chen, Business Database Systems
Traditional File Processing
Applications
Purchasing
System
Data Files
Product
Supplier
Name, item#,
description...
Sales
order processing
system
Dr. Chen, Business Database Systems
Name, item#,
description...
Sales rep
Product
Database Management Systems
D a tab a s e
P u rch a sing
S y s te m
D a ta b a s e
M an a g e m e n t
P ro d u ct
S y s te m s
S a le s re p
S a le s
o r d e r p r o c e s s in g
s y s te m
Dr. Chen, Business Database Systems
S u p p l ie r
Database Terminology
• Database – an organized collection of
logically related data files
• Database management system
(DBMS) – software used to create
and interact with the database
Dr. Chen, Business Database Systems
15
Database Components
• Character
– Basic unit of data
– Can be a letter, number, or special symbol
• Field
– A group of related characters
– Represents an attribute or characteristic of an entity
– Corresponds to a column in the physical database
• Record
– A collection of fields for one specific entity
– Corresponds to a row in the physical database
• File
– A group of records about the same type of entity
Dr. Chen, Business Database Systems
16
Relational Databases
• Data is organized in tables
– Columns (fields) represent different data categories
– Rows (records) contain actual data values
Dr. Chen, Business Database Systems
17
Database Example
Dr. Chen, Business Database Systems
18
Components Example
Dr. Chen, Business Database Systems
Relational Database Terms
• Entity: an object about which you want to store data
• Table, column, row
– Flat file, attribute (or field), record
• Relationships: links that show how different records are
related
• Key Fields: establish relationships among records in
different tables
• Five main types of key fields:
–
–
–
–
–
primary keys
candidate keys
surrogate keys
foreign keys
composite keys
Dr. Chen, Business Database Systems
20
Primary Keys vs. Candidate Keys
• Primary key
– Value must be unique for each record
– Serves to identify the record
– Present in every record
– Can’t be NULL
– Should be numeric
• Candidate key
– Any field that could be used as the primary key
– Should be a unique, unchanging numeric field
Dr. Chen, Business Database Systems
21
Surrogate Keys
• Surrogate key: created to be the record’s primary key
identifier when no suitable primary key exists
• Surrogate key has no real relationship to the record to
which it is assigned, other than to identify the record
uniquely
• Developers configure the database to generate
surrogate key values automatically
• In an Oracle database, you can automatically generate
surrogate key values using a sequence
• Surrogate keys are always numerical fields, because
the database generates surrogate key values
automatically by incrementing the previous value by
one
Dr. Chen, Business Database Systems
22
Foreign Keys
• Foreign key: a field in a table that is a primary key in
another table
• Foreign key creates a relationship between the two tables
• Foreign key value must exist in the table where it is a
primary key
Dr. Chen, Business Database Systems
23
Composite Key
• Composite key: a unique key that you create by
combining two or more fields
• Usually comprised of fields that are primary keys
in other tables
Composite
Key
ORDER_ID
PRODUCT_ID
ORDER_QUANTITY
100
1
2
100
2
2
200
2
2
200
1
1
Dr. Chen, Business Database Systems
24
Database Management System
• Data storage: manage the physical structure of the
database
• Security: control user access and privileges
• Multiuser access: manage concurrent data access
• Backup: enable recovery options for database failures
• Data access language: provide a language that allows
database access
• Data integrity: enable constraints or checks on data
• Data dictionary: maintain information about database
structure
Dr. Chen, Business Database Systems
25
Client/Server Database
Management Systems
• Client/server database
– Takes advantage of distributed processing and networked
computers by distributing processing across multiple
computers
– DBMS server process runs on one workstation, and the
database applications run on separate client workstations
across the network
– Preferred for database applications that retrieve and
manipulate small amounts of data from databases
containing large numbers of records because they
minimize network traffic and improve response times
• Organizations generally use a client/server database if the
database will have more than 10 simultaneous users and
if the database is mission critical
Dr. Chen, Business Database Systems
26
The Oracle11g Client/Server Database
• Oracle11g is the latest release of Oracle Corporation’s
relational database management system
• All Oracle server- and client-side programs use Oracle
Net, a utility that enables the network communication
between the client and the server
Dr. Chen, Business Database Systems
DATABASE TRENDS
Linking Internal Databases to the Web
N
Dr. Chen, Business Database Systems
28
Web-Based Client/Server Database Architecture
2. Request for
data-based Web page
1. Request for
data-based Web page
Network
8. Data-based
Web page
Web Browser
3. Data query
6. Retrieved data
7. Data-based
Web page
4. Data
query
Web server
5. Retrieved
data
Legend
Communications
Between Web browser
And Web server
Database Server
Dr. Chen, Business Database Systems
Communications
Between Web server
And database server
29
System Response Time
Depends on ...
• the speed of the network
• the size of the database
• the way the database is used
…
• a personal database running on a server
might handle 10 users making database
transactions at the same time before
becoming overloaded.
Dr. Chen, Business Database Systems
30
Advantages of Client/Server Database
Management Systems
Handling server and client failures
Processing transactions
Handling high data volumes
Providing security
Servicing multiple simultaneous users
Dr. Chen, Business Database Systems
31
SQL, SQL*Plus, and PL/SQL
Language or Tool
Description
SQL
A command language for communication with the
Oracle 10/11Server from any tool or application. Oracle
SQL contains many extensions.
An Oracle tool that recognizes and submits SQL and
PL/SQL statements to the Server for execution and
contains its own command language.
An Oracle Procedural Language for writing application
logic and manipulating data outside the database.
SQL*Plus
PL/SQL
Terminal
Buffer
SQL and
PL/SQL
Scripts
Dr. Chen, Business Database Systems
SQL*Plus
Server
Design Principles
• To avoid creating tables that contain redundant
data, group related items that describe a single
entity together in a common table
• Do not create tables that duplicate values many
times in different rows
• When creating a database and inserting data
values, you must specify the data type for each
column
• Recall that primary key fields should use a number
data type to avoid typographical, punctuation, and
case variation errors
Dr. Chen, Business Database Systems
33
Database Design
• Systems Development Life Cycle (SDLC)
• Entity-relationship model (E-R model)
• Normalization
Dr. Chen, Business Database Systems
34
Systems Development Life Cycle
Understand the
Business
Problem or
Opportunity
Develop an
Information
System
Solution
Systems Investigation
(Definition) Product:
Feasibility Study
Systems Analysis
Product:
Functional Requirements
Systems Design
Product:
System Specifications
Implement
the Information
System
Solution
Systems Implementation
Product:
Operational System
Systems Maintenance
Product:
Improved System
Dr. Chen, Business Database Systems
Systems Development Life Cycle
(SDLC)
• Systems investigation – understanding the problem
• Systems analysis – understanding the solution
• Systems design – creating the logical and physical
components
• Systems implementation – placing completed system
into operation
– including integration, testing and deployment
• Systems maintenance and review – evaluating the
implemented system
Dr. Chen, Business Database Systems
36
Data Models
• A data model is a collection of concepts for
describing data.
• Three database models
– Hierarchical
– Network
– Relational
CUSTOMER
Dr. Chen, Business Database Systems
ORDER
37
Entity-Relationship Model
(E-R Model)
• Used to depict the relationship that exists among
entities
• The following relationships can be included in
an E-R model:
– One-to-one
– One-to-many
– Many-to-many
Dr. Chen, Business Database Systems
E-R Model Notation Examples
Figure 1-3 E-R Model notation examples
Please note that the name of entity should be singular even it contains multiple
instances
Dr. Chen, Business Database Systems
39
Figure 1-5 JustLee Books’ table structures after normalization
Dr. Chen, Business Database Systems
One-to-One Relationship
• Each occurrence of data in one entity is
represented by only one occurrence of data
in the other entity
• Example: Each order has just one invoice
and each invoice is assigned to just one
order
ORDER
Dr. Chen, Business Database Systems
INVOICE
41
One-to-Many Relationship
• Each occurrence of data in one entity can be
represented by many occurrences of the
data in the other entity
• Example: A class has only one instructor,
but each instructor can teach many classes
Class
Dr. Chen, Business Database Systems
Instructor
42
Many-to-Many Relationship
• Data can have multiple occurrences in both
entities
• Example: A student can take many classes,
and each class is composed of many
students
• Can not be included in the physical database
Class
Dr. Chen, Business Database Systems
Student
43
JustLee Example E-R Model
Dr. Chen, Business Database Systems
44
Installing Personal Oracle11g
• I created a “Discussion Questions” on the Bb for you to
share (or post question) your experience on installation of
Oracle 11g. Please participate and it will be also your
part of class performance.
•
• Please note that if your computer with 64-bit operating
system such as Windows 7, you can’t use the DVD to
install Oracle since it is for 32-bit OS. Rather you need
to download 64-bit Oracle 11g from the Oracle site
(www.oracle.com) or type in keyword search of “Oracle
11g download”.
Dr. Chen, Business Database Systems
45
Q: Can we create/enter the data base on the following table
immediately?
BOOKS table
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
Database Normalization
• Purposes
– design a reliable and stable data bases
– increase data integrity (reduce or control data
redundancy)
• Processes
– determines required tables and columns for
each table
– multistep process
Dr. Chen, Business Database Systems
47
Problem Solving for Modeling a Database Project
Study and Analyze
w/Team
Business Problem
???
IMPLEMENTATION
Dr. Chen, Business Database Systems
Problem Solving for Modeling a Database Project
Study and Analyze
w/Team
Business Problem
User interview &
Integrated Model
ER or other Model
Normalization
Normalization
(3NF)
IMPLEMENTATION
Dr. Chen, Business Database Systems
Well-Structured Relations
• A relation that contains minimal data redundancy
and allows users to insert, delete, and update rows
without causing data inconsistencies
• Goal is to avoid (minimize) anomalies
– Insertion Anomaly – adding new rows forces user to
create duplicate data
– Deletion Anomaly – deleting rows may cause a loss of
data that would be needed for other future rows
– Modification Anomaly – changing data in a row forces
changes to other rows because of duplication
General rule of thumb: a table should not pertain to more
than one entity type
Dr. Chen, Business Database Systems
50
Functional Dependencies and Keys
• Functional Dependency: The value of one attribute (the
determinant) determines the value of another attribute.
• Candidate Key
– A unique identifier. One of the candidate keys will become
the primary key
• E.g. perhaps there is both credit card number and SS# in
a table…in this case both are candidate keys
– Each non-key field is functionally dependent on every
candidate key
Figure 5-9
Dr. Chen, Business Database Systems
51
Database Normalization (continued)
• Data redundancy – refers to having the same data in
different places within a database
• Data anomalies – refers to data inconsistencies
– Insertion anomaly
– Deletion anomaly
– Modification anomaly
Dr. Chen, Business Database Systems
Unnormalized Data
• Contains repeating groups in the Author column
in the BOOKS table
Dr. Chen, Business Database Systems
Figure: 5-22 Steps in normalization
Table with Repeating
Group attributes
First normal
form (1NF)
Second normal
form(2NF)
Third normal
form (3NF)
Boyce-Codd normal
form (BC-NF)
Fourth normal
Form (4NF)
Fifth normal
form (5NF)
Dr. Chen, Business Database Systems
Remove Repeating
Group Attributes
Remove …
Remove …
Remove remaining
anomalies resulting from
multiple candidate keys
Remove Multivalued
Dependencies
Remove Remaining
Anomalies
First-Normal Form (1NF)
• Primary key (pk) is identified
• Repeating groups are eliminated
• Every attribute value is atomic (singledvalue)
Dr. Chen, Business Database Systems
55
First-Normal Form (1NF) (continued)
• ISBN and Author columns together create a
composite primary key
Dr. Chen, Business Database Systems
1NF: ISBN and Author fields together create a composite primary key
Q: What fields are dependent on ISBN alone ?
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
Composite Primary Key
• More than one field (column) is required to
uniquely identify a record (row).
• Can lead to partial dependency – a field is
only dependent on a portion of the primary
key
Dr. Chen, Business Database Systems
58
Second-Normal Form (2NF)
• 1NF and every non-key attribute is fully
functionally dependent on the primary key.
• Partial dependency must be eliminated
– Break the composite primary key into two
parts, each part representing a separate table
– Every non-key attribute must be defined by the
entire key (either a single PK or a CK), not by
only part of the key
Dr. Chen, Business Database Systems
59
Second-Normal
Form (2NF) (continued)
• BOOKS table in 2NF
ISBN
Author
Publication
Date
Cost
Retail
Category
Publisher
Contact
8843172113
T.Peterson
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
8843172113
J.Austin
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
Davidson
8843172113
J.Adams
1915762492
W.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
1915762492
L.White
6522489652
M.Kane
6522489652
60
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
1NF: ISBN and Author fields together create a composite primary key
Q: What fields are dependent on ISBN alone ?
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
Fig: A Process from 1NF to 2NF
Dependency on entire primary key (ISBN & Author)
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Dependency on partial primary key (ISBN)
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
Fig: A Process from1NF to 2NF
Dependency on entire primary key (ISBN & Author)
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Dependency on partial primary key (ISBN)
ISBN & Author Title, Publication Date, Cost …
ISBN Title, Publication Date, Cost …
Therefore, NOT in 2nd Normal Form!!
Dr. Chen, Business Database Systems
Fig: A Process from 1NF to 2NF
Dependency on entire primary key (ISBN & Author)
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Dependency on partial primary key (ISBN)
ISBN
Title
Publication
Cost Retail Category Publisher Contact
Date
ISBN Author
Dr. Chen, Business Database Systems
Fig: A Process from 1NF to 2NF
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
ISBN
Author
Publication
Date
Cost
Retail
Category
Publisher
Contact
8843172113
T.Peterson
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
8843172113
J.Austin
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
Davidson
8843172113
J.Adams
1915762492
W.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
1915762492
L.White
6522489652
M.Kane
6522489652
S. Little
ISBN
Title
8843172113
Dr. Chen, Business Database Systems
Figure 1-5 JustLee Books’ table structures after normalization
Dr. Chen, Business Database Systems
Figure: 5-22 Steps in normalization
Table with Repeating
Group attributes
Remove Repeating
Group Attributes
First normal
form (1NF)
Second normal
form(2NF)
Third normal
form (3NF)
Boyce-Codd normal
form (BC-NF)
Fourth normal
Form (4NF)
Fifth normal
form (5NF)
Dr. Chen, Business Database Systems
Remove Partial
Dependencies
Remove Transitive
Dependencies
Remove remaining
anomalies resulting from
multiple candidate keys
Remove Multivalued
Dependencies
Remove Remaining
Anomalies
Third Normal Form (3NF)
• Transitive dependency must be eliminated
– 2NF and transitive dependencies (functional
dependency between non-key attributes) must
be eliminated
Dr. Chen, Business Database Systems
68
Fig: A Process from 1NF to 2NF
Dependency on entire primary key (ISBN & Author)
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Dependency on partial primary key (ISBN)
Is it in the 3NF?
ISBN
Title
Publication
Cost Retail Category Publisher Contact
Date
ISBN Author
Dr. Chen, Business Database Systems
Fig: A Process from 2NF to 3NF
All non-key fields are dependent on pk (ISBN)
ISBN Title
Publication
Cost Retail Category Publisher Contact
Date
ISBN Title
ISBN Publication Date
…
ISBN Category
ISBN Publisher
ISBN Contact
and
Category Contact
All this is OK
(2nd NF)
Dr. Chen, Business Database Systems
a non-key field is dep. on another
non-key field
BUT
ISBN Category Contact
implies
ISBN Contact
Transitive dependency
(not in 3rd NF)
Fig: A Process from 2NF to 3NF
ISBN Title
ISBN
Title
Publication
Cost Retail Category Publisher Contact
Date
Publication
Cost Retail Category Publisher
Date
Category
Dr. Chen, Business Database Systems
Contact
Third-Normal Form (3NF)
• Publisher contact name has been removed
Publication
Date
Cost
Retail
Category
Publisher
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
ISBN
Title
8843172113
TABLE 1-5 The BOOKS Table in 3NF
Dr. Chen, Business Database Systems
Category
Contact
Computer
Davidson
Accounting
Brown
We might further create a look up table
for “Category” (see Table 1-6, p.12)
Fig: A Process from 2NF to 3NF
Publication
Date
Cost
Retail
Category
Publisher
Contact
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
Davidson
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishin
g
Brown
Publication
Date
Cost
Retail
Category
Publisher
ISBN
Title
8843172113
ISBN
Title
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Dr. Chen, Business Database Systems
Category
Contact
Computer
Davidson
Accounting
Brown
Processes from 1NF 2NF 3NF
Dependency on entire primary key (ISBN & Author)
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Dependency on partial primary key (ISBN)
(remove partial dependency)
ISBN Title
Publication
Cost Retail Category Publisher Contact
Date
ISBN Author
(remove transitive dependency)
ISBN
Title
Publication
Cost Retail Category Publisher
Date
Q: Finally, how many tables on the 3NF?
Dr. Chen, Business Database Systems
Category
Contact
Q: How many tables on the 3NF?
ISBN
Title
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Answer: Three tables. What are they?
BOOK_AUTHOR
BOOKS
ISBN
Title
Publication
Cost Retail Category Publisher
Date
Note that You can’t do any
implementation until you transform
a table into 3NF (with more tables
produced)
Dr. Chen, Business Database Systems
ISBN Author
CATEGORY_CONTACT
Category
Contact
Processes from 1NF 2NF 3NF
Publication
Date
Cost
Retail
Category
Publisher
Contact
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
T.Peterson
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Austin
8843172113
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
Davidson
J.Adams
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
W.White
1915762492
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
Amercian
Publishing
Davidson
L.White
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
M.Kane
6522489652
Meaningful
Accounting
25-MAY-08
41.60
85.00
Accounting
Articulate
Publishing
Brown
S. Little
ISBN
Title
8843172113
Publication
Date
Cost
Retail
Category
Publisher
ISBN
Author
Database
Implementation
04-JUN-03
31.40
55.95
Computer
American
Publishing
8843172113
T.Peterson
8843172113
J.Austin
Handcranked
Computers
21-JUN-05
21.80
25.00
Computer
American
Publishing
8843172113
J.Adams
Meaningful
Accounting
25-MAY-08
1915762492
W.White
1915762492
L.White
6522489652
M.Kane
6522489652
S. Little
ISBN
Title
8843172113
1915762492
6522489652
41.60
85.00
Accounting
Articulate
Publishing
fk(foreign key)
Category Code
Category
Category
Contact
10
Computer
Computer
Davidson
50
Accounting
Accounting
Brown
Dr. Chen, Business Database Systems
We will learn all detailed processes in the main text (ch.4)
Figure: 5-22 Steps in normalization
Table with Repeating
Group attributes
Remove Repeating
Group Attributes
First normal
form (1NF)
Second normal
form(2NF)
Third normal
form (3NF)
Boyce-Codd normal
form (BC-NF)
Fourth normal
Form (4NF)
Fifth normal
form (5NF)
Dr. Chen, Business Database Systems
Remove Partial
Dependencies
Remove Transitive
Dependencies
Remove remaining
anomalies resulting from
multiple candidate keys
Remove Multivalued
Dependencies
Remove Remaining
Anomalies
Summary of Normalization Steps
• 1NF: eliminate repeating groups, identify
the primary key
• 2NF: table is in 1NF, and partial
dependencies are eliminated
• 3NF: table is in 2NF, and transitive
dependencies are eliminated
Dr. Chen, Business Database Systems
78
Figure: 5-22 Steps in normalization
Table with Repeating
Group attributes
Remove Repeating
Group Attributes
First normal
form (1NF)
Second normal
form(2NF)
Third normal
form (3NF)
Boyce-Codd normal
form (BC-NF)
Fourth normal
Form (4NF)
Fifth normal
form (5NF)
Dr. Chen, Business Database Systems
Remove Partial
Dependencies
Remove Transitive
Dependencies
Remove remaining
anomalies resulting from
multiple candidate keys
Remove Multivalued
Dependencies
Remove Remaining
Anomalies
Relating Tables within the Database
• Once tables are normalized, make certain
tables are linked
• Tables are linked through a common field
• A common field is usually a primary key in
one table and a foreign key in the other
table
Dr. Chen, Business Database Systems
80
Category_Contact
Category
Contact
Dr. Chen, Business Database Systems
Lookup Table
• Common reference for descriptive data
tables referenced in a foreign key
fk(foreign key)
Dr. Chen, Business Database Systems
Category Code
Category
10
Computer
50
Accounting
82
Structured Query Language (SQL)
• Data sublanguage
• Used to:
–
–
–
–
Create or modify tables
Add data to tables
Edit data in tables
Retrieve data from tables
• ANSI and ISO standards
Dr. Chen, Business Database Systems
83
Databases Used in this Textbook –
JustLee Books Database
• Assumptions
– No back orders or partial shipments
– Only U.S. addresses
– Completed orders are transferred to the annual
SALES table at the end of each month to enable
faster processing on the ORDERS table
Dr. Chen, Business Database Systems
84
Topic Sequence
• The first half of the text will focus on
creating a database
• The second half of the text will focus on
querying or retrieving data from a database
Dr. Chen, Business Database Systems
85
Summary
• A DBMS is used to create and maintain a database
• A database is composed of a group of interrelated tables
• A file is a group of related records; a file is also called a table in the
physical database
• A record is a group of related fields regarding one specific entity; a
record is also called a row
• A common field is used to join data contained in different tables
• A primary key is used to uniquely identify each record
• A foreign key is a common field that exists between two tables but
is also a primary key in one of the tables
• A lookup table is a common term for a table referenced in a foreign
key
• A Structured Query Language (SQL) is a data sublanguage that
navigates the data stored within a database’s tables
Dr. Chen, Business Database Systems
86
Summary (continued)
• A record is considered unnormalized if it contains repeating
groups
• A record is in first-normal form (1NF) if no repeating
groups exist and it has a primary key
• Second-normal form (2NF) is achieved if the record is in
1NF and has no partial dependencies
• After a record is in 2NF and all transitive dependencies have
been removed, then it is in third-normal form (3NF), which
is generally sufficient for most databases
Dr. Chen, Business Database Systems
87
Homework: Transform it into 3NF
SALES relation with simple data
SALES
Cust_ID
Name
Salesperson
Region
8023
Anderson
101
South
9167
Bancroft
102
West
7924
Hobbs
101
South
6837
Tucker
103
East
8596
Eckersley
102
West
7018
Arnold
104
North
Dr. Chen, Business Database Systems