Transcript Lecture 6

Relational Data
Model
Sahar Mosleh
California State University San Marcos
Page
1
Data Model
• Structures and access techniques provided by a particular
database management system (DBMS) are called its data model
• DBMS became popular from 1970 to 1980
• Each model had its advantages and its disadvantages
• In many ways the relational data model represented an attempt
to simplify the earlier data models
Sahar Mosleh
California State University San Marcos
Page
2
File Management Systems
• Before the existence of DBMS, the data were stored in
separate files.
• There was no link from one file to another
• If the structure of the data changed (ex: adding more fields),
programs that were using the file had to change
• Problems became more severe when the number of the
programs using the files increased over time
Sahar Mosleh
California State University San Marcos
Page
3
• This is an example of payroll application using a file
management System. (FD is the file descriptor in the picture)
Employee Update Program
FD
EmpId EmpName Address City ……………….
Employee
Master
File
Employee Report Program
FD
EmpId EmpName Address City ……………….
Check-writing Program
FD
EmpId EmpName Address City ……………….
Timecard
File
FD
EmpId HoursWorked Wage
Sahar Mosleh
……………….
California State University San Marcos
Page
4
Hierarchical Database
• One of the most important application for the earliest database
management system was production of manufacturing companies
such as car companies
• Car companies wanted to produce 10000 units of one model, 5000
of another model and so on.
• They needed a database to collect all of this information and
required tools to retrieve the information as fast as possible.
• Information had to be organized in a form of parent and child
structure:
• a car is composed of a motor, body , etc.
• Similarly each subpart were composed of other subparts and so
on. Ex: an engine has spark plugs, wires, etc..
Sahar Mosleh
California State University San Marcos
Page
5
Car2
Car1
Engine
Body
Transmission
Engine
Body
right
Door
Hood
Roof
Left
Door
Window Lock
Sahar Mosleh
Engine
Body
right
Door
Hood
Handle
Window Lock
Transmission
…...
Roof
Left
Door
……
……
Handle
Transmission
…...
…...
Left
Door
Car3
right
Door
Hood
……
Handle
Window Lock
California State University San Marcos
Roof
…..
…..
…..
…..
…..
…..
…..
…..
….
….
Page
6
• Hierarchical model uses upside down trees.
• A tree represents parent/child relationships
• For example, a car consists of body, engine, transmission, etc.
• The root of the tree and all of its associated children would make
one record (specifying a particular item: i.e a car)
• Pointers were used to link a parent to its children or a child to
another child
• In general, to access the data, a program would
• Find a particular car
• Move down to the first child
• Move up to its parent
• Move sideways to the next child
• Retrieving the data in a hierarchical database required navigating
through the records, moving up, down, and sideways one record at
the time
• The most popular hierarchical database was Information
Management System (IMS) introduced in 1968
Sahar Mosleh
California State University San Marcos
Page
7
Advantages of Hierarchical database:
• Simple Structure:
• Organization of IMS database was easy to understand
• Parent/Child Organization:
• IMS was a good model for representing parent/child
relationship (A is part of B) or (A is owned by B)
• Performance:
• IMS stored parent/child relationships as physical
pointers from one data record to another
• This would speed up the movement from one record to
another
• IMS is still widely used. It is still considered as a good DBMS
in places with lots of transactions processing (ex: banks)
Sahar Mosleh
California State University San Marcos
Page
8
Network Databases
• Hierarchical database could not answer the demand of some
business oriented environment.
• For example, in an order processing company, a single order
might participate in more than one parent/child relationship.
• For instance, a particular order should be linked to
• The customer who placed it
• The sales person who took it
• The product ordered
• This could not be done by IMS
• To deal with these situations, network data model was
developed: children could have more than one parent
Sahar Mosleh
California State University San Marcos
Page
9
.
.
.
Customers
Car
Car
Acme
Mfg
.
.
.
.
.
.
Products
Salespeople
Car
Car
Bill
Adams
Car
Size Car
4
Widget
.
.
.
Car
Car
#112963
Orders
• Example of parent/child relationship in network database models
Sahar Mosleh
California State University San Marcos
Page 10
• In 1971, the conference on the systems languages published an
official standard for network databases which became known
as CODASYL model
• A programmers would access the network database as follows:
• Find a specific parent record by key (ex: customer number)
• Move down to the first child in a particular set (the first
order placed by this customer
• Move sideways from one child to the next in the set (the
next order placed by this customer)
• Move up from a child to its parent in another set ( the
salesperson who took the order)
Sahar Mosleh
California State University San Marcos
Page 11
Advantages of Network database
• Flexibility:
• Multiple parent/child relationships allowed a network database
to represent data that did not have a simple hierarchical
structure
• Standardization:
• The CODASYL standard boosted the popularity of the network
model, and minicomputer vendors such as Digital Equipment
Corporation and Data General implemented network Databases
• Performance:
• Despite their greater complexity, network databases boasted
performance approaching that of hierarchical databases.
Sahar Mosleh
California State University San Marcos
Page 12
Disadvantages of Hierarchical and Network
models
• They have rigid structure:
• The structure of the records had to be known in advance.
• Changing the database structure required rebuilding the entire
database
• Querying the database was not always easy. Retrieving simple
information form the database could cause programmer to write
lots of code
• Some of this code was quite complicated
Sahar Mosleh
California State University San Marcos
Page 13
Relational Data Model
• Disadvantages of hierarchical and network models led to
creation of Relational Data Model by Dr. Codd in 1970
• In this course, we teach Relational data model and use
Structured Query Language (SQL) used to manipulate the data
in the database.
Definition:
• A relational database is a database where all data visible to
users is organized strictly as tables of data values and
where all database operations work on these tables
• In this model information is stored in a database as simple
row/column tables of data
• Next slide shows an example of tables in a relational database
Sahar Mosleh
California State University San Marcos
Page 14
SalesReps
Salesrep Name
Age RepOffice Title
HireDate Manager Quota Sales
105
Bill Adams
37
13
Sales Rep 12-FEB-88 104 350000 367911
109
Mary Jones
31
11 Sales Rep 12-OCT-89 106 300000 392725
102
Sue Smith
48
21 Sales Rep 10-DEC-86 108 350000 474050
Orders
106
Sam Clark
52
11 VP
Sales 14-JUN-88
275000 299912
104
Bob Smith
33 12 Sales
Mgr 19-MAY-87
142594
OrderNum
OrderDate 106
Cust 200000
Rep MFR
Product QTY Amount
101
Dan Roberts
45 12 Sales
Rep 20-OCT-86
112961
17-DEC-89 104
2117 300000
106 REI305673
2A44L 7
31500
Products
11-JAN-90 101
2111
105 ACI75985
41003 35
3745
110
Tom Synder
41
Sales113012
Rep 13-JAN-90
112989
03-JAN-90
2101 350000
106 FEA
114
6
1458
ProductId
QtyOnHand
108 MfrIdLarry
Fitch Description
62 21
Sales
Mgr Price
12-OCT-89
106
361865
113051 7910-FEB-90
2118
108 QSA K47
1420
2A45C
RATCHET
210 104
103 REI Paul
Cruz
29 12 LINK
Sales
Rep 01-MAR-87
275000
286775 4
112968 12-OCT-89
2102
101 ACI 41004 34
3978
ACI
4100Y
25
1
Nacy
Angelli WIDGET
49 22REMOVER
Sales113036
Rep 2750
14-NOV-88
108
300000
186042
30-JAN-90
2107
110 ACI 4100Z 9
22500
Customers
REDUCER
355
38
…. QSA XK47
113045 02-FEB-90
2112
108 REI 2A44R 10
45000
41672
PLATE
18017-DEC-89
0
…. BIC
112963
105 ACI
41004 28
3276
CustNum
Company 2103CustRep
Credit_limit
IMM 779C
900-LB BRACE 113013
187514-JAN-90
9
2118
108 BIC 41003 1
652
2111 JCP
Inc.
103
50000
2108
109 FEA 112
10
1480
ACI
41003
SIZE 3 WIDGET 113058
10723-FEB-90
207
2102
First
Corp.
101
65000
2124
107 BIC 41003 1
652
ACI
41004
SIZE 4 WIDGET 11299711708-JAN-90
139
2103
Acme
Mfg.
105
50000
11298365227-DEC-89
2103
105 ACI 41004 6
702
BIC
41003
HANDLE
3
2123 Carter
and Sons2114
102
4000020
Offices
113024
20-JAN-90
108 QSA XK47
7100
IMM 887P
BRACE PIN
250
24
113062
24-FEB-90
2124
107 FEA 114
2430
2107 Ace
International
110
3500010
Region 112979
Mgr134
Target
Sales
QSA Office
XK48 City REDUCER
203
12-OCT-89
102 ACI 4100Z
6
15000
2115 Smithson
Corp. 2114
101
20000
Western ……
108 4500
300000 12
186042
REI 22
2A44L Denver
LEFT HINGE
2101 Jones Mfg.
106
65000
11
New York Eastern 106 575000 692637
….
2112 Zetacorp
108
50000
12
Chicago
Eastern 104 800000 735042
2121 QMA Assoc.
103
45000
13
Atlanta
Eastern 105 350000 367911
2114 Orion Corp.
102
20000
21
Los Angeles Western 108 725000 835915
……
Sahar Mosleh
California State University San Marcos
Page 15
Table
Definition:
• A table is a rectangular object with rows and columns
• For example in the office table:
• Each row of the office table represents a single physical entity
• Each column of the offices table represents one item of data
that is stored in the database for each office:
• Ex: City column represents the location of the office
• An alternative term for column is attribute
• Each row of the table contains exactly one data value in each
column
Sahar Mosleh
California State University San Marcos
Page 16
• In each column of a table, all of the data values in that column
have the same type. For example:
• City column values are words
• Sales values are money type
• Mgr values are integer
• Each column in a table has a column name which is written as a
heading at the top of the column
• Column names must be unique in a table
• The columns of a table have a left-right order. That is defined
when the table is first created.
• The order of the column has no effect when any action is done
against the table
Sahar Mosleh
California State University San Marcos
Page 17
• Each table must have at least one column
• Almost all commercial DBMS products impose maximum of
255 columns per table
• A table can have zero or more rows
• A table with zero rows is called an empty table
• Order of the rows is not important in a table.
• Most relational DBMSs either do not impose any limit on the
number of rows or their limit is a very large number
• A common limit is approximately 2 billion rows
Sahar Mosleh
California State University San Marcos
Page 18
Primary Key
Definition:
• A primary key is a column or combination of two or more
columns that uniquely identifies each row of a table.
• Since the order of rows in a table is irrelevant, rows cannot be
identified based on their positions in a table
• Ex: row 1, row 2, row 20
• In a well-designed relational database each table has a primary
key.
• If the primary key contains two or more columns, it is called a
composite primary key
Sahar Mosleh
California State University San Marcos
Page 19
Example of primary key
• Consider the Offices table
Office
22
11
12
13
21
City
Denver
New York
Chicago
Atlanta
Los Angeles
Region
Western
Eastern
Eastern
Eastern
Western
Mgr
108
106
104
105
108
Target
300000
575000
800000
350000
725000
Sales
186042
692637
735042
367911
835915
• “Office” column (attribute) can be a good choice for the primary
key because each office has a different office id
• However, city is not a good choice because more than one office
may be located in the same city.
Sahar Mosleh
California State University San Marcos
Page 20
• Consider the Products table
MfrId ProductId
REI 2A45C
ACI 4100Y
QSA XK47
BIC 41672
IMM 779C
ACI 41003
ACI 41004
BIC 41003
IMM 887P
QSA XK48
REI 2A44L
….
Description
RATCHET LINK
WIDGET REMOVER
REDUCER
PLATE
900-LB BRACE
SIZE 3 WIDGET
SIZE 4 WIDGET
HANDLE
BRACE PIN
REDUCER
LEFT HINGE
Price
79
27.50
355
180
1875
107
117
652
250
134
4500
QtyOnHand
210
25
38
0
9
207
139
3
24
203
12
• What is a good primary key for this table?
Sahar Mosleh
California State University San Marcos
Page 21
• In his case, MrfId by itself, is not a good choice to be a primary
key because more than one manufacturer may produce more than
one product
• Further, ProductId by itself is not a good choice either because the
same product can be produced by more than one manufacturer.
• However, combination of both is unique in every row.
• This is an example of composite primary key.
• A table with a primary key is called a relation. A relation is a table
in which no duplicate rows can exist.
Sahar Mosleh
California State University San Marcos
Page 22
Relationship
• In contrast with the older data models (hierarchical and
network models) relational data models do not include explicit
parent/child pointer connection
• Then how does the parent/child model is represented in the
relational data model?
• Consider the following two tables
Offices
SalesReps
Salesrep
105
109
102
106
….
….
Name
Bill Adams
Mary Jones
Sue Smith
Sam Clark
Sahar Mosleh
Office City
Regin
Mgr Target Sales
22
Denver
Western 108 300000 186042
11
New York
Eastern 106 575000 692637
12
Chicago
Eastern 104 800000 735042
13
Atlanta
Eastern 105 350000 367911
Age RepOffice Title
Manager
Quota
21 HireDate
Los Angeles
Western
108 Sales
725000 835915
37
13
Sales Rep 12-FEB-88 104 350000 367911
31
11 Sales Rep 12-OCT-89 106 300000 392725
48
21 Sales Rep 10-DEC-86 108 350000 474050
52 11 VP Sales 14-JUN-88
275000 299912
California State University San Marcos
Page 23
• The parent is the offices table
• The child is the salesreps table because the salesreps works in
an office
• Relationships are created by having the same data in two or
more tables
Offices
SalesReps
Salesrep
105
109
102
106
….
….
Sahar Mosleh
Name
Bill Adams
Mary Jones
Sue Smith
Sam Clark
Office City
Regin Mgr Target Sales
22
Denver
Western 108 300000 186042
11
New York
Eastern 106 575000 692637
12
Chicago
Eastern 104 800000 735042
Age RepOffice Title
HireDate
Manager
Quota
13
Atlanta
Eastern
105 Sales
350000 367911
37
13
Sales
Rep
12-FEB-88
104
350000
367911
21
Los Angeles Western 108 725000 835915
31
11 Sales Rep 12-OCT-89 106 300000 392725
48
21 Sales Rep 10-DEC-86 108 350000 474050
52 11 VP Sales 14-JUN-88
275000 299912
California State University San Marcos
Page 24
• Note that the RepOffice column salesreps table contains the office
number of the sales office where each sales person works
• The values of the RepOffice column is the set of office numbers
found in the office column of the offices table
• We will see how this restriction is imposed when we discuss
about creating tables later in the course
• For example, it is possible to find the sales office where “Mary
Jones” is working by finding the value of Mary Jones RepOffice
(11) and finding the corresponding row offices table
• So, the parent/child relationship between two tables A and B is not
represented by explicit pointers but by common data values stored
in the two tables
• Programmers must specify this relationship when they create the
tables
Sahar Mosleh
California State University San Marcos
Page 25
Foreign Key
Definition:
• Foreign key is a column (or combination or two or more
columns) whose value matches the primary key of another table
• Together, primary key and the foreign key make the parent/child
relationship in relational data models
Primary key
Offices
Office City
Regin
Mgr Target Sales
22
Denver
Western 108 300000 186042
11
New York
Eastern 106 575000 692637
12
Chicago
Eastern 104 800000 735042
Age RepOffice Title13 HireDate
Manager
Quota
Atlanta
Eastern
105 Sales
350000 367911
37
13
Sales Rep
104Western
350000108367911
21 12-FEB-88
Los Angeles
725000 835915
31
11 Sales Rep 12-OCT-89 106 300000 392725
48
21 Sales Rep 10-DEC-86 108 350000 474050
52 11 VP Sales 14-JUN-88
275000 299912
Foreign key
SalesReps
Salesrep
105
109
102
106
….
….
Sahar Mosleh
Name
Bill Adams
Mary Jones
Sue Smith
Sam Clark
California State University San Marcos
Page 26
Example of Foreign Key
Customers
SalesReps
Products
CustNum
2111 …..
2102 ….
2103 ….
……….
…….
Salesrep
105
109
102
106
….
MfrId ProductId
REI
2A45C …..
ACI
4100Y …..
QSA XK47 ……
….
…..
Orders
OrderNum
11296
113012
112989
113051
….
Sahar Mosleh
OrderDate
17-DEC-89
11-JAN-90
03-JAN-90
10-FEB-90
Name
Bill Adams
Mary Jones
Sue Smith
Sam Clark
Cust Rep MFR Product QTY Amount
2117 106 REI 2A44L 7
31500
2111 105 ACI 41003 35
3745
2101 106 FEA 114
6
1458
2118 108 QSA K47
4
1420
California State University San Marcos
Page 27