DBdesignx - University of Oregon

Download Report

Transcript DBdesignx - University of Oregon

Relational Database Design
DSC340
Mike Pangburn
Difficulties of Managing Data
 Data are scattered and collected by many individuals
using various methods and devices.
 Data come from many sources including internal sources,
personal sources and external sources.
 Vast amounts of data can be collected and processed
 e.g., Amazon.com collecting clickstream data.
What is a relational database?
 You have all heard of the major players in the
RDBMS (relational database mgmt system) market
 E.g., Oracle, MySQL, Microsoft SQL-Server
 What makes a database relational?
 Relational db’s are organized around _______
 Earlier database systems (“hierarchical”, “network”) were
harder to use, and never developed the popularity of
relational systems
Database Design terminology
 Entity is a kind of thing you want to store information
about (e.g., students, courses, buildings, etc.).
 Attribute is a characteristic or quality of a particular
entity
 Note: same idea as a class attribute in programming –
a piece of information out an object
 Primary key is a field that uniquely identifies that
record.
 Foreign keys are fields that have reflect identifying
information from another entity (allowing you to
“look up” the associated entity).
What are relationships between entities?
 A relationship between 2 database entities
represents the possibility that two entities may be
associated with each other
 E.g., a customer may rent a video
 There are 3 basic types of relationships:
 1-to-1
 1-to-Many
 Many-to-Many
Relationship types
 One-to-one
 when a single instance (row) in the first table can be related to at
most one row in the second table, and vice versa
 One-to-many
 when a single instance (row) in the first table can be related to
many rows in the second table, AND
 when a single instance (row) in the second table can be related
to at most one row in the first table
 Many-to-many
 when a single instance (row) in the first table can be related to
multiple rows in the second table, and vice versa
Relationship types
 Examples below for one-to-many and many-
to-many
Store
(storeId)
Owns
Video
(videoId)
101
3
Customer
Video
(accountId) PreviouslyRented (videoId)
115
101
123
123
102
145
103
5
145
12
90987
104
90987
13
99787
99787
The database design process
 A good order in which to ask yourself
some questions…
 What real-world entities (“things”) am I
collecting information about? Each entity will
become a table (think of a table as a list of things)
 What pieces of information (attributes)
about those entities do I wish to store?
 Also, what subset of that information will
be used to identify the things (e.g., SSN
would be a common choice for a table
containing people things)
 How are the entities related?
We need a key (an “identifier”) for the
things in each table (list)
Table
Plausible Identifier (a “key”)
Customer of brick-and-mortar
video store
account ID
Online customer of video store
e-mail address
Working person in the United
States
Social Security number
DVD video
12-digit UPC bar code number
In the last case (books), you might need to track specific copies.
A simple design example
 You and several of your bird-loving neighbors want
to keep track of the birds that come to feed in the
various yards in the neighborhood.
 What things are we collecting information about?
 Each implies a separate database table
 What entities are in the bird-study problem?
 Birds
 Birdfeeders
 Yards
List (table) of Yards
2
1
3
Yards
Yard Number
Owner
4
Address
Phone Number
List (table) of Birdfeeders
Bird Feeders
Bird Feeder Number
Material
(Wood, Plastic, Metal)
Location in Yard
(Sun, Partial/Full Shade)
Table of Birds
Bird Data
Data Number
Date
Time
Bird Type
# of this type
Relationship: One Yard, Many Feeders
There must be one field in both tables
that is the same, so that the database
knows how the tables connect. In this
case, we need to know which yard each
bird feeder is located.
Bird Feeders
Bird Feeder Number
Material
Location in Yard
Yard Number
∞
Each Yard can have many
Birdfeeders, but each Birdfeeder
can only be in one Yard. This is
called a one to many (1 - ∞)
Relationship.
1
Yards
Yard Number
Owner
Address
Phone Number
Relationship: One Birdfeeder, Many Birds
Bird Data
Again, there must be one field in both
tables that is the same, so that the
database knows how the tables connect. In
this case one birdfeeder can be visited by
many birds.
Data Number
Date
Time
Bird Type
Bird Feeder Number
∞
Bird Feeders
Most relationships in the
world (of data) turn out to
be One-to-Many.
1
Bird Feeder Number
Material
Location in Yard
Yard Number
More details on Implementing
relationships
 One-to-one
 Add a foreign key column to one of the tables
 E.g., consider a university where each faculty member has
one office, and each office holds one faculty member
 Two design option: (1) include an “Room Num” foreign key
column in the faculty table, or (2) include a “Fac SSN” foreign
key column in the rooms table
These two tables show the first design option:
Faculty
Rooms
Fac SSN
Fac Name
Fac Phone
Room Num
046-25-5834
John Roon
506-3998
382
156-99-5933
Bill Hill
506-9322
2913
190-03-1944
Kiefer Zorn
506-4927
122
076-55-6822
Pinzer Mohan
506-9399
1001
Foreign Key column
Room Num
Building
Type
122
West Hall
Office
382
Condon Hall
Office
1001
High Tower
Office
2913
Gurt Building
Office
2940
Gurt Building
Class
More details on Implementing
relationships
 One-to-many
 Add a foreign key column to the table on the “many” side
 E.g., painter and paintings exhibit a 1-to-many relationship
 To capture the relationship, we add a “painter id” column to the
painting table
Painting
Painter
ID
Name
Deceased
Sex
2
Tore Rasmussen
No
M
3
Raechel Niel
Yes
F
4
Jon Smyth
Yes
M
5
Philis Stonips
No
F
Code
Title
Est.
Value
Type
PID
100
A Sense of Space
$2000
Water
2
101
The Road to Nowhere
$3000
Water
2
102
Reflections
$80,000
Oil
3
103
A Journey’s End
$55,000
Oil
4
104
Fisherman’s Catch
$40,000
Oil
4
105
Reflections
$45,000
Oil
4
106
Treescape
$5000
Oil
5
107
Veiled Visions
$4000
Oil
5
Foreign
Key column
More details on Implementing
relationships
 Many-to-many
 Add a bridge table “between” the two table which keeps
track of all row combinations
 E.g., at a university, the relationship between Students and
Classes is many-to-many
 One student can enroll in multiple classes
 One class will have multiple students
Classes
Students
Student SSN
Student Name
Student Phone
Class Num
Term
Room Num
Time
046-25-5834
Cosmo Rinker
541-844-9001
754
F05
2940
MW 8am
089-43-5569
Doogie Rupp
541-911-3579
755
F05
2940
MW 10am
076-55-6822
Wendy Kooper
541-074-4031
802
S06
2940
TTh 8am
075-83-9931
Curtis Ponzol
513-567-6772
813
S06
2940
TTh 10am
Adding a column to either table (or both) will not be adequate to
represent many students taking one course, and one course
having many students, so we need to add a “bridge table”
Why don’t we do this?
Classes
Class Num
Term
Room Num
Time
Students
754
F05
2940
MW 8am
046-25-5834, 089-435569
755
F05
2940
MW 10am
802
S06
2940
TTh 8am
813
S06
2940
TTh 10am
Students
Student SSN
Student Name
Student Phone
Classes
046-25-5834
Cosmo Rinker
541-844-9001
754, 802
089-43-5569
Doogie Rupp
541-911-3579
076-55-6822
Wendy Kooper
541-074-4031
075-83-9931
Curtis Ponzol
513-567-6772
Implementing relationship types
 Adding a “bridge table” to handle the many-tomany relationship
 it’s just a 2 column table of data!-nothing special!
Students
The Enrollment “bridge table” below
relates Students and Classes!
Student SSN
Student Name
Student Phone
046-25-5834
Cosmo Rinker
541-844-9001
089-43-5569
Doogie Rupp
541-911-3579
Enrollment
076-55-6822
Wendy Kooper
541-074-4031
Student SSN
Class Num
075-83-9931
Curtis Ponzol
513-567-6772
046-25-5834
754
046-25-5834
755
075-83-9931
755
Classes
Class Num
Term
Room Num
Time
075-83-9931
802
754
F05
294
MW 8am
075-83-9931
813
755
F05
211
MW 10am
089-43-5569
802
802
S06
111
TTh 8am
089-43-5569
813
813
S06
211
TTh 10am
076-55-6822
754
076-55-6822
802
Basic Database Design rules-of-thumb
 Information about different real-world entities should be
stored in different tables
 Splitting different kinds of information across different tables is
referred to as “normalized” design in the database realm
 Design should store information using the smallest logical
parts
 E.g., would it be better to have Street, City, State, ZIP fields
rather than one Address attribute?
 Each part defines a column
 Design should not store derived attributes
 Calculate derived values as needed
 E.g., would it be better to store birth-date or age ?
 Design should not cause blank (null) values
Implementing a database design
 Once you have designed the set of tables you want to
comprise your relational database, you need to create the
same within a RDMS (Relational Database Management
System) or equivalent
 For example, we could implement the Students, Classes, and
Enrollment tables
 We will use an online database system at reports.zoho.com
 After your design is implemented, you can create SQL queries
to extract your information from your database design
 We will discuss SQL next class
 Databases and SQL will be the focus of the next hw