Lecture Notes #1

Download Report

Transcript Lecture Notes #1

CPSC-608 Database Systems
Fall 2015
Instructor: Jianer Chen
Office: HRBB 315C
Phone: 845-4259
Email: [email protected]
Notes 1
users
database system
large
volume
of
data
where do we
store them?
users
database system
large
volume
of
data
in secondary
storage
users
database system
secondary
storage
(disks)
how are the
data organized?
users
database system
secondary
storage
(disks)
in tables
(relations)
users
database system
secondary
storage
(disks)
how do we
define
relations?
users
in tables
(relations)
database system
secondary
storage
(disks)
in tables
(relations)
database
administrator
users
DDL
language
database system
secondary
storage
(disks)
how do we
manipulate relations?
database
administrator
users
in tables
(relations)
DDL
language
database system
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
database system
DML (query)
language
database
programmer
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
database management system
DML (query)
language
database
programmer
simply translate
database programs
into machine programs
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
database management system
DML (query)
language
database
programmer
simply translate
database programs
into machine programs
secondary
storage
(disks)
then what is the difference between DBMS
and a programming language compiler?
1. it has to deal with data stored in
hierarchical memory structures
in tables
(relations)
DDL
language
database
administrator
file
manager
buffer
manager
DML (query)
language
database
programmer
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
2. it has to support efficient manipulations
of data in hierarchical memory structures
in tables
(relations)
DDL
language
database
administrator
file
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
3. it needs to translate the input database
program into an internal representation
DDL
language
database
administrator
DDL
complier
in tables
(relations)
file
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
4. it needs to produce efficient internal codes
dealing with data in hierarchical memory structure
DDL
language
database
administrator
DDL
complier
in tables
(relations)
file
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
query
execution
engine
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
in tables
(relations)
5. it needs to be consistent
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
query
execution
engine
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
in tables
(relations)
6. it needs to be reliable
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
query
execution
engine
main
memory
buffers
DBMS
then what is the difference between DBMS
and a programming language compiler?
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
DBMS
query
execution
engine
main
memory
buffers
secondary
storage
(disks)
in tables
(relations)
lock table
DDL
language
DDL
complier
database
administrator
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
DBMS
undergraduate database
query
execution
engine
main
memory
buffers
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
query
execution
engine
main
memory
buffers
DBMS
graduate database
secondary
storage
(disks)
in tables
(relations)
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
query
execution
engine
main
memory
buffers
DBMS
graduate database
secondary
storage
(disks)
A Quick Review on
Undergraduate Database
in tables
(relations)
DDL
language
database
administrator
lock table
DDL
complier
concurrency
control
logging &
recovery
file
manager
transaction
manager
index/file
manager
buffer
manager
DML (query)
language
database
programmer
DML
complier
DBMS
query
execution
engine
main
memory
buffers
secondary
storage
(disks)
We have agreed
Information (i.e., database) is organized
in tables (i.e., relations) stored in disks.
We have agreed
Information (i.e., database) is organized
in tables (i.e., relations).
● How is information represented by
relations?
● What are “good” table structures?
● What operations can we apply on
tables?
How is information represented
by relations?
How is information represented
by relations?
Information consists of
● objects (i.e., entities) plus
● connections (i.e., relationships) among
entities
How is information represented
by relations?
Information consists of
● objects (i.e., entities) plus
● connections (i.e., relationships) among
entities
Thus, information can be given by
Entity/relationship (R/E) diagrams
How is information represented
by relations?
Information consists of
● objects (i.e., entities) plus
● connections (i.e., relationships) among
entities
Thus, information can be given by
Entity/relationship (R/E) diagrams
Read: Sections 4.1-4.4
The Entity/Relationship Model
How to convert E/R diagrams
into relations (i.e., tables)?
How to convert E/R diagrams
into relations (i.e., tables)?
Fairly straightforward:
How are E/R diagrams converted
into relations (i.e., tables)?
Fairly straightforward:
● an entity set is given by a table where
each column corresponds to a property
(i.e., attribute) of the entities;
● a relationship among entities is given
by a table whose columns correspond
to the identifications of the related
entities (that now become attributes).
How are E/R diagrams converted
into relations (i.e., tables)?
Fairly straightforward:
● an entity set is given by a table where
each column corresponds to a property
(i.e., attribute) of the entities;
● a relationship among entities is given
by a table whose columns correspond
to the identifications of the related
entities (that now become attributes).
Read: sections 4.5.
From E/R Diagrams to Relational Designs
What are “good” table strcutures?
● have no inconsistency;
● avoid redundancy;
● easy to use
What are “good” table strcutures?
● have no inconsistency;
● avoid redundancy;
● easy to use
Typical questions:
● Should we split a table when it is too fat?
● Should we merge tables when they are too
thin?
What are “good” table strcutures?
● have no inconsistency;
● avoid redundancy;
● easy to use
Typical questions:
● Should we split a table when it is too fat?
● Should we merge tables when they are too
thin?
Read: Chapter 3.
Design Theory for Relational Databases
Some terminology
Relation, attribute, tuples
Attributes
(column
headers)
Tuples
(rows)
name
manf
Winterbrew
Pete’s
Bud Lite
Anheuser-Busch
Beers
a relation
Some terminology
Keys and superkeys
• Superkey: a set of attributes that
uniquely determines a tuple;
• Key: a superkey that does not contain
any smaller superkey.
Some terminology
• Relation schema: relation name and
attribute list.
• Database schema: set of all relation
schemas in the database.
• Database: collection of relations.
Relational operations
Typically, selecting tuples that meet a
given condition.
Relational operations
Core relational operations:
• Union, intersection, and difference.
– Usual set operations;
– Extended to bags
• Selection: picking certain rows.
• Projection: picking certain columns.
• Products and joins: compositions of
relations.
• Renaming of relations and attributes.
Relational operations
extended relational operations:
δ = eliminate duplicates from bags.
●τ = sort tuples.
●γ = grouping and aggregation.
●
Relational operations
extended relational operations:
δ = eliminate duplicates from bags.
●τ = sort tuples.
●γ = grouping and aggregation.
●
Read: Chapter 5
Algebraic and Logical Query Languages