marked - Kansas State University
Download
Report
Transcript marked - Kansas State University
Lecture 01 of 42
Database Architecture:
Client-Server Models, Relational DB Definitions
Wednesday, 27 August 2008
William H. Hsu
Department of Computing and Information Sciences, KSU
KSOL course page: http://snipurl.com/va60
Course web site: http://www.kddresearch.org/Courses/Fall-2008/CIS560
Instructor home page: http://www.cis.ksu.edu/~bhsu
Reading for Next Class:
Sections 2.1 – 2.3, Silberschatz et al., 5th edition
Syllabus and Introductory Handouts
CIS 560: Database System Concepts
Wedneday, 27 Aug
2008
Computing & Information Sciences
Kansas State University
Lecture Outline
Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al. 5e
Today and Friday: Basic Relational DB Principles
Relations
Database definitions
Records
Fields
Tables
Relations
Next Week: Relational Algebra and SQL Intro
Relational operators: PROJECT, SELECT, JOIN
Variations on relational joins
Implementation
Examples and Exercises
Look at Chapter 2 examples
Exercises: relational algebra formulas
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Course Administration
Official Course Page (KSOL): http://snipurl.com/va60
Class Web Page: http://www.kddresearch.org/Courses/Fall-2008/CIS560
Instructional E-Mail Addresses
[email protected] (always use this to reach instructor)
[email protected] (this goes to everyone)
Instructor: William Hsu, Nichols 213
Office phone: +1 785 532 7905; home phone: +1 785 539 7180
IM: AIM/YIM/MSN hsuwh & rizanabsith, ICQ 28651394 & 191317559
Office hours: after class Mon/Wed/Fri; Tue; other times by appointment
Graduate Teaching Assistant: TBD
Office location: Nichols 124
Office hours: to be announced on class web board
Grading Policy
Hour exams: 15% each (in-class, closed-book); final (open-book): 30%
Machine problems, problem sets (8 of 10): 16%; term project: 17%
Class participation: 7% (3% attendance, 2% questions, 2% answers)
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review: Relational Databases,
Architectures, Platforms
This Course: Relational DBs, Queries, Indexing, Concurrency
Tools: (My)SQL & ORACLE, PHP & JSP
Database Management Systems (DBMS)
Storage and retrieval of information
Accessed using queries
Data Manipulation Languages (DMLs)
Data Description Languages (DDLs)
Client-Server Architecture
Relational Databases
Based on relations (tuples of entities)
Entities: objects organized into sets
Relationships
Links between entities
Correspond to mathematical relations
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Overall System Structure
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Relations Defined
•
•
•
•
A relation on sets S1, S2, …, Sn is a subset of S1 S2 … Sn
It consists of those tuples (s1, s2, …, sn) that are related
Finite example: set of pairs (a, b) in {1, 2, 3) such that a > b
Infinite example: >, set of pairs (a, b) of nonnegative integers
aka natural numbers such that a > b
CIS 560: Database System Concepts
1
1
2
2
3
3
A
B
Computing & Information Sciences
Kansas State University
Review:
Functions Defined
A function is a relation that takes an element from a set
and maps it to a unique element in another set
f maps R to Z
Domain
R
f
Z
Co-domain
(range)
f(4.3)
4
4.3
Pre-image of 4
Image of 4.3
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Function Example
A pre-image
of 1
The image
of A
Domain
Co-domain
Alice
A
“a”
1
Bob
B
“bb“
2
Chris
C
“cccc”
3
Dave
D
“dd”
4
Emma
F
“e”
5
A class grade function
A string length function
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Functions and Non-Function Relations
Range
a
1
“a”
1
e
2
“bb“
2
i
3
“cccc”
3
o
4
“dd”
4
u
5
“e”
5
Some function…
Not a valid function!
Also not a valid function!
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Review:
One-to-One (Injective) Functions
•
•
•
A function is one-to-one if each element in the co-domain
(range) has a unique pre-image
aka into function
Every element y that is mapped into has an inverse f-1(y)…
but this is not necessarily every element in the co-domain
a
1
a
1
e
2
e
2
i
3
i
3
o
4
o
4
5
5
A one-to-one function
A function that is
not one-to-one
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
More on one-to-one
Injective is synonymous with one-to-one
“A function is injective”
A function is an injection if it is one-to-one
Note that there can
be un-used elements
in the co-domain
a
1
e
2
i
3
o
4
5
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
11
A one-to-one function
Computing & Information Sciences
Kansas State University
Review:
Onto (Surjective) Functions
•
•
A function is onto if each element in the co-domain (range) is an
image of some pre-image
Every element y with a unique pre-image has an inverse f-1(y)…
but this is not necessarily every element in the co-domain
a
1
a
1
e
2
e
2
i
3
i
3
o
4
o
4
u
5
An onto function
A function
that is not
onto
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
More on onto
Surjective is synonymous with onto
“A function is surjective”
A function is an surjection if it is onto
Note that there can
be multiply used
elements in the
co-domain
a
1
e
2
i
3
o
4
u
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
13
An onto function
Computing & Information Sciences
Kansas State University
Onto vs. one-to-one
Are the following functions onto, one-to-one, both, or neither?
a
1
a
1
b
2
b
2
c
3
c
3
4
d
4
1-to-1, not onto
Both 1-to-1 and onto
1
a
1
b
2
b
2
c
3
c
3
d
4
Onto, not 1-to-1
1
b
2
c
3
4
Not a valid function
a
d
a
Neither 1-to-1 nor onto
14
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
Computing & Information Sciences
Kansas State University
Review:
Bijections
Consider a function that is both one-to-one and onto
Such a function is a one-to-one correspondence, or a bijection
aka permutation aka invertible function
a
1
b
2
c
3
d
4
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Inverse functions
Let f(x) = 2*x
R
f
R
f-1
f(4.3)
8.6
4.3
f-1(8.6)
Then f-1(x) = x/2
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
16
Computing & Information Sciences
Kansas State University
More on inverse functions
Can we define the inverse of the following
functions?
a
1
a
1
b
2
b
2
c
3
c
3
4
d
What is f-1(2)?
Not onto!
What is f-1(2)?
Not 1-to-1!
An inverse function can ONLY be defined on a
bijection
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
17
Computing & Information Sciences
Kansas State University
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Query Processing (Cont.)
Alternative ways of evaluating a given query
Equivalent expressions
Different algorithms for each operation
Cost difference between a good and a bad way of evaluating a
query can be enormous
Need to estimate the cost of operations
Depends critically on statistical information about relations which the
database must maintain
Need to estimate statistics for intermediate results to compute cost of
complex expressions
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Transaction Management
A transaction is a collection of operations that performs a
single logical function in a database application
Transaction-management component ensures that the
database remains in a consistent (correct) state despite system
failures (e.g., power failures and operating system crashes)
and transaction failures.
Concurrency-control manager controls the interaction among
the concurrent transactions, to ensure the consistency of the
database.
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Database Architecture
The architecture of a database systems is greatly influenced by
the underlying computer system on which the database is running:
Centralized
Client-server
Parallel (multi-processor)
Distributed
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
History of Database Systems
1950s and early 1960s:
Data processing using magnetic tapes for storage
Tapes provide only sequential access
Punched cards for input
Late 1960s and 1970s:
Hard disks allow direct access to data
Network and hierarchical data models in widespread use
Ted Codd defines the relational data model
Would win the ACM Turing Award for this work
IBM Research begins System R prototype
UC Berkeley begins Ingres prototype
High-performance (for the era) transaction processing
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
History (cont.)
1980s:
Research relational prototypes evolve into commercial systems
SQL becomes industrial standard
Parallel and distributed database systems
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
2000s:
XML and XQuery standards
Automated database administration
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Chapter 2: Relational Model
Structure of Relational Databases
Fundamental Relational-Algebra-Operations
Additional Relational-Algebra-Operations
Extended Relational-Algebra-Operations
Null Values
Modification of the Database
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Example of a Relation
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Basic Structure
Formally, given sets D1, D2, …. Dn a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai
Di
Example: If
customer_name = {Jones, Smith, Curry, Lindsay}
customer_street = {Main, North, Park}
customer_city
= {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield) }
is a relation over
customer_name x customer_street x customer_city
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Attribute Types
Each attribute of a relation has a name
The set of allowed values for each attribute is called the domain
of the attribute
Attribute values are (normally) required to be atomic; that is,
indivisible
Note: multivalued attribute values are not atomic
Note: composite attribute values are not atomic
The special value null is a member of every domain
The null value causes complications in the definition of many
operations
We shall ignore the effect of null values in our main presentation
and consider their effect later
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Relation Instance
The current values (relation instance) of a relation are
specified by a table
An element t of r is a tuple, represented by a row in a table
attributes
(or columns)
customer_name customer_street
Jones
Smith
Curry
Lindsay
Main
North
North
Park
customer_city
Harrison
Rye
Rye
Pittsfield
tuples
(or rows)
customer
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Relations are Unordered
n Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
n Example: account relation with unordered tuples
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Database
A database consists of multiple relations
Information about an enterprise is broken up into parts, with
each relation storing one part of the information
account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers
Storing all information as a single relation such as
bank(account_number, balance, customer_name, ..)
results in
repetition of information (e.g., two customers own an account)
the need for null values (e.g., represent a customer without an
account)
Normalization theory (Chapter 7) deals with how to design
relational schemas
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Summary
RDB Overview
Mathematical Relations
Based on sets (of entities)
Relation R: subset of Cartesian product of sets
Each member of R: one tuple
R specifies “which combinations are related”
Relational Algebra
Based on mathematical relations
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Terminology
Database Management Systems (DBMS)
Data Manipulation Languages (DMLs)
Data Description Languages (DDLs)
Client-Server Architecture
Relational Databases
Entity
Relationship
Relations
Subsets of Cartesian product of two or more sets
Functions
Functions
One-to-one (into function, injection)
Onto (surjection)
One-to-one & onto (bijection, permutation, invertible function)
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University