marked - Kansas State University

Download Report

Transcript marked - Kansas State University

Lecture 01 of 42
Database Architecture:
Client-Server Models, Relational DB Definitions
Wednesday, 27 August 2008
William H. Hsu
Department of Computing and Information Sciences, KSU
KSOL course page: http://snipurl.com/va60
Course web site: http://www.kddresearch.org/Courses/Fall-2008/CIS560
Instructor home page: http://www.cis.ksu.edu/~bhsu
Reading for Next Class:
Sections 2.1 – 2.3, Silberschatz et al., 5th edition
Syllabus and Introductory Handouts
CIS 560: Database System Concepts
Wedneday, 27 Aug
2008
Computing & Information Sciences
Kansas State University
Lecture Outline
 Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al. 5e
 Today and Friday: Basic Relational DB Principles
 Relations
 Database definitions
 Records
 Fields
 Tables
 Relations
 Next Week: Relational Algebra and SQL Intro
 Relational operators: PROJECT, SELECT, JOIN
 Variations on relational joins
 Implementation
 Examples and Exercises
 Look at Chapter 2 examples
 Exercises: relational algebra formulas
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Course Administration
 Official Course Page (KSOL): http://snipurl.com/va60
 Class Web Page: http://www.kddresearch.org/Courses/Fall-2008/CIS560
 Instructional E-Mail Addresses
 [email protected] (always use this to reach instructor)
 [email protected] (this goes to everyone)
 Instructor: William Hsu, Nichols 213
 Office phone: +1 785 532 7905; home phone: +1 785 539 7180
 IM: AIM/YIM/MSN hsuwh & rizanabsith, ICQ 28651394 & 191317559
 Office hours: after class Mon/Wed/Fri; Tue; other times by appointment
 Graduate Teaching Assistant: TBD
 Office location: Nichols 124
 Office hours: to be announced on class web board
 Grading Policy
 Hour exams: 15% each (in-class, closed-book); final (open-book): 30%
 Machine problems, problem sets (8 of 10): 16%; term project: 17%
 Class participation: 7% (3% attendance, 2% questions, 2% answers)
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review: Relational Databases,
Architectures, Platforms
 This Course: Relational DBs, Queries, Indexing, Concurrency
 Tools: (My)SQL & ORACLE, PHP & JSP
 Database Management Systems (DBMS)
 Storage and retrieval of information
 Accessed using queries




Data Manipulation Languages (DMLs)
Data Description Languages (DDLs)
Client-Server Architecture
Relational Databases
 Based on relations (tuples of entities)
 Entities: objects organized into sets
 Relationships
 Links between entities
 Correspond to mathematical relations
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Overall System Structure
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Relations Defined
•
•
•
•
A relation on sets S1, S2, …, Sn is a subset of S1  S2  …  Sn
It consists of those tuples (s1, s2, …, sn) that are related
Finite example: set of pairs (a, b) in {1, 2, 3) such that a > b
Infinite example: >, set of pairs (a, b) of nonnegative integers
aka natural numbers such that a > b
CIS 560: Database System Concepts
1
1
2
2
3
3
A
B
Computing & Information Sciences
Kansas State University
Review:
Functions Defined
A function is a relation that takes an element from a set
and maps it to a unique element in another set
f maps R to Z
Domain
R
f
Z
Co-domain
(range)
f(4.3)
4
4.3
Pre-image of 4
Image of 4.3
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Function Example
A pre-image
of 1
The image
of A
Domain
Co-domain
Alice
A
“a”
1
Bob
B
“bb“
2
Chris
C
“cccc”
3
Dave
D
“dd”
4
Emma
F
“e”
5
A class grade function
A string length function
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Review:
Functions and Non-Function Relations
Range
a
1
“a”
1
e
2
“bb“
2
i
3
“cccc”
3
o
4
“dd”
4
u
5
“e”
5
Some function…
Not a valid function!
Also not a valid function!
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Review:
One-to-One (Injective) Functions
•
•
•
A function is one-to-one if each element in the co-domain
(range) has a unique pre-image
aka into function
Every element y that is mapped into has an inverse f-1(y)…
but this is not necessarily every element in the co-domain
a
1
a
1
e
2
e
2
i
3
i
3
o
4
o
4
5
5
A one-to-one function
A function that is
not one-to-one
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
More on one-to-one
 Injective is synonymous with one-to-one
 “A function is injective”
 A function is an injection if it is one-to-one
 Note that there can
be un-used elements
in the co-domain
a
1
e
2
i
3
o
4
5
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
11
A one-to-one function
Computing & Information Sciences
Kansas State University
Review:
Onto (Surjective) Functions
•
•
A function is onto if each element in the co-domain (range) is an
image of some pre-image
Every element y with a unique pre-image has an inverse f-1(y)…
but this is not necessarily every element in the co-domain
a
1
a
1
e
2
e
2
i
3
i
3
o
4
o
4
u
5
An onto function
A function
that is not
onto
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
More on onto
 Surjective is synonymous with onto
 “A function is surjective”
 A function is an surjection if it is onto
 Note that there can
be multiply used
elements in the
co-domain
a
1
e
2
i
3
o
4
u
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
13
An onto function
Computing & Information Sciences
Kansas State University
Onto vs. one-to-one
 Are the following functions onto, one-to-one, both, or neither?
a
1
a
1
b
2
b
2
c
3
c
3
4
d
4
1-to-1, not onto
Both 1-to-1 and onto
1
a
1
b
2
b
2
c
3
c
3
d
4
Onto, not 1-to-1
1
b
2
c
3
4
Not a valid function
a
d
a
Neither 1-to-1 nor onto
14
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
Computing & Information Sciences
Kansas State University
Review:
Bijections
Consider a function that is both one-to-one and onto
Such a function is a one-to-one correspondence, or a bijection
aka permutation aka invertible function
a
1
b
2
c
3
d
4
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
CIS 560: Database System Concepts
Monday, 25 Aug 2008
Computing & Information Sciences
Kansas State University
Inverse functions
Let f(x) = 2*x
R
f
R
f-1
f(4.3)
8.6
4.3
f-1(8.6)
Then f-1(x) = x/2
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
16
Computing & Information Sciences
Kansas State University
More on inverse functions
 Can we define the inverse of the following
functions?
a
1
a
1
b
2
b
2
c
3
c
3
4
d
What is f-1(2)?
Not onto!
What is f-1(2)?
Not 1-to-1!
 An inverse function can ONLY be defined on a
bijection
Adapted from slides © 2005 A. Bloomfield, University of Virginia
CS202 Discrete Mathematics
http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/
17
Computing & Information Sciences
Kansas State University
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Query Processing (Cont.)
 Alternative ways of evaluating a given query
 Equivalent expressions
 Different algorithms for each operation
 Cost difference between a good and a bad way of evaluating a
query can be enormous
 Need to estimate the cost of operations
 Depends critically on statistical information about relations which the
database must maintain
 Need to estimate statistics for intermediate results to compute cost of
complex expressions
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Transaction Management
 A transaction is a collection of operations that performs a
single logical function in a database application
 Transaction-management component ensures that the
database remains in a consistent (correct) state despite system
failures (e.g., power failures and operating system crashes)
and transaction failures.
 Concurrency-control manager controls the interaction among
the concurrent transactions, to ensure the consistency of the
database.
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Database Architecture
The architecture of a database systems is greatly influenced by
the underlying computer system on which the database is running:
 Centralized
 Client-server
 Parallel (multi-processor)
 Distributed
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
History of Database Systems
 1950s and early 1960s:
 Data processing using magnetic tapes for storage
 Tapes provide only sequential access
 Punched cards for input
 Late 1960s and 1970s:
 Hard disks allow direct access to data
 Network and hierarchical data models in widespread use
 Ted Codd defines the relational data model
 Would win the ACM Turing Award for this work
 IBM Research begins System R prototype
 UC Berkeley begins Ingres prototype
 High-performance (for the era) transaction processing
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
History (cont.)
 1980s:
 Research relational prototypes evolve into commercial systems
 SQL becomes industrial standard
 Parallel and distributed database systems
 Object-oriented database systems
 1990s:
 Large decision support and data-mining applications
 Large multi-terabyte data warehouses
 Emergence of Web commerce
 2000s:
 XML and XQuery standards
 Automated database administration
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Chapter 2: Relational Model






Structure of Relational Databases
Fundamental Relational-Algebra-Operations
Additional Relational-Algebra-Operations
Extended Relational-Algebra-Operations
Null Values
Modification of the Database
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Example of a Relation
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Basic Structure
 Formally, given sets D1, D2, …. Dn a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai 
Di
 Example: If
customer_name = {Jones, Smith, Curry, Lindsay}
customer_street = {Main, North, Park}
customer_city
= {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield) }
is a relation over
customer_name x customer_street x customer_city
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Attribute Types
 Each attribute of a relation has a name
 The set of allowed values for each attribute is called the domain
of the attribute
 Attribute values are (normally) required to be atomic; that is,
indivisible
 Note: multivalued attribute values are not atomic
 Note: composite attribute values are not atomic
 The special value null is a member of every domain
 The null value causes complications in the definition of many
operations
 We shall ignore the effect of null values in our main presentation
and consider their effect later
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Relation Instance
 The current values (relation instance) of a relation are
specified by a table
 An element t of r is a tuple, represented by a row in a table
attributes
(or columns)
customer_name customer_street
Jones
Smith
Curry
Lindsay
Main
North
North
Park
customer_city
Harrison
Rye
Rye
Pittsfield
tuples
(or rows)
customer
CIS 560: Database System Concepts
Wednesday, 24 Jan
2008
Computing & Information Sciences
Kansas State University
Relations are Unordered
n Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
n Example: account relation with unordered tuples
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Database
 A database consists of multiple relations
 Information about an enterprise is broken up into parts, with
each relation storing one part of the information
account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers
 Storing all information as a single relation such as
bank(account_number, balance, customer_name, ..)
results in
 repetition of information (e.g., two customers own an account)
 the need for null values (e.g., represent a customer without an
account)
 Normalization theory (Chapter 7) deals with how to design
relational schemas
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Summary
 RDB Overview
 Mathematical Relations




Based on sets (of entities)
Relation R: subset of Cartesian product of sets
Each member of R: one tuple
R specifies “which combinations are related”
 Relational Algebra
 Based on mathematical relations
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University
Terminology





Database Management Systems (DBMS)
Data Manipulation Languages (DMLs)
Data Description Languages (DDLs)
Client-Server Architecture
Relational Databases
 Entity
 Relationship
 Relations
 Subsets of Cartesian product of two or more sets
 Functions
 Functions
 One-to-one (into function, injection)
 Onto (surjection)
 One-to-one & onto (bijection, permutation, invertible function)
CIS 560: Database System Concepts
Computing & Information Sciences
Kansas State University