Introduction to Database Principles http://cbb.sjtu.edu.cn
Download
Report
Transcript Introduction to Database Principles http://cbb.sjtu.edu.cn
Chapter 2: Relational Algebra
-- Introduction to database principles
MaoyingWu ([email protected])
March 6, 2013
Relation terminology
Jargon (专业术语)
Vulgo (俗称)
Relation name (关系名)
Table name (表名)
Schema (关系模式)
Table header (表头/表描述)
Relation (关系)
2-d table (二维表)
Tuple (元组)
Row (行)
Attribute (属性)
Column (列)
Attribute name (属性名)
Column name (列名)
Attribute value (属性值)
Column value (列值)
Domain (域)
Column type (列类型)
Learning strategies
Know how => know why
适当的不求甚解
After-class assignment 1
Install MySQL on your own computer
Linux: from source package or LAMP
Windows: from rebuilt package or XAMP
Learn some basic knowledge on HTML/PHP and B/S
development
Review of C programming skills
DML (数据操作语言)
Data Manipulation Language
Language for accessing and manipulating the data organized
by appropriate data model
aka, query language (查询语言)
Two classes of languages
procedural (过程性): specifies what data is required and how to get
these data
declarative (声明性): specifies what data is required without
knowing how to get these data
SQL: most widely used query language
DDL (数据定义语言)
Data-defintion Language
Specified notation for defining the database schema
Example:
create table account(
account_no
char(10),
balance integer)
DDL compiler generates a set of tables stored in a data dictionary (数据字典)
Data dictionary contains metadata (元数据)
Database schema
Data storage and definition language
specifies the storage structure and access methods (存储结构和访问方法)
Integrity constraints
domain constraints (类型约束)
referential integrity (参考完整性)
assertions (声明约束)
Authorization (权限)
SQL (结构化查询语言)
Structural Query Language
Principal language to describe and manipulate relational
databases
SQL-99 standard
Two aspects:
Data-Definition language (DDL) for declaring database schemas
Data-Manipulation language (DML) for querying and modifying
SQL: 3 kinds of relations
tables (表):relations stored in the database allowing
modification as well queries
views (视图): relations defined by a computation,
constructed when needed
temporary tables (临时表): constructed by SQL
processor when performing queries.
Relational Algebra
-- Operations on relations, construct new relations from old
ones
Four classes of operations
set operations: union, intersection, and difference
removing: “selection” eliminates some rows (tuples), and
“projection” eliminates some columns
relation combination: Cartesian product, join
renaming: changing the relation schema, i.e., the names of
the attributes and/or the name of relation itself.
Outlines
Procedural language (过程性语言)
Six basic operators (基本操作)
Select: σ (选择)
Project: Π (投影)
Union: ᴜ (并)
Set difference: - (差)
Cartesian product: × (笛卡尔积)
rename: ρ (重命名)
The operators take one or two relations as inputs and
produce a new relation as a result
Select Operator (选择)
Relation r
σ
(A=B)Ʌ(D>5)
(r)
select operator
Notation: σ p (r)
p is called the selection predicate (选择谓词)
Defined as
p (r) t | t r and p(t)
p is a formula in propositional calculus (命题演算) consisting of
terms connected by: ˄(and), ˅(or), ¬(not)
Each term can be one of
<attribute> op (<attribute> or <constant>)
where op can be: =, ≠, ≤, ≥, >, <
Example of selection:
σ
name=“Charlie” (account)
Project operator: example
Relation r:
ΠA,C(r):
Project operator (投影)
Notation: ΠA1,A2,…, Ak(r)
A1, …, Ak are the attribute names and r is a relation
name
The result is defined as the relation of k columns
obtained by erasing the columns that are not selected
Duplicate rows are removed from result, since relation
are sets
Example: To eliminate the sname attribute of relation
exam:
Πsid, score(exam)
union operator: example
relation r, s:
r ᴜ s:
union operator
Notation: r ᴜ s
Defined as:
r s {t | t r or t s}
For union operation to be valid,
r, s must have the same arity (same number of attributes)
the attribute domains must be compatible (each column of
the two operands should have the same data type)
Example: find all students with A or Not pass
set difference operator
Relation r, s
r–s
set difference operator
Notation: r – s
Defined as:
r s {t | t r and t s}
Set differences must be taken between compatible relations
r and s must have the same arity
attribute domains of r and s must be compatible
Cartesian product - example
Relation r, s:
r × s:
Cartesian product
Notation: r ×s
Defined as:
r s {t q | t r and q s}
Assume that attributes of r(R) and s(S) are disjoint. That is, R
ɅS=ø
If attributes of r(R) and s(S) are not disjoint, then
rename must be used.
combination of operations
a single expression can contain multiple operations
Example:
r×s
σA=C(r × s)
Renaming operator
Renaming operator allows us to name, and therefore to refer to
the results of relational-algebra expression (关系代数表达式)
Allows us to refer to a relation by more than one name.
Example:
ρX(E) returns the expression E under the name X
If a relational algebra expression E has arity n, then
ρX(A1,A2,…, An)(E)
returns the result of expression E under the name X, and with the
attributes renamed to A1, A2, …, An
Banking example
branch(branch_name, branch_city, assets)
customer(customer_name, customer_street, customer_city)
account(account_number, branch_name, balance)
loan(loan_number, branch_name, amount)
depositor(customer_name, account_number)
borrower(customer_name, loan_number)
Examples
Find loans over $1200:
σamount>1200(loan)
Find the loan numbers for loans with amount over 1200:
o Πloan_number(σamount>1200(loan))
Find the customer names for all who have a loan, an account,
or both, from the bank:
Πcustomer_name(borrower) ᴜ Πcustomer_name(depositor)
Example queries
Find the names of all customers who have a loan at the
Shanghai branch:
Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_numbe
r=loan.loan_number(borrower×loan)))
Find the names of all customers who have a loan at the
Shanghai branch but do not have an account at any branch of
the bank:
Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_number=loan.loan_number(borr
ower×loan))) - Πcustomer_name(depositor)
Two different strategies
find the names of all customers who have a loan at the
Shanghai branch
(1)
Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_number=l
oan.loan_number(borrower×loan)))
(2)
Πcustomer_name(σborrower.loan_number=loan.loan_number(σbranch_na
me=“Shanghai”(loan))×borrower))
Example queries
find the largest account balance
Strategy:
(1) Find those balances that are not the largest
(a) rename account as d so that we can compare each account
balance with all others
(2) use set difference to find those account balances that were
not found in the previous step
Πbalance(account) – Πaccount.balance(σaccount.balance <
d.balance(account ×ρd(account)))
Formal definition
A basic expression in the relational algebra consists of either one of the
following:
A relation in the database
A constant relation
Let E1 and E2 be relational-algebra expressions; the following are all
relational-algebra expressions:
E1∪E2 (并)
E1 – E2 (差)
E1 x E2 (笛卡尔积)
σp(E1), p is a predicate on attributes in E1 (选择)
Πs(E1), s is a list of some of the attributes in E1 (投影)
ρx(E1), X is the new name for the result of E1 (更名)
Additional operations
Set intersection (交)
Natural join (自然连接)
Division (除)
Assignment (赋值)
We must keep in mind that additional operations do NOT
add anything to relational algebra, but that simplify some
common queries
set intersection - example
Relation r, s
r∩s:
Set intersection
Notation: r∩s
Defined as:
r s {t | t r and r s}
Assume:
r, s have the same arity
attributes of r and s are compatible
Note: r∩s = r – (r-s)
Natural join
Notation:
r s
Let r and s be relations on schemas R and S, respectively. Then
obtained
s
is a relation on schema Rr ∪S
as follows:
Consider each pair of tuples tr from r and ts from s.
If tr and ts have the same value on each of the attributes in R∩S, add a
tuple t to the result, where t has the same value as tr on r and t has
the same value as ts on s
Example:
R= (A, B, C, D)
S= (E, B, D)
Result schema = (A, B, C, D, E)
Natural join - example
Relation r, s:
r :s
Division operator (除法)
Notation: r ÷s
Suitable for queries that include the phrase “for all”
Let r and s be relations on schemas R and S,
where
R = (A1, …, Am, B1, …, Bn)
S = (B1, …, Bn)
The result of r ÷s is a relation on schema R – S =
(A1, A2, …, Am)
r s {t | t R S (r) u s(tu r)}
where tu means the concatenation of tuples t and u to
produce a single tuple
Division example 1
r ÷s:
Division example 2
r ÷s:
Assignment operator
The assignment operator (←) provides a convenient way to
expression complex queries
write query as a sequential program consisting of a series of
assignments
followed by an expression whose value is displayed as a result of
the query
Assignment must always be made to a temporary relation
variable
Example: write r÷s as
temp1 ← ΠR-S(r)
temp2 ← ΠR-S((temp1×s) - ΠR-S,S(r)
result = temp1 – temp2