Transcript CH06

Chapter 6
The Database Language SQL
Prof. Yin-Fu Huang
CSIE, NYUST
6.1 Simple Queries in SQL
 The portion of SQL that supports queries has capabilities very
close to that of relational algebra.
 ANSI SQL, SQL2, SQL-99, SQL:2003
 SQL-99 extended with object-relational features and a number
of other new capabilities
 SQL:2003 a collection of extensions to SQL-99
 SELECT, FROM, WHERE
 Example
(See Fig. 6.1)
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
“All movies produced by Disney Studios in 1990”
SELECT
*
FROM
Movies
WHERE
studioName=’Disney’ AND year=1990;
6.1.1 Projection in SQL
 Example
SELECT title, length
 Alias
SELECT title AS name, length AS duration
 An expression in place of an attribute
SELECT title AS name, length*0.016667 AS lengthInHours
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
 A constant as an expression
SELECT title AS name, length*0.016667 AS length,
‘hrs.’ AS inHours
6.1.2 Selection in SQL
 Example
SELECT title
FROM Movies
WHERE (year>1970 OR length<90) AND
studioName=’MGM’;
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
6.1.3 Comparison of Strings
 Lexicographic order
6.1.4 Pattern Matching in SQL
 s LIKE p where s is a string and p is a pattern
 % and _
 Example
SELECT title
FROM Movies
WHERE title LIKE ’Star ____’;
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
 Example
SELECT title
FROM Movies
WHERE title LIKE ’%’’s%’;
6.1.5 Dates and Times
 Example
DATE ’1948-05-14’
TIME ’15:00:02.5’
TIMESTAMP ’1948-05-14 12:00:00’
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
6.1.6 Null Values and Comparisons Involving NULL
 Different interpretations:
1) Value unknown
2) Value inapplicable
3) Value withheld
 Two important rules:
1) When we operate on a NULL and any value, including
another NULL, using an arithmetic operator like * or
+, the result is NULL.
2) When we compare a NULL value and any value,
including another NULL, using a comparison operator
like = or >, the result is UNKNOWN.
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
 We cannot use NULL explicitly as an operand.
 Example
NULL+3, NULL=3: not legal
 The correct way to ask if x has the value NULL is with the
expression x IS NULL.
6.1.7 The Truth-Value UNKNOWN
 The rule: TRUE as 1, FALSE as 0, UNKNOWN as 1/2
1) The AND of two truth-values is the minimum of those values.
2) The OR of two truth-values is the maximum of those values.
3) The negation of true-value v is 1-v.
(See Fig. 6.2)
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
 Example
SELECT *
FROM Movies
WHERE length<=120 OR length>120
The true meaning of the query is “find all the Movies tuples with
non-NULL lengths.”
6.1.8 Ordering the Output
 ORDER BY <list of attributes>
 DESC and ASC
Database Systems
Yin-Fu Huang
6.1 Simple Queries in SQL
 Example
SELECT *
FROM Movies
WHERE studioName=’Disney’ AND year=1990
ORDER BY length, title;
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
 Joins, products, unions, intersections, and differences
6.2.1 Products and Joins in SQL
 Example
“The name of the producer of Star Wars”
SELECT name
FROM Movies, MovieExec
WHERE title=’Star Wars’ AND producerC#=cert#;
(See Fig. 6.3)
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
6.2.2 Disambiguating Attributes
 R.A refers to the attribute A of relation R.
 Example
SELECT MovieStar.name, MovieExec.name
FROM MovieStar, MovieExec
WHERE MovieStar.address=MovieExec.address;
SELECT MovieExec.name
FROM Movies, MovieExec
WHERE Movies.title=’Star Wars’
AND Movies.producerC#=MovieExec.cert#;
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
6.2.3 Tuple Variables
 Tuple variable
 Example
SELECT Star1.name, Star2.name
FROM MovieStar Star1, MovieStar Star2
WHERE Star1.address= Star2.address
AND Star1.name<Star2.name;
6.2.4 Interpreting Multirelation Queries
 Several ways to define the meaning of the select-from-where
expressions
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
1) Nested Loops
(See Fig. 6.4)
2) Parallel Assignment
3) Conversion to Relational Algebra
Example
πA1, A5(σA2=A6 AND A1<A5(ρM(A1, A2, A3, A4)(MovieStar)×
ρN(A5, A6, A7, A8)(MovieStar)))
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
6.2.5 Union, Intersection, and Difference of Queries
 Example
“The names and addresses of all female movie stars who
are also movie executives with a net worth over
$10,000,000.”
(See Fig. 6.5)
 Example
(SELECT name, address FROM MovieStar)
EXCEPT
(SELECT name, address FROM MovieExec)
 If necessary to get a common set of attributes, we can rename
attributes.
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
Database Systems
Yin-Fu Huang
6.2 Queries Involving More than One Relation
 Example
(SELECT title, year FROM Movies)
UNION
(SELECT MovieTitle AS title, movieYear AS year
FROM StarsIn)
Database Systems
Yin-Fu Huang
6.3 Subqueries
 A query that is part of another is called a subquery.
 A number of ways that subqueries can be used:
1) Subqueries can return a single constant, and this constant
can be compared with another value in a WHERE clause.
2) Subqueries can return relations that can be used in
various way in WHERE clauses.
3) Subqueries can appear in FROM clauses, followed by a
tuple variable that represents the tuples in the result of
the subquery.
6.3.1 Subqueries that Produce Scalar Values
 We may compare the result of a subquery to a constant or
attribute.
Database Systems
Yin-Fu Huang
6.3 Subqueries
 Example
(See Fig. 6.6)
SELECT name
FROM MovieExec
WHERE cert#=12345
 If zero tuples or more than one tuple is produced by the subquery,
it is a run-time error.
6.3.2 Conditions Involving Relations
 A number of SQL operations that we can apply to a relation R
and produce a boolean result.
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
 s: a scalar value, R: a one-column relation
1) EXISTS R
2) s IN R, s NOT IN R
3) s > ALL R, s <> ALL R
4) s > ANY R, s = ANY R
6.3.3 Conditions Involving Tuples
 If a tuple t has the same number of components as a relation R,
then it makes sense to compare t and R in expressions.
 Example
“all the producers of movies in which Harrison Ford stars”
(See Fig. 6.7 and Fig. 6.8)
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
 A single select-from-where expression: duplicates
(See Fig. 6.9)
6.3.4 Correlated Subqueries
 Correlated query: to be evaluated many times, once for each
assignment of a value to some term in the subquery that come
from a tuple variable outside the subquery.
 Example
“the titles that have been used for two or more movies”
(See Fig. 6.10)
 When writing a correlated query, it is important that we be aware
of the scoping rules for names.
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
6.3.5 Subqueries in FROM Clauses
 Example
“the producers of Harrison Ford's movies”
(See Fig. 6.11)
6.3.6 SQL Join Expressions
 Cross join (Cartesian product)
e.g., Movies CROSS JOIN StarsIn;
 Join
 Example
Movie JOIN StarsIn ON
title=movieTitle AND year=movieYear;
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
 Using the whole expression as a subquery in a FROM clause and
using a SELECT clause to remove the undesired attributes.
SELECT title, year, length, genre, studioName, producerC#,
starName
FROM Movies JOIN StarsIn ON
title=movieTitle AND year=movieYear;
6.3.7 Natural Joins
 Differing from a theta-join
1) The join condition is that all pairs of attributes from the
two relations having a common name are equated.
2) One of each pair of equated attributes is projected out.
Database Systems
Yin-Fu Huang
6.3 Subqueries
 Example
MovieStar NATURAL JOIN MovieExec;
6.3.8 Outerjoins
 An outerjoin operator is a way to augment the result of a join by
the dangling tuples, padded with null values.
 Example
MovieStar NATURAL FULL OUTER JOIN MovieExec;
(See Fig. 6.12)
 NATURAL LEFT (RIGHT) OUTER JOIN
Database Systems
Yin-Fu Huang
6.3 Subqueries
Database Systems
Yin-Fu Huang
6.3 Subqueries
 Example
Movies FULL OUTER JOIN StarsIn ON
title=movieTitle AND year=movieYear;
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
 Relations that are bags
6.4.1 Eliminating Duplicates
 DISTINCT
 Example
(See Fig. 6.9)
SELECT DISTINCT name
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
6.4.2 Duplicates in Unions, Intersections, and Differences
 Example
(SELECT title, year FROM Movies)
UNION ALL
(SELECT MovieTitle AS title, movieYear AS year
FROM StarsIn)
 R INTERSECT ALL S
 R EXCEPT ALL S
6.4.3 Grouping and Aggregation in SQL
 SQL provides all the capability of the γ operator through the use
of aggregation operators in SELECT clauses and a special
GROUP BY clause.
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
6.4.4 Aggregation Operators
 SUM, AVG, MIN, MAX, COUNT
 Example
SELECT AVG(netWorth)
FROM MovieExec;
SELECT COUNT(*)
FROM StarsIn;
SELECT COUNT(DISTINCT starName)
FROM StarsIn;
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
6.4.5 Grouping
 GROUP BY
 Example
SELECT studioName, SUM(length)
FROM Movies
GROUP BY studioName;
 It is also possible to use a GROUP BY clause in a query about
several relations.
 Example
(See Fig. 6.13)
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
6.4.6 Grouping, Aggregation, and Nulls
 A few rules about Nulls:
1) The value NULL is ignored in any aggregation.
2) NULL is treated as an ordinary value when forming groups.
3) When we perform any aggregation except count over an
empty bag of values, the result is NULL. The count of an
empty bag is 0.
 Example
A
B SELECT A, COUNT(B)
SELECT A, SUM(B)
NULL NULL FROM R
FROM R
GROUP BY A;
GROUP BY A;
(NULL, 0)
(NULL, NULL)
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
6.4.7 HAVING Clauses
 We could restrict the tuples prior to grouping in a way that would
make undesired groups empty.
e.g., WHERE producerC#=cert# AND netWorth>=10000000
 However, sometimes we want to choose our groups based on
some aggregate property of the group itself.
 Example
(See Fig. 6.14)
 Several rules about HAVING clauses:
1) An aggregation in a HAVING clause applied only to the
tuples of the group being tested.
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
Database Systems
Yin-Fu Huang
6.4 Full-Relation Operations
2) Any attribute of relations in the FROM clause may be
aggregated in the HAVING clause, but only those
attributes that are in the GROUP BY list may appear
unaggregated in the HAVING clause.
Database Systems
Yin-Fu Huang
6.5 Database Modifications
 Insert tuples into a relation
 Delete certain tuples from a relation
 Update values of certain components of certain existing tuples
6.5.1 Insertion
 INSERT INTO R(A1, …, An) VALUES (v1, …, vn);
 The tuple created has default values for all missing attributes.
 Example
INSERT INTO StarsIn(movieTitle, movieYear, starName)
VALUES (‘The Maltese Falcon’, 1942, ‘Sydney Greenstreet’);
Database Systems
Yin-Fu Huang
6.5 Database Modifications
INSERT INTO StarsIn
VALUES (‘The Maltese Falcon’, 1942, ‘Sydney Greenstreet’);
The order of the values must be the same as the standard
order of attributes for the relation.
 Subquery
 Example
(See Fig. 6.15)
6.5.2 Deletion
 DELETE FROM R WHERE <condition>;
Database Systems
Yin-Fu Huang
6.5 Database Modifications
Database Systems
Yin-Fu Huang
6.5 Database Modifications
 Example
DELETE FROM StarsIn
WHERE movieTitle=’The Maltese Falcon’ AND
movieYear=1942 AND
StarName=’Sydney Greenstreet’;
DELETE FROM MovieExec
WHERE netWorth<10000000;
6.5.3 Updates
 UPDATE R SET <new-value assignments> WHERE <condition>;
Database Systems
Yin-Fu Huang
6.5 Database Modifications
 Example
UPDATE MovieExec
SET name=’Pres. ’ || name
WHERE cert# IN (SELECT presC# FROM Studio);
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
 Considering what can happen to leave the database in a state that
doesn’t reflect the operations performed on it.
6.6.1 Serializability
 What could go wrong if the DBMS were completely
unconstrained as to the order in which it operated upon the
database.
 Example
(See Fig. 6.16)
 The problem is solved in SQL by the notion of a “transaction.”
 Serializability: These transactions must behave as if they were
run serially – one at a time, with no overlap.
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
6.6.2 Atomicity
 It is possible for a single operation to put the database in an
unacceptable state if there is a hardware or software “crash”
while the operation is executing.
 Example
1. Add $100 to account 456
2. Subtract $100 from account 123
 Atomicity: Certain combinations of database operations need to
be done atomically; that is, either they are both done or neither is
done.
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
6.6.3 Transactions
 When using the generic SQL interface, each statement is a
transaction by itself.
 However, SQL allows the programmer to group several
statements into a single transaction.
 START TRANSACTION, COMMIT, ROLLBACK
6.6.4 Read-Only Transactions
 When a transaction only reads data and does not write data, we
have more freedom to let the transaction execute in parallel with
other transactions.
 However, read-only transactions would not be allowed to run in
parallel with a transaction that wrote the same data.
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
 SET TRANSACTION READ ONLY;
SET TRANSACTION READ WRITE; (default)
6.6.5 Dirty Reads
 Dirty data is a common term for data written by a transaction
that has not yet committed.
 The risk in reading dirty data is that the transaction that wrote it
may eventually abort.
 Sometimes the dirty read matters, and sometime it doesn’t.
 It may make sense to risk an occasional dirty read and thus avoid
1) the time-consuming work by the DBMS and 2) the loss of
parallelism.
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
 Example
1. Add money to account 2
2. Test if account 1 has enough money
a) No, remove the money from account 2
b) Yes, subtract the money from account 1
A1: $100
A2:$200
A3:$300
1. T2 executes step 1 and adds $250 to A3
2. T1 executes step 1 and adds $150 to A2
3. T2 executes the test of step 2 and finds A2 has enough funds
4. T1 executes the test of step 2 and finds A1 has not enough funds
5. T2 executes step 2b and subtracts $250 from A2
6. T1 executes step 2a and subtracts $150 from A2 (-$50)
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
 Example
1. Find an available seat and reserve it by setting ‘occupied’
2. Ask the custom for approval of the seat
a) Yes, commit
b) No, release the seat by setting ‘available’ and repeat
step 1
 It might well make sense to implement this seat-choosing function
in a way that allowed dirty reads, in order to speed up the average
processing time for booking requests.
SET TRANSACTION READ WRITE
ISOLATION LEVEL READ UNCOMMITTED;
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
6.6.6 Other Isolation Levels
 SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
(default)
 For READ COMMITTED, it allows a transaction to issue the same
query several times and get different answers.
 For REPEATABLE READ, it is possible that a second or subsequent
execution of the same query will retrieve phantom tuples that result
from insertions into the database while our transaction is executing.
 Example
the seat-choosing program
(See Fig. 6.17)
Database Systems
Yin-Fu Huang
6.6 Transactions in SQL
Database Systems
Yin-Fu Huang
The End.
Database Systems
Yin-Fu Huang