Transcript FROM Movie

Introduction to SQL
Select-From-Where Statements
Subqueries
Grouping and Aggregation
1
Why SQL?
SQL is a very-high-level language.
 Say “what to do” rather than “how to do it.”
 Avoid a lot of data-manipulation details needed
in procedural languages like C++ or Java.
Database management system figures out
“best” way to execute query.
 Called “query optimization.”
2
Select-From-Where Statements
SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of the tables;
For example:
SELECT title, length
FROM movies
WHERE year =1994;
3
Our Running Example
All our SQL queries will be based on the following
database schema:
Movie(title, year, length, inColor, studioName,
producerC#)
 StarsIn(movieTitle, movieYear, starName)
 MovieStar(name, address, gender, birthdate)
 MovieExec(name, address, cert#, netWorth)
 Studio(name, address, preaC#)
4
Example
Using Movie, what movies were produced by
Disney Studios in 1990?
SELECT title, length
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;
Notice SQL uses single-quotes for strings.
SQL is case-insensitive, except inside strings.
5
Result of Query
title
length
Pretty Women
…
119
6
Meaning of Single-Relation Query
Begin with the relation in the FROM clause.
Apply the selection indicated by the WHERE
clause.
Apply the extended projection indicated by
the SELECT clause.
7
Operational Semantics
To implement this algorithm think of a tuple
variable ranging over each tuple of the
relation mentioned in FROM.
Check if the “current” tuple satisfies the
WHERE clause.
If so, compute the attributes or expressions of
the SELECT clause using the components of
this tuple.
8
* In SELECT clauses
When there is one relation in the FROM clause,
* in the SELECT clause stands for “all attributes
of this relation.”
Example using Movie:
SELECT *
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;
9
Result of Query:
title
year length inColor
Pretty
1990 119
Women
…
true
studioName procucerC#
Disney
999
10
Renaming Attributes
If you want the result to have different attribute names,
use “AS <new name>” to rename an attribute.
Example based on Movie:
SELECT title As name, length As duration
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;
11
Result of Query:
name
Duration
Pretty Women
…
119
12
Expressions in SELECT Clauses
Any expression that makes sense can appear as an
element of a SELECT clause.
Example: from Movie:
SELECT title As name, length*0.016667 As lengthInHours
FROM Movie;
13
Constant Expressions
From Movie :
SELECT title, length*0.016667 AS length, ‘hrs.’ AS
inHours
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;
14
Result of Query
title
length
inHours
Pretty Women
…
1.98334
hrs.
15
Complex Conditions in WHERE
Clause
From Movie, find all the movies made after
1970 that are in black-and-white:
SELECT title
FROM Movie
WHERE year > 1970 AND NOT inColor;
16
Patterns
WHERE clauses can have conditions in which
a string is compared with a pattern, to see if it
matches.
General form:
 <Attribute> LIKE <pattern> or
 <Attribute> NOT LIKE <pattern>
Pattern is a quoted string with % (any string)
or _ (any character)
17
Example
From Movie find all the movies have the title
as “Star something”, and we remember that
something has four letters.:
A ‘_’ just represent
half Chinese
characters
SELECT title
FROM Movie
WHERE title LIKE ‘Star _ _ _ _’;
18
Escape characters in LIKE
expressions
If the pattern we wish to use in a LIKE
expression involves the characters % or _.
We can follow the pattern by the keyword
ESCAPE and choose a escape character.
For example:
a LIKE ‘x%%x%’ ESCAPE ‘x’
19
Comparing dates and times
A date is represented by the keyword DATE
followed by a quoted string of a special form. For
example, DATE ‘1948-05-14’ follows the required
form.
A time is represented similarly by the keyword
TIME and a quoted string. For instance, TIME ‘
15:00:02.5’
We can compare dates or times using the same
comparison operators we use for numbers or
strings.
20
Ordering the output
To get output in sorted order, we add to the
select-from-where statement a clause:
ORDER BY <list of attributes>
The order is by default ascending (ASC),
but we can get the output highest-first by
appending the keyword DESC.
21
Example
To get the movies listed by length, shortest first, and
among movies of equal length, alphabetically, we can
say:
SELECT *
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990
ORDER BY length, title;
22
练习
图书(书号,书名,作者,出版社,单价)







查询“数据库”一书的书号和单价
查询所有图书的名称和单价,并按单价从大到小排序
查询单价在20至50元之间的图书信息
查询北京某出版社出版的图书信息
查询作者是张一,王二,刘三的书的信息
查询所有图书的书号,书名和半价信息
查询缺少出版社信息的图书的书号和书名
23
Multirelation Queries
Interesting queries often combine data from
more than one relation.
We can address several relations in one
query by listing them all in the FROM
clause.
Distinguish attributes of the same name by
“<relation>.<attribute>”
24
Example
Using relations Movie and MovieExec, find the
name of the producer of Star War.
SELECT name
FROM Movie, MovieExec
WHERE title = ‘Star War’ AND producerC# = cert#;
25
Formal Semantics
 Almost the same as for single-relation
queries:
1. Start with the product of all the relations in
the FROM clause.
2. Apply the selection condition from the
WHERE clause.
3. Project onto the list of attributes and
expressions in the SELECT clause.
26
Operational Semantics
Imagine one tuple-variable for each relation
in the FROM clause.
 These tuple-variables visit each combination of
tuples, one from each relation.
If the tuple-variables are pointing to tuples
that satisfy the WHERE clause, send these
tuples to the SELECT clause.
27
Explicit Tuple-Variables
Sometimes, a query needs to use two copies
of the same relation.
Distinguish copies by following the relation
name by the name of a tuple-variable, in the
FROM clause.
It’s always an option to rename relations
this way, even when not essential.
28
Example
From MovieStar, find all pairs of stars who
share an address.
SELECT Star1.name, Star2.name
FROM MovieStar AS Star1, MovieStar AS Star2
WHERE Star1.address = Star2.address AND
Star1.name <> Star2.name;
29
练习
读者(读者编号,姓名,电话)
图书(书号,书名,作者,出版社,单价)
借阅(书号,读者编号,借阅日期)
 查询借阅过书号为‘J0004’图书的读者姓名
 查询王明所借阅的所有图书的书名和借阅日期
30
Subqueries
A parenthesized SELECT-FROM-WHERE
statement (subquery ) can be used as a value
in a number of places, including FROM and
WHERE clauses.
Example: in place of a relation in the
FROM clause, we can place another query,
and then query its result.
 Better use a tuple-variable to name tuples of the
result.
31
Subqueries That Return One Tuple
If a subquery is guaranteed to produce one
tuple, then the subquery can be used as a
value.
 Usually, the tuple has one component.
 A run-time error occurs if there is no tuple or
more than one tuple.
32
Example
 From Movie, MovieExec, find the name of
the producer of Star War.
 Two queries would surely work:
1. Find the certificate number for the producer of
Star War.
2. Find the name of the person with this certificate.
33
Query + Subquery Solution
SELECT name
FROM MovieExec
WHERE cert# =
(SELECT producerC#
FROM Movie
WHERE title = ‘Star Wars’);
34
The IN Operator
<tuple> IN <relation> is true if and only if
the tuple is a member of the relation.
 <tuple> NOT IN <relation> means the
opposite.
IN-expressions can appear in WHERE
clauses.
The <relation> is often a subquery.
35
Example
 From Movie, MovieExec and StarsIn, find all the producers of
movies in which Harrison Ford stars.
SELECT name
FROM MovieExec
WHERE cert# IN
(SELECT producerC#
FROM Movie
WHERE (title, year) IN
(SELECT movieTitle, movieYear
FROM StarsIn
WHERE starName = ‘Harrison Ford’));
The nested query can be written as a
single select-from-where expression?
36
The Exists Operator
EXISTS( <relation> ) is true if and only if
the <relation> is not empty.
Example: From Beers(name, manf) , find
those beers that are the unique beer by their
manufacturer.
37
Example Query with EXISTS
SELECT name
FROM Beers b1
WHERE NOT EXISTS(
SELECT *
FROM Beers
WHERE manf = b1.manf AND name <> b1.name);
38
The Operator ANY
x = ANY( <relation> ) is a boolean condition
true if x equals at least one tuple in the relation.
Similarly, = can be replaced by any of the
comparison operators.
Example: x >= ANY( <relation> ) means x is not
the smallest tuple in the relation.
 Note tuples must have one component only.
39
The Operator ALL
Similarly, x <> ALL( <relation> ) is true if
and only if for every tuple t in the relation,
x is not equal to t.
 That is, x is not a member of the relation.
The <> can be replaced by any comparison
operator.
Example: x >= ALL( <relation> ) means
there is no tuple larger than x in the
relation.
40
Example
From Movie, find the titles that have been used for
two or more movies.
SELECT title
FROM Movie As old
WHERE year < ANY(
SELECT year
FROM Movie
WHERE title = old.title);
41
Union, Intersection, and Difference
Union, intersection, and difference of
relations are expressed by the following
forms, each involving subqueries:
 ( subquery ) UNION ( subquery )
 ( subquery ) INTERSECT ( subquery )
 ( subquery ) EXCEPT ( subquery )
42
Example
 Using MovieStar and MovieExec, suppose we
wanted the names and addresses of all female
movie stars who are also movie executives
with a net worth over $10,000,000.
(SELECT name, address
FROM MovieStar
WHERE gender = ‘F’)
INTERSECT
(SELECT name, address
FROM MovieExec
WHERE netWorth > 10000000);
43
Controlling Duplicate Elimination
Force the result to be a set by SELECT
DISTINCT . . .
44
Example: DISTINCT
 From Movie, MovieExec, StarIn, find all the producers of
movies in which Harrison Ford stars :
SELECT DISTINCT name
FROM Movie, MovieExec, StarsIn
WHERE cert# = producerC# AND
tile = movieTitle AND
year = moiveYear AND
starName = ‘Harrison Ford’;
 Notice that without DISTINCT, each name would be listed
many times.
45
Aggregations
SUM, AVG, COUNT, MIN, and MAX can
be applied to a column in a SELECT clause
to produce that aggregation on the column.
Also, COUNT(*) counts the number of
tuples.
46
Example: Aggregation
From MovieExec, find the average net
worth of all movie executives:
SELECT AVG(netWorth)
FROM MovieExec;
47
Eliminating Duplicates in an
Aggregation
Use DISTINCT inside an aggregation.
Example: find the number of different name
in MovieExec:
SELECT COUNT(DISTINCT name)
FROM MovieExec;
48
NULL’s Ignored in Aggregation
NULL never contributes to a sum, average,
or count, and can never be the minimum or
maximum of a column.
But if there are no non-NULL values in a
column, then the result of the aggregation is
NULL.
49
Example: Effect of NULL’s
SELECT count(*)
FROM MovieExec;
The number of tuples
in MovieExec.
SELECT count(name)
FROM MovieExec;
The number of tuples
that name is not NULL
in MovieExec.
50
Grouping
We may follow a SELECT-FROMWHERE expression by GROUP BY and a
list of attributes.
The relation that results from the SELECTFROM-WHERE is grouped according to
the values of all those attributes, and any
aggregation is applied only within each
group.
51
Example: Grouping
From Movie, the sum of the lengths of all
movies for each studio is expressed by:
SELECT studioName, SUM(length)
FROM Movie
GROUP BY studioName;
52
Example: Grouping
From Movie and MovieExec, find each
producer’s total length of film produced.
SELECT name, SUM(length)
FROM MovieExec, Movie
WHERE producerC# = cert#
GROUP BY name;
53
Restriction on SELECT Lists
With Aggregation
 If any aggregation is used, then each
element of the SELECT list must be
either:
1. Aggregated, or
2. An attribute on the GROUP BY list.
54
Illegal Query Example
You might think you could find the movie
that is the longest in length:
SELECT title, MAX(length)
FROM Movie
But this query is illegal in SQL.
55
HAVING Clauses
HAVING <condition> may follow a
GROUP BY clause.
If so, the condition applies to each group,
and groups not satisfying the condition are
eliminated.
56
Example: HAVING
From Movie and MovieExec, find the total film
length for only those producers who made at least
one film prior to 1930.
SELECT name, SUM(length)
FROM MovieExec, Movie
WHERE producerC# = cert#
GROUP BY name
HAVING MIN(year) < 1930;
57
Requirements on HAVING
Conditions
 These conditions may refer to any relation or
tuple-variable in the FROM clause.
 They may refer to attributes of those
relations, as long as the attribute makes sense
within a group; i.e., it is either:
1. A grouping attribute, or
2. Aggregated.
58
练习
读者(读者编号,姓名,电话)
图书(书号,书名,作者,出版社,单价)
借阅(书号,读者编号,借阅日期)
 查询借阅过图书的读者姓名
 查询图书总数
 查询每天的图书借阅量(图书借阅记录数)
59