Transcript Document

SQL
• SQL is a very-high-level language, in which
the programmer is able to avoid specifying
a lot of data-manipulation details that would
be necessary in languages like C++.
• What makes SQL viable is that its queries
are “optimized” quite well, yielding
efficient query executions.
Select-From-Where Statements
• The principal form of a query is:
SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of the tables
Our Running Example
• Our SQL queries will be based on the following database
schema.
Movie(title, year, length, inColor, studioName, producerC)
StarsIn(movieTitle, movieYear, starName)
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, cert#, netWorth)
Studio(name, address, cert#, netWorth)
Find all movies produced by Disney Studios in 1990.
SELECT *
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
Meaning of Single-Relation Query
• Begin with the relation in the FROM
clause.
• Apply the selection indicated by the
WHERE clause.
• Apply the extended projection indicated by
the SELECT clause.
(Extended) Projection in SQL
SELECT title, length
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
SELECT title AS name, length AS duration
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
SELECT title AS name, length*0.016667 AS lenghtInHours
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
SELECT title AS name, length/60 AS length, 'hrs.' AS inHours
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
Selection in SQL
• The selection of the relational algebra is available through the
WHERE clause of SQL.
• We may build expressions by using the operators:
=
<>
<
>
<=
>=
• The string constants are surrounded by single quotes.
– studioName = 'Disney'
• Numeric constants are for e.g.: -12.34, 1.23E45
• Boolean operators are: AND, OR, NOT.
SELECT title
FROM Movie
WHERE (year > 1970) AND NOT (inColor='C');
Selection in SQL (Cont.)
• Which Disney movies are after 1970 or have length greater than
90 mins?
SELECT title
FROM Movie
WHERE (year > 1970 OR length < 90) AND studioName='Disney';
• The parenthesis are needed because the precedence of OR is less
than that of AND.
Comparision of strings
• Strings can as well be compared (lexicographically) with the
same operators:
=
<>
<
>
<=
>=
• For instance ‘fodder’<‘foo’ ‘bar’ < ‘bargain’
Patterns
•
WHERE clauses can have conditions in which a string is compared with a pattern, to
see if it matches.
• General form:
<Attribute> LIKE <pattern>
<Attribute> NOT LIKE <pattern>
• Pattern is a quoted string with
% = “any string”
_ = “any character.”
• Examples. Suppose we remember a movie “Star something”.
SELECT title
FROM Movie
WHERE title LIKE 'Star %';
SELECT title
FROM Movie
WHERE title LIKE '%''s%';
Two consecutive apostrophes in a string represent itself and not the end of the string.
Comparison of Strings (Continued)
• What if the pattern we wish to use in a LIKE expression
involves the characters % or _?
• We should “escape” their special meaning proceeding them by
some escape character.
• In UNIX and C we use backslash \ as the escape character.
• SQL allows us to use any character we like.
• s LIKE 'x%%x%%' ESCAPE 'x'
– x will be the escape character.
– A string that is matched by this pattern is for example: %aaaa%
SELECT title
FROM Movie
WHERE title LIKE 'x%%x%%' ESCAPE 'x';
Ordering the Input
• We may ask the tuples produced by a query to be presented in sorted
order.
• ORDER BY <list of attributes>
• Example. Find the Disney movies of 1990.
– Movie(title, year, length, inColor, studioName, producerC#)
– To get the movies listed by length, shortest first, and among movies of equal
length, sort alphabetically:
SELECT *
FROM Movie
WHERE studioName = 'Disney'
ORDER BY length, title;
• Ordering is ascending, unless you specify the DESC keyword to an
attribute.
• Ties are broken by the second attribute on the ORDER BY list, etc.
NULL Values
• Tuples in SQL relations can have NULL as a
value for one or more components.
• Meaning depends on context. Two common
cases:
– Missing value : e.g., we know the length has
some value, but we don’t know what it is.
– Inapplicable : e.g., the value of attribute spouse
for an unmarried person.
Comparing NULL’s to Values
• The logic of conditions in SQL is really 3valued logic: TRUE, FALSE, UNKNOWN.
• When any value is compared with NULL, the
truth value is UNKNOWN.
• But a query only produces a tuple in the
answer if its truth value for the WHERE
clause is TRUE (not FALSE or UNKNOWN).
Three-Valued Logic
• To understand how AND, OR, and NOT work in 3-valued
logic, think of
– TRUE = 1, FALSE = 0, and UNKNOWN = ½.
• AND = MIN
• OR = MAX
• NOT(x) = 1-x
• Example:
TRUE AND (FALSE OR NOT(UNKNOWN)) =
MIN(1, MAX(0, (1 - ½ ))) =
MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.
Surprising Example
SELECT *
FROM Movie
WHERE length <=120 OR length > 120;
• Suppose that we have some NULL values in the length.
• What’s the result?
Products and Joins in SQL
• SQL has a simple way to couple relations in one query: list each relation
in the FROM clause.
– All the relations in the FROM clause are coupled through Cartesian product
– Then we can put conditions in the WHERE clause in order to get the
various kinds of join.
• Example. We want to know the name of the producer of Star Wars.
• To answer we need the information from both of the relations:
– Movie(title, year, length, inColor, studioName, producerC)
– MovieExec(name, address, cert, netWorth)
SELECT name
FROM Movie, MovieExec
WHERE title = 'Star Wars' AND producerC = cert;
Disambiguating Attributes
• When we involve two or more relations in a query, we can have
attributes with the same name among these relations.
• We solve the problem of disambiguating between them by putting the
name of the relation followed by a dot and then the name of the
attribute.
• Example. Suppose we wish to find pairs (star, movie executive) living in
the same address.
– MovieStar(name, address, gender, birthdate)
– MovieExec(name, address, cert, netWorth)
SELECT MovieStar.name, MovieExec.name
FROM MovieStar, MovieExec
WHERE MovieStar.address = MovieExec.address;
Tuple Variables
• Sometimes we need to ask a query that involves two or more tuples
from the same relation. We may list a relation R as many times we want
in the from clause but we need a way to refer to each occurrence of R.
• SQL allows us to define, for each occurrence in the FROM clause, an
alias which we call “tuple variable.”
• Example. We like to know about two stars who share an address.
SELECT Star1.name, Star2.name
FROM MovieStar AS Star1, MovieStar AS Star2
WHERE
Star1.address = Star2.address
AND Star1.name < Star2.name;
AS is not supported in Oracle.
In Oracle
SELECT Star1.name, Star2.name
FROM MovieStar Star1, MovieStar Star2
WHERE
Star1.address = Star2.address
AND Star1.name <> Star2.name;
Tuple Variables (Continued)
• Why we have the condition
– Star1.name < Star2.name ?
• Without this condition we would produce also pairs of identical star
names.
• This conditions forces us to produce each pair of stars with a common
address only.
• Why we used the operator < and not <>?
• If we had used <> the we would have produced pairs of married stars
twice, like:
Star1.name
Star2.name
Alec Baldwin
Kim Basinger
Kim Basinger
Alec Baldwin
Conversion to Relational Algebra
• Another approach to interpret SQL queries is to relate them
to relational algebra.
• Start with the relations in the FROM clause and take their
Cartesian Product.
• Having created the product, we apply a selection operator to
it by converting the WHERE clause to a selection
condition.
• Finally from the list of attributes in the SELECT clause we
do a projection.
An Unintuitive Consequence of SQL semantics
• Suppose R, S, T are unary relations each having attribute A alone.
• We wish to compute R(ST).
• We might expect the following SQL query to do the job.
SELECT R.A
FROM R, S, T
WHERE R.A = S.A OR R.A = T.A
• However, consider the situation in which T is empty. Since R.A = T.A can
never be satisfied, we might expect the query to produce exactly RT.
• But using the interpretation the result is empty.
– If we use the conversion to RA, the Cartesian product R x S x T is .
Union, Intersection, and Difference of Queries
• If two SQL queries produce relations with the same set of attributes then we can
combine the queries using the set operations: UNION, INTERSECT and
EXCEPT.
• Example. Suppose we want the names and addresses of all female movie stars
who are also movie executives with a net worth over $1,000,000.
– MovieStar(name, address, gender, birthdate)
– MovieExec(name, address, cert, netWorth)
(SELECT name, address
FROM MovieStar
WHERE gender = 'F')
INTERSECT
(SELECT name, address
FROM MovieExec
WHERE netWorth > 1000000);
Union, Intersection, and Difference of
Queries (Continued)
• Example. Give the names and addresses of movie stars who are not also
movie executives.
(SELECT name, address
FROM MovieStar)
EXCEPT
(SELECT name, address
FROM MovieExec);
• In ORACLE the EXCEPT is MINUS.
• Example. We want all the titles and years of movies that appeared in
either the Movie or StarsIn relation.
(SELECT title, year FROM Movie)
UNION
(SELECT title, year FROM StarsIn);