SQL 1 - New Paltz

Download Report

Transcript SQL 1 - New Paltz

SQL 1
Database Principles
Connecting to db2:
• Using your userID and password connect to our server.
ssh –l userID avalon.cs.newpaltz.edu
• Connect to the Library database
$ db2 connect to Library
$ db2 “set schema = db2inst1”
OS process
db2 instance
Library
userID1
userID2
each schema has tables
created by the schema owner
manages various databases
Sample
...
each database has one schema
per user
db2inst1
Cardholder
Database Principles
Book
...
Once Connected:
• db2inst1 schema has Library tables already created.
• Viewing the tables:
select * from Cardholder
select * from Book
select * from Copy
select * from Reserves
select * from Borrows
select * from Status
Database Principles
* means “all columns”
Two Kinds of Interaction:
• Command-line:
$ db2 “select * from Cardholder”
• Execute a file:
– place your query in a file
# query1.sql
end all commands with a ;
select * from Cardholder;
– execute the file
$ db2 –tf query1.sql
• Latter is better:
– easier to edit
– don’t have to type same command twice.
Database Principles
Split Personality
• SQL is made up of two parts
– DDL (Data Definition Language)
– DML (Data Manipulation Language)
• DDL:
– Mostly “create” statements: create table, create index,
create view, drop table, …
• DML: Made up of 4 commands: insert, delete, update
and select
Database Principles
DDL vs DML:
• The nature of DDL commands is that
– they are used infrequently,
– have syntax that does not vary and
– can always be looked up in a book.
• The DML commands tend to
– be unique to the situation,
– contain very complicated logic and
– require great effort to become proficient.
Database Principles
DDL:
• I have nothing more to say.
• If you want to be good at it, read a book
Database Principles
DML:
• Four commands – insert, delete, update and select –
that do the obvious.
•
•
•
•
Insert: Add new rows to a table
Delete: Remove rows from a table
Update: Modify columns in certain table rows
Select: Retrieve information from database tables.
• All four commands share the property that the
programming problem they present is the logic of
expressing what rows to add, drop, modify or retrieve.
Database Principles
Basic Syntax (db2):
• Insert Statement:
Insert into <table name> (<list of columns>)
values (<matching list of values>)
or
Insert into <table name> (<list of columns>)
select_statement (see def’n later)
• Delete Statement:
Delete from <table name>
where <condition describing
rows to be deleted>
Database Principles
Basic Syntax:
• Update Statement:
update <table name>
set column name = new value,
...
where <condition describing
rows to be updated>
• Select Statement:
Select <list of output columns>
from <list of input tables>
where <condition describing
rows to be retrieved>
NOTE 1: Other versions of SQL
such as T-SQL allow the update
and delete statements to contain
a from_clause. db2 does not.
NOTE 2: When the <list of output
columns> is * this means “all columns
in all tables in the from_clause”
Database Principles
Shared Property and Our Focus:
• The shared property that these four commands share is
a where_clause condition.
• It turns out that learning to write the where_clause
condition correctly is what it is all about.
• We shall spend our time learning to write where_clause
conditions in the context of select statements.
Database Principles
Procedural vs Declarative Languages:
• Most programming languages are procedural; that is the
programmer must write down a sequence of instructions
that are to be executed when the program is run.
• Java, C++, Perl and Php are all procedural languages.
• Using a procedural language, the programmer figures
out what needs to be done and then writes out an
algorithm for how to accomplish the desired task.
• Defining what is primarily a design activity.
• Defining how is primarily a programming activity.
Database Principles
SQL is a Declarative Language
• With SQL you focus on saying what not how.
• This sounds simple.
• Easier to say “write a program that X” than to actually
write the program.
• However, expressing “what” in terms of precise logic
takes some training and that is what we will do in this
section of the course
Database Principles
SQL SELECT:
• Many of the operations of Relational Algebra – select,
project, join, set difference and quotient – are all
performed in SQL with the SELECT command.
• Instead of putting together a sequence of commands in
Relational Algebra we ask one query in SQL.
• This one query can contain all the logic of several
Relational Algebra queries.
Database Principles
Hard to be Complete and Concise:
• In the book, “The River Runs Through”, the author,
Norman MacLean, as a young boy writes a page
describing something.
• His father says, “That’s fine, now reduce all you have
said to one paragraph and don’t leave anything out.”
• When the boy returns with the result his father says,
“That’s fine, now reduce all you have said to one
sentence and don’t leave anything out.”
• Being complete and concise at the same time is hard!
Database Principles
SELECT Syntax:
joins and
conditions
final project
Select <select_list>
from <table_list>
where <condition>
input tables
• select_list: A list of columns/expressions that make up
the output of the query.
• table_list: A list of tables that make up the input of the
query.
• condition: A boolean expression (sometimes very
complex) that must be true about rows in the various
tables in the table_list in order to produce an output
looking like the select_list.
Database Principles
Select Example 1:
Select b_name, b_addr from Cardholder
• This query gives a complete list of all cardholder names
and addresses.
• Things to note:
– The select_list acts like the final project in a Relational
Algebra query
– The from_clause only contains one table
– There is no where_clause condition so the
where_clause is TRUE for each row of Cardholder.
Database Principles
How to Imagine the Query being Executed
• Imagine there is a row-variable, ch, that visits each row
of the table.
• Each time it stops at a row the where_clause is
evaluated.
• Each time the where_clause is TRUE, values are
extracted from the row and sent to output.
ch
Database Principles
Select Example 2:
Acts like Relational Algebra PROJECT
Select b_name, b_addr from Cardholder
where b_status = ‘junior’
Acts like Relational Algebra SELECT
where_clause
false
false
true
false
true
false
false
ch
ch
ch
ch
ch
ch
ch
Answer
b_name
b_addr
jo-ann
john
New Paltz
Kingston
Database Principles
NOTE: The condition
b_status = ‘junior’
is called a SARG
(Search ARGument)
Select Example 3:
• Find the cardholders (b_name, b_addr) with
ch and b are called
Reservations on December 1.
table aliases
Select b_name, b_addr from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid and
l_date = ‘Dec 1’
called a join-term
b
ch
ch
b
b_name
b_addr
john
susan
New Paltz
Wallkill
Database Principles
Imagining a Multi-table Query
Select b_name, b_addr from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid and
l_date = ‘Dec 1’
ch
b
for each row in Cardholder (ch)
for each row in Borrows (b)
if ( ch.borrowerid = b.borrowerid and
b.l_date = ‘Dec 1’) then
copy columns in select_list to output
Database Principles
Alternative Syntax:
project
join
Select b_name, b_addr from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid and
l_date = ‘Dec 1’
select
Can also be written
NOTE: In terms of Relational
Algebra both these queries
combine select, project and
join
Select b_name, b_addr
from cardholder ch inner join borrows b
on ch.borrowerid = b.borrowerid
where l_date = ‘Dec 1’
By this logic the join terms are removed from the
where_clause and placed inside the from_clause.
Database Principles
More Complicated Join Query:
• Find the cardholders (b_name, b_addr) who have
reserved a book that was borrowed on Dec 1. Include in
your answer the book (author, title) involved.
Select b_name, b_addr, author, title
from cardholder ch, reserves r, book k, copy c, borrows b
where ch.borrowerid = r.borrowerid AND
r.ISBN = k.ISBN AND
k.ISBN = c.ISBN AND
c.accession_no = b.accession_no AND
b.l_date = ‘Dec 1’

unique answers
Select distinct b_name, b_addr, author, title
from cardholder ch, reserves r, book k, copy c, borrows b
where ch.borrowerid = r.borrowerid AND
r.ISBN = k.ISBN AND
k.ISBN = c.ISBN AND
c.accession_no = b.accession_no AND
b.l_date = ‘Dec 1’
Database Principles
# cardholder reserved something
# that is a book
# which has a copy
# that was loaned
# on Dec 1
Subqueries:
• It is possible to nest one query inside another. This is
called a subquery.
• Subqueries act somewhat like subordinate clauses in an
English sentence.
Who reserved the books that John from New Paltz borrowed?
• The query strategy can be
– First find all books borrowed by John from New Paltz
– Then find who reserved “those” books.
Database Principles
Subqueries 2:
• Since the results of subqueries are sets then we need
special set operators to manipulate them.
• The special boolean operators are IN and EXISTS
• IN (tests set membership):
element IN (select … )
tests if the thing on the left (element) belongs to the set
which is the result of the subquery.
• EXISTS (tests set existence):
EXISTS (select … )
tests if the subquery has an empty result set or not.
Database Principles
Subquery Example (cont):
Who reserved the books that John from New Paltz borrowed?
Select distinct b_name,
b_addrb_name, b_addr
Select distinct
from cardholderfrom
ch, reserves
r ch, reserves r,
cardholder
where ch.borroweridcopy
= r.borrowerid
c, borrowsAND
b, cardholder ch1
r.ISBN IN where ch.borrowerid = r.borrowerid AND
(select c.ISBN
r.ISBN = c.ISBN AND
from copy c, borrows b,
cardholder ch1
c.accession_no
= b.accession_no AND
where c.accession_nob.borrowerid
= b.accession_no
AND
= ch1.borrowerid
AND
b.borrowerid = ch1.borrowerid
AND AND
ch1.b_name = ‘John’
ch1.b_name = ‘John’
AND = ‘New Paltz’
ch1.b_addr
ch1.b_addr = ‘New Paltz’)
Exercise: Rewrite this query without using a subquery.
Database Principles
UNION Query:
• Find the cardholders (b_name, b_addr) who have
borrowed or reserved a book.
(select distinct b_name, b_addr
from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid)
UNION
(select distinct b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid)
NOTE: There is a possibility of duplicate rows in the answer for a couple
of reasons
- a cardholder may reserve or borrow more than one book
- a cardholder may reserve one book and borrow another
The key word “distinct” makes sure the output of the subqueries are
distinct and the key word “UNION” makes sure that duplicate answers
across the subqueries are not displayed.
To see all duplicates drop the “distinct” key word and replace “UNION”
with “UNION ALL”
Database Principles
Query Exercise:
• Rewrite the previous query without using UNION.
• Find the cardholders (b_name, b_addr) who have
borrowed or reserved a book.
select distinct b_name, b_addr
from cardholder ch, borrows b, reserves r
where ch.borrowerid = b.borrowerid or
ch.borrowerid = r.borrowerid)
Database Principles
INTERSECT Query:
• Find the cardholders (b_name, b_addr) who have both
borrowed and reserved books
(select distinct b_name, b_addr
from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid)
INTERSECT
(select distinct b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid)
Database Principles
SET DIFFERENCE Query:
• Find the cardholders (b_name, b_addr) who have
borrowed a book but not reserved any books
(select distinct b_name, b_addr
from cardholder ch, borrows b
where ch.borrowerid = b.borrowerid)
MINUS
(select distinct b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid)
Database Principles
Hard Queries:
• Like RA, hard queries signaled by key words: only, all,
not, none and so forth.
• Like RA, hard queries resolved using techniques other
than select, project and join.
• Using the MINUS operator as in our previous example,
we can implement set differences.
• We can also implement set differences using NOT IN.
element NOT IN (select …)
Database Principles
NOT IN Queries:
• Advantage of the NOT IN syntax: it reads like English.
Find cardholders (b_name, b_addr) who have only reserved books on Dec 1
• Rewrite this same query removing the word “only” and
using the phrase “not in” instead.
Find cardholders (b_name, b_addr) who have reserved a book and
are not in the set of cardholders who have reserved books on a date
other than Dec 1
BookReservers
Reservers of Books
only on Dec 1
Reservers of Books on a
date other than Dec 1
Database Principles
English-to-SQL Translation:
• The latest version of our query translated easily into SQL
Find cardholders (b_name, b_addr) who have reserved a book and
those cardholders are not in the set of cardholders who have reserved books
on a date other than Dec 1
Select b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid AND
ch.borrowerid NOT IN
(select borrowerid from reserves
where r_date != ‘Dec 1’)
Exercise: Rewrite this query using the MINUS operator
Database Principles
Using Minus

Find cardholders (b_name, b_addr) who have only
reserved books on Dec 1
(select b_name, b_addr from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid and r_date = “Dec 1”)
MINUS
(select b_name, b_addr from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid and r_date != “Dec 1”)
Database Principles
Observation:
• Both the MINUS syntax and the NOT IN syntax allow the
programmer to program the query in steps:
– First find what you don’t want
– Throw it away
– You are left with what you want.
• Since one of these queries may run faster than the other
you will need to know how to write both.
• You get paid not only for correctness but probably more
important, speed.
Database Principles
Negation Queries:
• Like RA, some negation queries are “easy” in SQL and
some are “hard”.
• The former use SARGS and join_terms only.
• The latter usually need to use NOT IN.
• Like RA, the hard queries come from negating the
presence on an entity in some relationship where the
minimum participation number is 0.
Database Principles
Hard Negation Query Example 1
• Find the cardholders who have not borrowed anything.
• Rewritten as: Find the cardholders who are not in the set
of borrowers of books.
Select b_name, b_addr
from cardholder
where borrowerid NOT IN
(select borrowerid from borrows)
Database Principles
Hard Negation Query Example 2
• Find the cardholders who have not borrowed any book
they have reserved.
• Rewritten as: Find the cardholders who have reserved
something and are not in the set of borrowers of books
they have also reserved.
Select b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid AND
ch.borrowerid NOT IN
(select b.borrowerid
from borrows b, copy c, reserves r
where b.borrowerid = r.borrowerid AND
b.accession_no = c.accession_no AND
c.ISBN = r.ISBN)
Database Principles
Hard Negation Query Example 3
• This query is called a co-related subquery.
• Find the cardholders who have not borrowed the same
book they have reserved.
• Rewritten as: Find the cardholders who have reserved at
least one book and that book not in the set of books they
have borrowed.
Select b_name, b_addr
from cardholder ch, reserves r
where ch.borrowerid = r.borrowerid AND
r.ISBN NOT IN
(select c.ISBN
from borrows b, copy c
where b.borrowerid = ch.borrowerid AND
b.accession_no = c.accession_no)
Database Principles
Co-related Subquery:
• What makes the subquery “co-related” is the presence of
a table alias inside the subquery that is defined outside
the subquery.
• This is very similar to using a global variable inside
function definition.
• It is easy to pick up because in the English version of the
query you will have a definite article (the, those, that) or
definite pronoun (they, she, he, it) inside the subquery
phrase that refers to something outside the subquery
phrase.
Database Principles
EXISTS:
• EXISTS can be used with a subquery to decide if it is
empty or not.
Find the cardholders who have borrowed a book
Find the cardholders where there exists a book loan
made by them
select b_name, b_addr
from cardholder ch
where exists (select * from borrows b
where b.borrowerid = ch.borrowerid)
• The logic is that if the subquery is empty this EXISTS is
false so nothing is returned.
Database Principles
Exercise:
• Write the previous query without using EXISTS.
select distinct b_name, b_addr
from cardholder ch, borrows b
where b.borrowerid = ch.borrowerid)
Database Principles
NOT EXISTS:
Find the cardholders who have reserved all AW books
• In RA we used the QUOTIENT operator for this query.
• In SQL we use NOT EXISTS
• Rewrite the query in English with NOT EXISTS
Find the cardholders where there does NOT EXIST a book
published by Addison-Wesley that they did not reserve.
Find the cardholders where there does NOT EXIST
a book published by Addison-Wesley and
that book is NOT IN the set of books they reserved.
Select b_name, b_addr from cardholder ch
where NOT EXISTS
(select * from book k
where k.pub_name = ‘AW’ AND
k.ISBN NOT IN (select ISBN from reserves r
where r.borrowerid = ch.borrowerid))
Database Principles