DBI 2008 - CS

Download Report

Transcript DBI 2008 - CS

JDBC (@PostgreSQL)
Java Database Connectivity
• JDBC (Java Database Connectiveity) is an API
(Application Programming Interface),
– That is, a collection of classes and interfaces
• JDBC is used for accessing (mainly) databases
from Java applications
• Information is transferred from relations to objects
and vice-versa
– databases optimized for searching/indexing
– objects optimized for engineering/flexibility
DBI 2008
HUJI-CS
2
JDBC Architecture
Network
These are
Java classes
Oracle
Driver
Oracle
Java
Application
DB2
Driver
JDBC
DB2
Postgres
Driver
We will
use this one…
Postgres
DBI 2008
HUJI-CS
3
JDBC Architecture (cont.)
Application
JDBC
Driver
• Java code calls JDBC library
• JDBC loads a driver
• The driver talks to a particular database
• An application can work with several databases
by using all corresponding drivers
• Ideal: change database engines w/o changing
any application code (not always in practice)
DBI 2008
HUJI-CS
4
Seven Steps
• Load the driver
• Define the connection URL
• Establish the connection
• Create a Statement object
• Execute a query using the Statement
• Process the result
• Close the connection
DBI 2008
HUJI-CS
5
Registering the Driver
• To use a specific driver, instantiate and
register it within the driver manager:
Driver driver = new
org.postgresql.Driver();
DriverManager.registerDriver(driver);
DBI 2008
HUJI-CS
6
A Modular Alternative
• We can register the driver indirectly using
Class.forName("org.postgresql.Driver");
• Class.forName loads the given class dynamically
• When the driver is loaded, it automatically
– creates an instance of itself
– registers this instance within DriverManager
• Hence, the driver class can be given as an
argument of the application
DBI 2008
HUJI-CS
7
An Example
// A driver for imaginary1
Class.forName("ORG.img.imgSQL1.imaginary1Driver");
// A driver for imaginary2
Driver driver = new ORG.img.imgSQL2.imaginary2Driver();
DriverManager.registerDriver(driver);
//A driver for PostgreSQL
Class.forName("org.postgresql.Driver");
imaginary1 imaginary2
Postgres
Registered Drivers
DBI 2008
HUJI-CS
8
Connecting to the Database
• Every database is identified by a URL
• Given a URL, DriverManager looks for the
driver that can talk to the corresponding
database
• DriverManager tries all registered drivers,
until a suitable one is found
• How is this done?
DBI 2008
HUJI-CS
9
Connecting to the Database
Connection con =
DriverManager.getConnection("jdbc:imaginaryDB1");
acceptsURL("jdbc:imaginaryDB1")?
a
r
imaginary1 imaginary2
r
Postgres
Registered Drivers
DBI 2008
HUJI-CS
10
The URLs in HUJI-CS
In CS, a URL has the following structure:
jdbc:postgresql://dbserver/public?user=??
Your login
The machine
running PostgrSQL
You can only access your own account!
DBI 2008
HUJI-CS
11
Interaction with the Database
• We use Statement objects in order to
– Query the db
– Update the db (insert, update, create, drop, …)
• Three different interfaces are used:
Statement, PreparedStatement, CallableStatement
• All are interfaces, hence cannot be instantiated
• They are created by the Connection
DBI 2008
HUJI-CS
12
Querying with Statement
String queryStr =
"SELECT * FROM Member " +
"WHERE Lower(Name) = 'harry potter'";
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(queryStr);
executeQuery returns a ResultSet object
representing the query result (discussed later…)
DBI 2008
HUJI-CS
13
Changing DB with Statement
String deleteStr =
"DELETE FROM Member " +
"WHERE Lower(Name) = 'harry potter'";
Statement stmt = con.createStatement();
int delnum = stmt.executeUpdate(deleteStr);
• executeUpdate is for data manipulation: insert,
delete, update, create table, etc.
– Anything other than querying!
• executeUpdate returns the number of rows modified
(or 0 for DDL commands)
DBI 2008
HUJI-CS
14
About Prepared Statements
• Prepared statements are used for queries that
are executed many times
• Parsed (compiled) by the DBMS only once
• Values of some columns are set after compilation
• Instead of values, use ‘?’ and setType methods
• Hence, prepared statements can be thought of as
statements that contain placeholders to be
substituted later with actual values
DBI 2008
HUJI-CS
15
Querying with PreparedStatement
String q =
"SELECT * FROM Items " +
"WHERE Name = ? and Cost < ?";
PreparedStatement pstmt=con.prepareStatement(q);
pstmt.setString(1, "t-shirt");
pstmt.setInt(2, 1000);
ResultSet rs = pstmt.executeQuery();
DBI 2008
HUJI-CS
16
Updating with PreparedStatement
String dq =
"DELETE FROM Items " +
"WHERE Name = ? and Cost > ?";
PreparedStatement pstmt = con.prepareStatement(dq);
pstmt.setString(1, "t-shirt");
pstmt.setInt(2, 1000);
int delnum = pstmt.executeUpdate();
DBI 2008
HUJI-CS
17
Statement vs. PreparedStatement: Be Careful!
Are these the same? What do they do?
String val = "abc";
PreparedStatement pstmt =
con.prepareStatement("select * from R where A=?");
pstmt.setString(1, val);
ResultSet rs = pstmt.executeQuery();
String val = "abc";
Statement stmt = con.createStatement( );
ResultSet rs =
stmt.executeQuery("select * from R where A=" + val);
DBI 2008
HUJI-CS
18
What can be Assigned to “?”
• Will this work?
PreparedStatement pstmt =
con.prepareStatement("select * from ?");
pstmt.setString(1, myFavoriteTableString);
• No!!! “?” can only represent a column
value (to enable pre-compilation)
DBI 2008
HUJI-CS
19
PreparedStatement and Security
• Suppose Google was implemented in JDBC
without a PreparedStatement. The main DB query
might have been implemented like this:
Statement s;
s.executeQuery("SELECT URL,Title from Internet " +
"WHERE Content LIKE ‘%" + searchString + "%’");
• What would happen if a hacker searched for:
aaaaa’ UNION SELECT Company AS URL, CreditCardNum
AS Title FROM AdvertisingClients WHERE Company LIKE ‘
• This technique is known as SQL Injection and is
the main reason for using PreparedStatements
DBI 2008
HUJI-CS
20
Timeout
• Use setQueryTimeOut(int seconds) of Statement
to set a timeout for the driver to wait for a
query to be completed
• If the operation is not completed in the
given time, an SQLException is thrown
• What is it good for?
DBI 2008
HUJI-CS
21
ResultSet
• ResultSet objects provide access to the
tables generated as results of executing
Statement queries
• Only one ResultSet per Statement or
PreparedStatement can be open at a given
time!
• The table rows are retrieved in sequence
– A ResultSet maintains a cursor pointing to its
current row
– next() moves the cursor to the next row
DBI 2008
HUJI-CS
22
ResultSet Methods
• boolean next()
– Activates the next row
– First call to next() activates the first row
– Returns false if there are no more rows
– Not all of the next calls actually involve the DB
• void close()
– Disposes of the ResultSet
– Allows to re-use the Statement that created it
– Automatically called by most Statement methods
DBI 2008
HUJI-CS
23
ResultSet Methods (cont’d)
• Type getType(int columnIndex)
– Returns the given field as the given type
– Indices start at 1 and not 0!
– Add the column name as a comment if it is known!
• Type getType(String columnName)
– Same, but uses name of field
– Less efficient (but may not be your bottleneck anyway)
• Examples: getString(5), getInt(“salary”), getTime(…),
getBoolean(…), ...
• int findColumn(String columnName)
– Looks up column index given column name
DBI 2008
HUJI-CS
24
ResultSet Example
Statement stmt = con.createStatement();
ResultSet rs = stmt.
executeQuery("select name,age from Employees");
// Print the result
while(rs.next()) {
System.out.print(rs.getString(1) + ":");
System.out.println(rs.getShort("age"));
}
DBI 2008
HUJI-CS
25
Mapping Java Types to SQL Types
SQL Type
CHAR, VARCHAR, LONGVARCHAR
NUMERIC, DECIMAL
BIT
TINYINT
SMALLINT
INTEGER
BIGINT
REAL
FLOAT, DOUBLE
BINARY, VARBINARY, BYTEA
DATE
TIME
TIMESTAMP
DBI 2008
HUJI-CS
Java Type
String
java.math.BigDecimal
boolean
byte
short
int
long
float
double
byte[]
java.sql.Date
java.sql.Time
java.sql.Timestamp
26
Null Values
• In SQL, NULL means the field is empty
• Not the same as 0 or “”!
• In JDBC, you must explicitly ask if the lastread field was null
– ResultSet.wasNull(column)
• For example, getInt(column) will return 0 if
the value is either 0 or NULL!
DBI 2008
HUJI-CS
27
Null Values
When inserting null values into placeholders
of a PreparedStatement:
– Use setNull(index, Types.sqlType) for primitive types
(e.g. INTEGER, REAL);
– For object types (e.g. STRING, DATE) you may
also use setType(index, null)
DBI 2008
HUJI-CS
28
Result-Set Meta-Data
A ResultSetMetaData is an object that can be
used to get information about the properties of
the columns in a ResultSet object
An example: Write the columns of the result set
ResultSetMetaData rsmd = rs.getMetaData();
int numcols = rsmd.getColumnCount();
for (int i = 1 ; i <= numcols; i++)
System.out.print(rsmd.getColumnLabel(i)+" ");
DBI 2008
HUJI-CS
29
Database Time
• Times in SQL are notoriously non-standard
• Java defines three classes to help
• java.sql.Date
– year, month, day
• java.sql.Time
– hours, minutes, seconds
• java.sql.Timestamp
– year, month, day, hours, minutes, seconds,
nanoseconds
– Usually use this one
DBI 2008
HUJI-CS
30
Cleaning Up After Yourself
Remember: close Connections, Statements,
Prepared Statements and Result Sets
con.close();
stmt.close();
pstmt.close();
rs.close()
DBI 2008
HUJI-CS
31
Dealing With Exceptions
An SQLException is actually a list of exceptions
catch (SQLException e) {
while (e != null) {
System.out.println(e.getSQLState());
System.out.println(e.getMessage());
System.out.println(e.getErrorCode());
e = e.getNextException();
}}
DBI 2008
HUJI-CS
32
General SQL Advise
• Take the time to carefully design your database
before you start coding – this will save you time
and frustration
• The same data can be organized in very different
ways – When designing your DB and when
deciding what your primary keys should be,
always think about typical use cases and deduce
from them which queries are most common and
which queries must run fastest
• Add as many constraints as possible (NOT NULL,
UNIQUE, etc…) – this will help you debug prevent
data corruption even if you missed a bug
DBI 2008
HUJI-CS
33
Transaction Management
Transactions and JDBC
• Transaction: more than one statement that must
all succeed (or all fail) together
– e.g., updating several tables due to customer purchase
• Failure− System must reverse all previous actions
• Also can’t leave DB in inconsistent state halfway
through a transaction
• COMMIT = complete transaction
• ROLLBACK = cancel all actions
DBI 2008
HUJI-CS
35
An Example
Suppose that we want to transfer money from
bank account 13 to account 72:
PreparedStatement pstmt = con.prepareStatement("update
BankAccount set amount = amount + ? where accountId = ?");
pstmt.setInt(1,-100);
pstmt.setInt(2, 13);
pstmt.executeUpdate();
pstmt.setInt(1, 100);
pstmt.setInt(2, 72);
pstmt.executeUpdate();
DBI 2008
What happens if
this update fails?
HUJI-CS
36
Transaction Lifetime
• Transactions are not opened and closed explicitly
• A transaction starts on 1st (successful) command
– After a connection is established
– After the previous transaction ends
• A transaction ends when COMMIT or ROLLBACK
are applied
– Either explicitly or implicitly (see next 4 slides)
DBI 2008
HUJI-CS
37
Committing a Transaction
How do we commit?
• Explicitly invoking Connection.commit()
• Implicitly
– After every query execution, if AutoCommit is true
– When the user normally disconnects (i.e.,
appropriately closes the connection)
– In some DBs: After invoking a DDL command
(CREATE, DROP, RENAME, ALTER, …)
DBI 2008
HUJI-CS
38
Automatic Commitment
• A Connection object has a boolean AutoCommit
• If AutoCommit is true (default), then every statement
is automatically committed
• If AutoCommit is false, then each statement is
added to an ongoing transaction
• Change using setAutoCommit(boolean)
• If AutoCommit is false, need to explicitly commit or
rollback the transaction using Connection.commit()
and Connection.rollback()
DBI 2008
HUJI-CS
39
Rolling Back
• Rolling Back: Undoing any change to data
within the current transaction
• The ROLLBACK command explicitly rolls
back (and ends) the current transaction
• ROLLBACK is implicitly applied when the
user abnormally disconnects (i.e., without
appropriately closing the connection)
DBI 2008
HUJI-CS
40
Fixed Example
con.setAutoCommit(false);
try {
PreparedStatement pstmt =
con.prepareStatement("update BankAccount
set amount = amount + ?
where accountId = ?");
pstmt.setInt(1,-100); pstmt.setInt(2, 13); pstmt.executeUpdate();
pstmt.setInt(1, 100); pstmt.setInt(2, 72); pstmt.executeUpdate();
con.commit();
catch (SQLException e) { con.rollback(); …; }
DBI 2008
HUJI-CS
41
Transaction Isolation
• How do different transactions interact?
• Does a running transaction see
uncommitted changes?
• Does it see committed changes?
• Don’t worry about this now – we will
discuss it in future lessons (see the
Appendix if you really can’t wait)
DBI 2008
HUJI-CS
42
Managing Large Objects
In PosgreSQL-JDBC
LOBs: Large OBjects
• A database can store large pieces of data
– e.g., images or other files
• Sometimes, two distinguished types:
– CLOB: Character large object (a large
number of characters)
– BLOB: Binary large object (a lot of bytes)
• Actual data is not stored in the table, only
a pointer (Object ID) to the data storage
DBI 2008
HUJI-CS
44
LOBs in PostgreSQL
Create a new lob
Get an in-stream to Obj. 24
Get an out-stream Obj. 53
LOB
Manager
911
53
MyImages
Name
image1
image2
DBI 2008
Description
conten
t
Human face
53
Rabbit
113
HUJI-CS
24
113
LOBs
45
Storing LOBs (1)
CREATE TABLE MyImages (
name VARCHAR(20) ,
content OID);
• Given: A binary source (file, socket, etc.),
readable through an InputStream object
• Goal: Store the content of the source in the
table MyImages
DBI 2008
HUJI-CS
46
Storing LOBs (2)
1. Disable auto-commit (why???):
con.setAutoCommit(false)
2. Obtain the LOB manager:
• Downcast con into org.postgresql.PGConnection
LargeObjectManager lobm = con.getLargeObjectAPI()
3. Create a new LOB
int oid = lobm.create(LargeObjectManager.READ |
LargeObjectManager.WRITE)
DBI 2008
HUJI-CS
47
Storing LOBs (3)
4. Insert a new row, with the LOB’s oid
PreparedStatement pstmt = con.prepareStatement
("INSERT INTO MyImages VALUES (?, ?)");
pstmt.setString(1, imageName);
pstmt.setInt(2, oid);
pstmt.executeUpdate();
pstmt.close();
DBI 2008
HUJI-CS
48
Storing LOBs (4)
5. Fill the LOB with the input bytes
LargeObject obj = lobm.
open(oid, LargeObjectManager.WRITE);
int bytesRead = 0; byte[] data = new byte[4096];
while ((bytesRead = iStream.read(data)) >= 0)
obj.write(data,0,bytesRead);
6. Finalize
obj.close(); iStream.close(); con.commit();
DBI 2008
HUJI-CS
49
Retrieving LOBs (1)
1. Obtain the LOB ID from the relevant row:
PreparedStatement pstmt = con.prepareStatement
("select bytes from MyImages where name = ?");
pstmt.setString(1, filename);
ResultSet rs = pstmt.executeQuery(); rs.next();
int oid = rs.getInt(1);
rs.close(); pstmt.close();
DBI 2008
HUJI-CS
50
Retrieving LOBs (2)
2. Using oid, obtain a the corresponding LargeObject
instance from the LOB Manager
• Downcast con into org.postgresql.PGConnection
LargeObjectManager lobm = con.getLargeObjectAPI();
LargeObject obj =
lobm.open(oid,LargeObjectManager.READ);
DBI 2008
HUJI-CS
51
Retrieving LOBs (3)
3. Read the content of the large object (that acts as
an input/output stream)
while ((bytesRead = obj.read(data,0,data.length)) > 0)
oStream.write(data, 0, bytesRead);
obj.close(); oStream.close();
DBI 2008
HUJI-CS
52
Appendix: PostgreSQL
Command-Line Interface
Entering the Command-Line UI
psql -hdbserver public
Server running
PostgreSQL
Database Name
Automatically enters your own part of
the database (based on your login)
Recommendation:
> echo ”alias psql ’psql –hdbserver public’” >> ~/.aliases
DBI 2008
HUJI-CS
54
A Screenshot
You can now write
SQL queries
DBI 2008
HUJI-CS
55
Additional Useful Commands
\dt
\d table-name
\dT
\i sql-file
\dl
\lo_export oid file
\lo_import file
\q
DBI 2008
List all tables
Show table definition
List all types
Include SQL from file
List all large objects
Export a large object
Import a large object
Quit
HUJI-CS
56
Transactions in Command Line
• By default, the command-line session runs
in AUTOCOMMIT mode
– That is, commit is implicitly applied after every
command
• To use transactions, you need to explicitly
specify the beginning and end of a
transaction by the commands BEGIN and
• At the end of the transaction, you either
apply COMMIT, ROLLBACK or END
DBI 2008
HUJI-CS
57
Appendix: Byte-Arrays
Representing Byte Arrays
Use the SQL type BYTEA to hold byte arrays
CREATE TABLE MyBStrings (
name VARCHAR(20) ,
content BYTEA);
(works with PostgreSQL, may not work with other DBMSs)
DBI 2008
HUJI-CS
59
Storing Byte Arrays
Store array b[] of bytes:
1. Prepare a row-inserting statement
PreparedStatement pstmt =con.prepareStatement
("INSERT INTO MyBStrings VALUES (?, ?)");
2. Set the statement values:
pstmt.setString(1, desiredName);
pstmt.setBytes(2, b);
3. Execute and close
pstmt.executeUpdate();
pstmt.close();
DBI 2008
HUJI-CS
60
Retrieving Byte Arrays
Retrieve bytes with the name “myBytes”:
1.
Apply a selection query
PreparedStatement pstmt =con.prepareStatement
(“SELECT content from MyBStrings where name=?");
pstmt.setString(1, "myBytes");
ResultSet rs = ps.executeQuery();
2.
Obtain the byte-array from the result
rs.next();
byte[] theBytes = rs.getBytes(1);
3.
Close resources
rs.close(); pstmt.close();
DBI 2008
HUJI-CS
61
Appendix: Transaction
Isolation
Transaction Isolation
• How do different transactions interact?
• Does a running transaction see
uncommitted changes?
• Does it see committed changes?
DBI 2008
HUJI-CS
63
Transaction Isolation Levels
• The isolation level determines the capabilities
of a transaction to read/write data that is
accessed by other transactions
• In PostgreSQL, two levels of isolation:
1. READ COMMITTED (default)
2. SERIALIZABLE
• Each transaction determines its isolation level
– Connection.setTransactionIsolation(int level)
DBI 2008
HUJI-CS
64
Isolation Level in the Command-Line UI
Changing to read committed:
SET SESSION CHARACTERISTICS AS TRANSACTION
ISOLATION LEVEL READ COMMITTED;
Changing to serializable:
SET SESSION CHARACTERISTICS AS TRANSACTION
ISOLATION LEVEL SERIALIZABLE;
DBI 2008
HUJI-CS
65
READ COMMITED & SERIALIZABLE
• In principle, a query in PostgreSQL never reads
uncommitted (i.e. dirty) data
• SERIALIZABLE: During the whole transaction,
statements read only the changes that were
committed by the time the transaction begun (and
the changes made by the transaction itself)
• READ COMMITTED: A statement reads the data
that was committed by the time the statement (not
the transaction) begun
DBI 2008
HUJI-CS
66
Update: Row-Level Locking
• PostgreSQL disables one from updating a row
that is updated by an uncommitted transaction
– The second updating transaction is blocked until the
first one commits or rolls back
• If 2nd update is within READ COMMITTED, then
2nd update is done after 1st transaction commits
or rolls back
• If 2nd update is within SERIALIZABLE, then 2nd
update fails if 1st transaction commits; otherwise,
it succeeds
DBI 2008
HUJI-CS
67
Some Definitions
• Dirty reads: A transaction reads data that is
written by another, uncommitted transaction
• Non-repeatable reads: A transaction rereads
data it previously read and finds that a committed
transaction has modified or deleted that data
• Phantom reads: A transaction re-executes a
query returning a set of rows satisfying a search
condition and finds that a committed transaction
inserted additional rows that satisfy the condition
DBI 2008
HUJI-CS
68
READ COMMITED vs. SERIALIZABLE
Dirty Reads
Non-repeatable
Reads
Phantom
Reads
DBI 2008
READ COMMITED
SERIALIZABLE
Impossible
Impossible
Possible
Impossible
Possible
Impossible
HUJI-CS
69
What Happens Here (1)?
1. CREATE TABLE pairs (x INTEGER, y INTEGER);
2. select * from pairs
3. insert into pairs values(1,1)
4. select * from pairs
5. insert into pairs values(1,2)
6. select * from pairs
7. select * from pairs
8. COMMIT
T.1: R. COMMITTED
9. COMMIT
T.2: SERIALIZABLE
DBI 2008
HUJI-CS
70
What Happens Here (2)?
1. CREATE TABLE pairs (x INTEGER, y INTEGER);
2. insert into pairs values(1,1)
3. select * from pairs
4. COMMIT
5. select * from pairs
6. select * from pairs
7. insert into pairs values(1,2)
8. COMMIT
9. select * from pairs
10. COMMIT
T.2: SERIALIZABLE
T.1: R. COMMITTED
DBI 2008
HUJI-CS
71
What Happens Here (3)?
1. CREATE TABLE pairs (x INTEGER, y INTEGER);
2. insert into pairs values(1,1)
3. select * from pairs
4. COMMIT
5. select * from pairs
6. select * from pairs
7. insert into pairs values(1,2)
8. COMMIT
9. select * from pairs
10. COMMIT
T.2: SERIALIZABLE
T.1: SERIALIZABLE Is it equivalent to any truly serial
execution of the transactions?
DBI 2008
HUJI-CS
72
What Happens Here (4)?
1. CREATE TABLE pairs (x INTEGER, y INTEGER);
2. insert into pairs values(1,1)
3. COMMIT
4. update pairs set y=2
where x=1
5. update pairs set y=3
where x=1
6. select * from pairs
7. COMMIT
8. select * from pairs
9. COMMIT
T.2: SERIALIZABLE
T.1: SERIALIZABLE
DBI 2008
HUJI-CS
73
What Happens Here (5)?
1. CREATE TABLE pairs (x INTEGER, y INTEGER);
2. insert into pairs values(1,1)
3. COMMIT
4. update pairs set y=2
where x=1
5. update pairs set y=3
where x=1
6. select * from pairs
7. COMMIT
8. select * from pairs
9. COMMIT
T.2: SERIALIZABLE
T.1: R. COMMITTED
DBI 2008
HUJI-CS
74
Appendix: Changing
Passwords
Changing Your Password
Suppose that snoopy wants to change his
password from “snoopy” to “snoopass”:
ALTER USER snoopy WITH PASSWORD 'snoopass';
DBI 2008
HUJI-CS
76