disk i/o`s - Progress Community

Download Report

Transcript disk i/o`s - Progress Community

DB-15: Developing
Performance-oriented
ODBC/JDBC OpenEdge™
Applications
John Goodson
Vice President, DataDirect, R&D
Agenda

Topic: Understanding how drivers
work and what factors influence
performance

Topic: Guidelines for improving
performance

Topic: Review
2 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
At The End, You Should Understand:

How driver implementations trade off
performance for functionality

How to design for performance

How to avoid pitfalls in some drivers
3 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
ODBC




Provides a common API to all data
stores
Cross platform
Initially a Microsoft standard based on a
draft pre-release of the X/Open SQL
Access Group CLI
ODBC 3.0 aligns with the ISO and ANSI
CLI standards
4 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
JDBC





JDBC is a Java™ interface for accessing
SQL data sources
Based on ODBC, Open Group SQL CLI,
and ANSI/ISO CLI
JDBC 1.0 designed for basic usage with
emphasis on ease of use
JDBC 2.0 enhanced for advanced
features and server-side use
JDBC 3.0 goal was to “round out” API by
providing missing functionality
5 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Performance Issues




Developing code that works is hard
enough without having to make it work
fast!
Tough to know all the specifics of a
single backend
No exceptions are thrown to say your
code is running too slow
Most performance “help” is DBMS
server focused … not middleware
focused
6 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
How Do Drivers Work?



Each database vendor defines a communication
protocol of how the database server will
communicate with clients
This communication protocol is defined in terms
of packets that will be transferred using a
network transport (i.e. information will be
encoded into a specific database defined format
that will be sent over the network (tcp-ip) to/from
clients)
The database server only understands requests
that are sent in this format
7 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
How Do Drivers Work – Part 2

Every database vendor defines their packet
descriptions and their model differently
– Oracle and DB2 are cursor based models
– SQL Server and Sybase are streaming models
– Progress OpenEdge is a cursor based model

Every JDBC request will most likely result in
many packets being sent to/from the database
– Why?
– (1) packets are fixed size and data might not fit in
one packet
– (2) JDBC API does not map 1:1 with database
communication protocol
8 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Why Do We Care How Drivers Work?

The two slowest things in a database
system are
– Network i/o
– Disk i/o

The goal is to keep applications using the
CPU and not waiting on movement of data
across a network or to/from a storage
device.
9 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do?
–
–
–
–
–
–
–
Connect
Look for data
Read data
Change data
Delete data
Make changes permanent (commit)
Disconnect
10 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Happens When We Connect?









Login credentials verified by dbms
Shared memory (allocated and) assigned to
connection
– Depending on db between 24k and 64k
Various queues updated with connection
presence
Connection recorded in audit files on disk
Database context established
Code page set
Transaction information initialized
Connection properties applied
Initialization statements executed
11 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Connecting

Connecting is the single slowest thing you can do
in a well tuned application
– Lots of network i/o’s … disk i/o’s … CPU cycles

Using connection pooling, we can make the
slowest (network i/o and disk i/o intensive)
operation in your application one of the fastest
(using no network i/o or disk i/o)

With JDBC, Connection pooling requires use of
JDBC DataSources … don’t connect via Driver
Manager
12 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do?







Connect
Look for data
Read data
Change data
Delete data
Make changes permanent (commit)
Disconnect
13 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Looking for Data – Catalog
Functions/DatabaseMetaData

Compared to other xDBC methods, catalog
calls/DatabaseMetaData methods that return
ResultSet objects are slow
– getTables, getColumns, getProcedures, …

xDBC drivers have to execute complex
queries – and many times multiple queries – in
order to return the results mandated by the
xDBC specs.
14 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Getting a List of Tables

What you want: a list of all tables that I have
starting with the prefix NYC

ResultSet WSrs = WSc.getTables (null, null,
“NYC%”, null);
15 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Getting a List of Tables (part 2)

What a driver thinks you want: A list of all
tables, views, synonyms, system tables,
aliases, or temporary tables owned by any
schema in any catalog/database starting
with the prefix NYC

ResultSet WSrs = WSc.getTables (“cat1”,
“johng”, “NYC%”, “TABLE”);
16 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Avoid Search Patterns for
DatabaseMetaData Methods


Using null arguments or search patterns to
Catalog/DatabaseMetaData methods results in
time-consuming queries and, potentially,
increases network traffic due to unwanted
results.
Common places to see null that should be
“fixed”
– Schema name (most all)
– Table type (getTables/SQLTables)
– Catalog name (most all)
17 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Oracle SQL … getProcedures … yuck!
select a.owner, a.object_name, INSTR(a.object_type, 'PROC'), '1' from
sys.all_objects a where a.object_type IN ('FUNCTION','PROCEDURE')
and a.status = 'VALID'
union all select c.owner, c.synonym_name, INSTR(a.object_type, 'PROC'),
'1' from sys.all_objects a, sys.all_synonyms c
where c.table_owner = a.owner and c.table_name = a.object_name and
a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID'
union all select distinct b.owner, CONCAT(b.package_name, '.' ||
b.object_name), min(b.position), max(b.overload) from
sys.all_arguments b
where b.package_name IS NOT NULL group by b.owner,
CONCAT(b.package_name, '.' || b.object_name)
union all select distinct c.owner, CONCAT(c.synonym_name, '.' ||
b.object_name), min(b.position), max(b.overload) from
sys.all_arguments b, sys.all_synonyms c
where c.table_owner = b.owner and c.table_name = b.package_name and
b.package_name IS NOT NULL group by c.owner,
CONCAT(c.synonym_name, '.' || b.object_name)
union all select distinct c.owner, c.synonym_name, min(b.position),
max(b.overload) from sys.all_arguments b, sys.all_synonyms c
where c.owner = b.owner and c.table_owner=b.package_name and
c.table_name=b.object_name group by c.owner, c.synonym_name
18 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
getTypeInfo

Even getTypeInfo may result in a network
round-trip
– Do this once and cache the information
19 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Use Prepared Statement Metadata
Instead of getColumns (when possible)

getColumns/SQLColumns
– driver sends a potentially complex SQL query to the server
– SQL is preprocessed and executed on the server
– driver must then obtain result set descriptions, bind results,
and retrieve rowset from the server
– Usually at least 3 network i/o’s (probably more)

ResultSetMetaData/SQLDescribeCol
– Simple query, which evaluates to no rows, sent to server –
NOT EXECUTED
– Only result set information obtained
– Usually 2 network i/o (prepare + close)
20 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Example
PreparedStatement pstmt =
WScon.prepareStatement (“select * from
StockPurchasePlan where 0 = 1”);
ResultSetMetaData rsmd = pstmt.getMetaData();
int count = rsmd.getColumnCount;
// this code would replace
// getColumns (“ct1”, “johng”,
//
“StockPurchasePlan”, “%”);
21 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do?







Connect
Look for data
Read data
Change data
Delete data
Make changes permanent (commit)
Disconnect
22 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statements

Many times our applications do the same
things over and over again by specifying SQL
that is parameterized

SELECT firstname, lastname, address, phone
FROM employees WHERE empid = ?

UPDATE balance SET balance = balance + ?
WHERE acctid = ?
23 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statements (2)



Many times prepared statements are not as
efficient as executing the SQL directly (using
the Statement object in JDBC/SQLExecDirect
in ODBC)
Use prepared statements if the query is to be
executed more than twice in this session OR if
it will be executed frequently by the system
(JDBC)
Execute directly if the query is to be executed
once or twice
24 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Prepared Statement Side Effect

Side effect: There is no way to control what
gets put in the prepared statement pool. If
you create a prepared statement, then it’s
going into the pool and might remove some
other statement you want to stay in the pool.
25 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Even More on Prepared Statements

Prepare, Execute, Describe
– Always preferred
– Saves network roundtrip with most databases

Prepare, Describe, Execute
– Always slower
26 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Stored Procedure Invocation

Use parameter markers for arguments to
stored procedures … do not use literal
arguments
{call getFunds (?,?,?)}
// good
{call getFunds (1234, ‘JPG’, 53.23)}
27 © 2005 Progress Software Corporation
// bad
DB-15 ODBC/JDBC OpenEdge™ Applications
Why Not Use Literal Arguments?

The spec says not to use literals

More code for the programmer, but executions
can be done via RPCs (remote procedure
calls) instead of executed as language events

Use of RPCs improves performance by
– Eliminating parameter processing on server
– Parsing done on server
28 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Example:


{call foo (12,’abc’)}
Server gets a generic SQL packet
– Parse SQL
– Semantically validate


Does procedure foo exist?
Is foo defined to have 2 arguments?
– SQL is always a Unicode character string




Look up data type for foo’s parameter 1 (int)
Convert ’12’ to an integer
Look up data type for foo’s parameter 2 (varchar)
Convert ‘abc’ to varchar
– Execute
29 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Example:

{call foo (?,?)}

Server gets a specific stored procedure packet
where the packet structure contains the
arguments in the defined server format for the
parameters
– Execute only
30 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Understand Cursor Types

Forward only cursors are always the fastest
cursor type

(most of the time) static/insensitive scrollable
cursors will retrieve all rows for the query to
the client

(most of the time) keyset/sensitive scrollable
cursors will retrieve all keys to the client, then
each fetch (next) will result in an execute
31 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Forward-Only Cursors

driver will pre-fetch rows from the server

For small result sets, can fit entire contents of
the result set in one network roundtrip

Side effects:
– First fetch will be slower than subsequent
fetches for application
– Cursor position is not where you “think” it is
– Read Committed data might have already been
read
32 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Scroll Insensitive/Static Cursors

drivers will usually cache the entire contents
of the result set on the client

“Lazy fetch” model … only fetch the data
when you have to.

If you request a scroll sensitive cursor and the
driver can’t support a scroll sensitive cursor,
then it will downgrade to scroll insensitive

Not scalable!
33 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Keyset/Scroll Sensitive Cursors

drivers usually cache a set of keys to the
client

Example, ‘select * from table’ with Oracle
– Results in the driver executing: ‘select rowid
from table’
– Retrieves all rowids
– Each call to SQLFetch/rs.next() will result in the
driver executing ‘select * from table where
rowid = ?’ and retrieving the result
34 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Use Batches Instead of Prepared
Statements


Bulk load of data has typically been done by
preparing an INSERT statement and executing
many times.
Preparing an INSERT statement and
executing 100 times:
– 101 network round-trips

Using batch queries/addBatch() to compound
100 inserts:
– 1 network round-trip

Network round-trips are more expensive than
DBMS CPU cycles
35 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Arrays

In ODBC, use SQLParamOptions to
execute multiple parameterized queries
with one network roundtrip

Not as easy in JDBC; however, use
batch queries and eventually all drivers
will optimize to use parameter arrays
36 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Retrieving JDBC Data

The hard work is not just positioning to the
data, follow a few more guidelines when
putting data into buffers

Use getter methods with result column
ordinals instead of with column names.
String fullname = rs.getString (3); // good
String fullname = rs.getString (“lname”); // bad
37 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
JDBC Retrieving Data (2)

Only use getObject() unless it’s absolutely
necessary.
String children = (String) rs.getObject (2); // bad
Int children = rs.getInt (2);
38 © 2005 Progress Software Corporation
// fast
DB-15 ODBC/JDBC OpenEdge™ Applications
Avoid Long Data (Including LOBs)

Select * from table
– Easy! Most people do this!
– Slow!



Even if you don’t ask for column data, a
performance penalty has been paid as the data
has already been moved across the network to
the client
Particularly bad when the table contains long
varchar/long varbinary/LOB data
Compounded because the driver is trying to
array fetch … which is might not be able to do
39 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
CLOB/BLOB

Using CLOB and BLOB objects sparingly

Although you might think that only parts of the
LOB are being transported across the
network, most database wire protocols retrofit
LOBs into existing packet structures

JDBC drivers emulate getClob/getBlob by
retrieving the entire object locally
40 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Various other Guidelines

Use getGeneratedKeys()
– Saves network i/o’s

Don’t call rs.last() or fetch absolute last

Don’t write code that relies on the number of
result rows from a query
– Drivers have no idea how many rows a query
will return
– Drivers have to fetch all rows to know how
many rows the query will return
41 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
More guidelines

Understand:
– setMaxRows
– setMaxFieldSize

Both limit network i/o by requesting that the
server not send data across the wire

Side effects:
– setMaxRows also applies to updates/deletes in
some databases
– These options will persist in the connection
42 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do?







Connect
Look for data
Read data
Change data
Delete data
Make changes permanent (commit)
Disconnect
43 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Transactions

Committing transactions is expensive!
– Network i/o’s
– Numerous non-sequential disk i/o’s

Don’t run in auto-commit mode

Distributed txs are at least 4 times as slow as
local transactions

Use DTC/JTA only if required!
44 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
What Will Applications Do?







Connect
Look for data
Read data
Change data
Delete data
Make changes permanent (commit)
Disconnect
45 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Summary

Avoid:
– Network i/o
– Disk i/o

Think about what drivers are doing to process
your requests

Stop lazy programming ... Fill in the null
arguments
46 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Summary

Avoid network I/O

Avoid disk I/O

Avoid sloppy programming
47 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Questions?
48 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
Thank you for
your time!
49 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications
50 © 2005 Progress Software Corporation
DB-15 ODBC/JDBC OpenEdge™ Applications