Transcript Document

Communicating with
the Outside
Application
Query Processor
Storage Subsystem
Indexes
Concurrency Control
Recovery
Operating System
Hardware
[Processor(s), Disk(s), Memory]
The ways applications
accessing the Database System
• 4GL(fourth generation language)
– Execute at high level resulting in small programs
– Hider portability, different vendors offers different
4GLs
• Programming language + Call Level Interface
– ODBC(Open DataBase Connectivity) for c/c++
– JDBC: Java based API, for Java
– Database specific ODBC drivers: OCI
(C++/Oracle), CLI (C++/ DB2)
– Perl/DBI
API pitfalls
• Cost of portability
– Layer of abstraction on top of ODBC drivers to
hide discrepancies across drivers with
different conformance levels.
– Beware of performance problems in this layer
of abstraction:
• Use of meta-data description when submitting
queries, accessing the result set
• Iterations over the result set
Throughput (records/sec)
ODBC vs. OCI
80
60
40
20
OCI
ODBC
0
0
2000
4000
6000
8000
Number of records transferred
10000
• ODBC vs. OCI on
Oracle8iEE on Windows
2000
• Iteration over a result set
one record at a time.
Prefetching is performed.
• Low OCI overhead when
number of records
transferred is small
• ODBC performs better
when number of records
transferred increases due
to it’s good at prefetching.
Client-Server Mechanisms
• Connection pooling and multiplexing when
multiple clients access a server
• Communication buffer on the database server.
One per connection.
– If a client does not consume results fast enough, then
the server holds resources until it can output the
result.
– Data is sent either when the communication buffer is
full or when a batch is finished executing.
• Small buffer – frequent transfer overhead
• Large buffer – time to first record increases.
• No actual impact on a 100 Mb network. More sensitive in an
intranet with low bandwidth.
Communication buffer
Harmful small objects
• authorized(user, type)
• doc(id, type, date)
• What are the
document instances a
user can see?
• SQL:
select doc.id, doc.date
from authorized, doc
where doc.type =
authorized.type
and authorized.user = <input>
• If documents and
doctypes are
encapsulated in objects,
the risk is the following:
– Find types t authorized for
user input
select doc.type as t
from authorized
where user = <input>
– For each type t issue the
query
select id, date
from doc
where type = <t>;
– The join is executed in the
application and not in the
Tuning the Application Interface
• Avoid user interaction within a transaction
• Minimize the number of roundtrips
between the application and the database
• Retrieve needed columns only
• Retrieve needed rows only
• Minimize the number of query
compilations
Avoid User Interaction
within a Transaction
• User interaction within a transaction forces
locks to be held for a long time.
• Careful transaction design (possibly
transaction chopping) to avoid this
problem.
Minimize the Number of
Roundtrips to the Database
• Rule of thumb: crossing the interface between
the application and the database server is costly
• Avoid Loops:
– Application programming languages offer looping
facilities (SQL statements, cursors, positioned
updates)
• Positioned updates: ODBC allows updating the rows that are
obtained as the result of a query, this forces the processing
of updates one row at a time
– Rigid object-oriented programming might force such
loops.
Avoid External Loops
• No loop:
sqlStmt = “select * from lineitem where l_partkey <= 200;”
odbc->prepareStmt(sqlStmt);
odbc->execPrepared(sqlStmt);
• Loop:
sqlStmt = “select * from lineitem where l_partkey = ?;”
odbc->prepareStmt(sqlStmt);
for (int i=1; i<200; i++)
{
odbc->bindParameter(1, SQL_INTEGER, i);
odbc->execPrepared(sqlStmt);
}
throughput (records/sec)
Avoid External Loops
600
500
400
300
200
100
0
loop
no loop
• SQL Server 2000 on
Windows 2000
• Crossing the application
interface has a significant
impact on performance
Let the DBMS
optimize
set operations
Fetch 2000 records
Loop: 200 queries
No loop: 1 query
Crossing the
application interface
too often hurts
performances.
Avoid Cursors
• No cursor
select * from employees;
• Cursor
DECLARE d_cursor CURSOR FOR select * from
employees;
OPEN d_cursor
while (@@FETCH_STATUS = 0)
BEGIN
FETCH NEXT from d_cursor
END
CLOSE d_cursor
go
Throughput (records/sec)
Avoiding Cursors
5000
4000
3000
2000
1000
0
cursor
SQL
• Cursors are terribly slow
in almost all systems
• Using the cursor, records
are transmitted from the
database server to the
application one at a time.
• Query fetches 200000 56
bytes records
• Response time is a few
seconds with a SQL
query and more than an
hour iterating over a
cursor.
summary