Accessing databases from Perl
Download
Report
Transcript Accessing databases from Perl
Perl/DBI - accessing
databases from Perl
Dr. Andrew C.R. Martin
[email protected]
http://www.bioinf.org.uk/
Aims and objectives
Understand the need to access
databases from Perl
Know why DBI?
Understand the structure of DBI
Be able to write a Perl/DBI script to
read from or write to a database
PRACTICAL: write a script to read from
a database
Why access a database
from Perl?
Why access a database
from Perl?
Send request for page
to web server
Pages
Web browser
Web server
RDBMS
CGI
Script
External
Programs
CGI can extract parameters sent with the page request
Why access a database
from Perl?
Populating databases
Need to pre-process and re-format data
into SQL
Intermediate storage during data
processing
Reading databases
Good database design can lead to
complex queries (wrappers)
Need to extract data to process it
Why use DBI?
Why Perl/DBI?
Many relational databases available.
Commercial examples:
Oracle
DB/2
SQLServer
Sybase
Informix
Interbase
Open source examples:
PostgreSQL
mySQL
Why Perl/DBI?
Databases use a common query
language:
‘structured query language’ (SQL)
Queries can easily be ported between
different database software
Minor variations in more advanced features
Proprietary extensions
Why Perl/DBI?
Can call command-line interface from
within your program.
$result = `psql -tqc “SELECT * FROM table” `;
@tuples = split(/\n/, $result);
foreach $tuple (@tuples)
{
@fields = split(/\|/, $tuple);
}
Inefficient: new process
for each database access
Why Perl/DBI?
Databases generally provide own APIs
to allow access from programming
languages
e.g. C, Java, Perl
Proprietary APIs all differ
Very difficult to port software between
databases
Standardized APIs have thus become
available
Why Perl/DBI?
Perl/DBI is the standardized API for Perl
Easy to port Perl scripts from one database
to another
DBI and ODBC
ODBC (Open DataBase Connectivity)
Consortium of vendors in early 1990s
SQL Access Group
October 1992 & 1993, draft standard:
‘Call Level Interface’ (‘CLI’ - an API)
Never really adopted
Microsoft ‘embraced and extended’ it
to create ODBC
DBI and ODBC
‘DBPerl’ designed as database interface
specifically for Perl4
September 1992 (i.e. pre-ODBC)
Just before release, Perl5 announced
with OO facilities
DBPerl modified to support OO and
loosely modelled on CLI standard
This became DBI
DBI and ODBC
ODBC
Standard SQL syntax
Standard error codes
DBI
Dodged this issue!
Check $DBI::err or
$DBI::errstr (DBI
provides methods for
standard errors, but
drivers don’t use them)
Many attributes and
options to tweak
underlying driver
Very limited control
Meta-data on
database structure
Tables and types
only
DBI and ODBC
There is an ODBC driver for DBI
DBI can be used to access any ODBC
database
The structure of DBI
DBI Architecture
DBI
Oracle
driver
mySQL
driver
PostgreSQL
driver
Sybase
driver
DBI Architecture
Multi-layer design
Perl script
DBI
Database
Independent
DBD
Database API
RDBMS
Database
Dependent
DBI Architecture
Oracle
DBD::Oracle
DBD::Oracle
Perl
Perl
Script
Script
Perl
DBI
Script
DBD::mysql
DBD::Oracle
mySQL
DBD::Pg
DBD::Oracle
PostgreSQL
DBI Architecture
@drivers = DBI->available_drivers();
Returns a list of installed (DBD) drivers
DBI Architecture
DBD::Oracle
Driver Handle
Database
Handle
Statement
Handle
DBD::pg
Driver Handle
Database
Handle
Database
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
DBI Architecture
Driver Handles
References loaded driver(s)
One per driver
Not normally referenced in programs
Standard variable name $drh
DBI Architecture
Database Handles
Created by connecting to a database
$dbh = DBI->connect($datasource, ... );
References a database via a driver
Can have many per database (e.g.
accessing different user accounts)
Can access multiple databases
Standard variable name $dbh
DBI Architecture
Statement Handles
Created by ‘preparing’ some SQL
$sth = $dbh->prepare($sql);
Can have many per database handle
(e.g. multiple queries)
Standard variable name $sth
DBI Architecture
DBD::pg
Driver Handle
Database
Handle
$dbh = DBI->connect($datasource, ... );
Statement
Handle
$sth = $dbh->prepare($sql);
SQL Preparation
Perl script
prepare()
DBI and Driver
Pass statement
to database engine
Database
Parse
Statement
if valid
$sth
execute()
fetchrow_array()
Encapsulate as DBI
statement handle
Pass statement
to database engine
Maintain cursor
into results
Encapsulate
Statement
Execute
Statement
Writing Perl/DBI scripts
Accessing DBI from Perl
Must have the DBI package and appropriate
DBD package installed
DBD::Oracle, DBD::Pg, DBD::mysql, etc.
use DBI;
Create a ‘handle’ to access the database:
$dbh = DBI->connect($datasource, $username, $password);
The DBD module
and d/b to be used
Data sources
$dbh = DBI->connect($datasource, $username, $password);
$dbname
= “mydatabase”;
$dbserver = “dbserver.cryst.bbk.ac.uk”;
$dbport
= 5432;
$datasource = “dbi:Oracle:$dbname”;
$datasource = “dbi:mysql:database=$dbname;host=$dbserver”;
$datasource = “dbi:Pg:dbname=$dbname;host=$dbserver;port=$dbport”;
Format varies with
database module
Optional; supported by
some databases.
Default: local machine
and default port.
Username and password
$dbh = DBI->connect($datasource, $username, $password);
$username and $password also optional
Only needed if you normally need a
username/password to connect to the
database.
Remember CGI scripts run as a special webserver user.
Generally, ‘nobody’ or ‘apache’.
Database must allow access by this user
or specify a different username/password
SQL commands with no
return value
SQL commands other that SELECT don’t
return values
may return success/failure flag
number of entries in the database affected
For example:
creating a table
inserting a row
modifying a row
SQL commands with no
return value
e.g. insert a row into a table:
INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’)
From Perl/DBI:
$sql = “INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’)”;
$dbh->do($sql);
SQL commands that return
a single row
Sometimes, can guarantee that a database
query will return only one row
or you are only interested in the first row
$sql = “SELECT * FROM idac WHERE ac = ‘P00698’”;
@values = $dbh->selectrow_array($sql);
Columns placed in an array
Could also have been placed in a list:
($id, $ac) = $dbh->selectrow_array($sql);
SQL commands that return
a multiple rows
Most SELECT statements will return many rows
Three stages must be performed:
preparing the SQL
executing it
extracting the results
SQL commands that return
a multiple rows
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
while(($id, $ac) = $sth->fetchrow_array)
{
print “ID: $id AC: $ac\n”;
}
}
(Can also obtain array or hash reference)
NB: statement handle / fetchrow_array
rather than db handle / selectrow_array
SQL commands that return
a multiple rows
If you need to stop early you can do:
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
for($i=0; $i<10; $i++)
{
if(($id, $ac) = $sth->fetchrow_array)
{
print “ID: $id AC: $ac\n”;
}
}
$sth->finish;
}
SQL commands that return
a multiple rows
A utility method is also available to print a
complete result set:
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
$nrows = $sth->dump_results;
}
(Mostly useful for debugging)
Repeated SQL calls
Often want to repeat essentially the same
query, but with some different value being
checked.
For example:
foreach $ac (‘P00698’, ‘P00703’)
{
$sql = “SELECT * FROM idac WHERE ac = ‘$ac’”;
@values = $dbh->selectrow_array($sql);
print “@values\n”;
}
(using special option for 1-row returns)
Repeated SQL calls
Could also be do:
foreach $ac (‘P00698’, ‘P00703’)
{
$sql = “SELECT * FROM idac WHERE ac = ‘$ac’”;
$sth = $dbh->prepare($sql);
$sth->execute;
while(@values = $sth->fetchrow_array)
{
print “@values\n”;
}
}
i.e. don’t use special option for 1-row returns
$dbh->selectrow_array($sql)
Repeated SQL calls
Increase in performance by ‘binding’ a variable:
$sql = “SELECT * FROM idac WHERE ac = ?”;
$sth = $dbh->prepare($sql);
foreach $ac (‘P00698’, ‘P00703’)
{
$sth->bind_param(1, $ac);
$sth->execute;
while(@values = $sth->fetchrow_array)
{
print “@values\n”;
Parameter
Variable
}
Number
to bind
}
Repeated SQL calls
NOTE:
Performance increase depends on database
and driver
Although strings normally enclosed in single
inverted commas, the bound variable is not
quoted.
If you have a number which you need to be
treated as a string, then you do:
$sth->bind_param(1, 42, SQL_VARCHAR);
Summary
DBI provides a standard API
It does not standardize the SQL
DBI is an older standard than ODBC
They can be used together and they are both
evolving
Basic 3-step process:
prepare / execute / fetch
Shortcut calls for no return or 1-row return
Many other functions available