Accessing databases from Perl

advertisement
Perl/DBI - accessing
databases from Perl
Dr. Andrew C.R. Martin
martin@biochem.ucl.ac.uk
http://www.bioinf.org.uk/
Aims and objectives
Understand the need to access
databases from Perl
Know why DBI?
Understand the structure of DBI
Be able to write a Perl/DBI script to
read from or write to a database
PRACTICAL: write a script to read from
a database
Why access a database
from Perl?
Why access a database
from Perl?
Send request for page
to web server
Pages
Web browser
Web server
RDBMS
CGI
Script
External
Programs
CGI can extract parameters sent with the page request
Why access a database
from Perl?
Populating databases
Need to pre-process and re-format data
into SQL
Intermediate storage during data
processing
Reading databases
Good database design can lead to
complex queries (wrappers)
Need to extract data to process it
Why use DBI?
Why Perl/DBI?
Many relational databases available.
Commercial examples:
Oracle
DB/2
SQLServer
Sybase
Informix
Interbase
Open source examples:
PostgreSQL
mySQL
Why Perl/DBI?
Databases use a common query
language:
‘structured query language’ (SQL)
Queries can easily be ported between
different database software
Minor variations in more advanced features
Proprietary extensions
Why Perl/DBI?
Can call command-line interface from
within your program.
$result = `psql -tqc “SELECT * FROM table” `;
@tuples = split(/\n/, $result);
foreach $tuple (@tuples)
{
@fields = split(/\|/, $tuple);
}
Inefficient: new process
for each database access
Why Perl/DBI?
Databases generally provide own APIs
to allow access from programming
languages
e.g. C, Java, Perl
Proprietary APIs all differ
Very difficult to port software between
databases
Standardized APIs have thus become
available
Why Perl/DBI?
Perl/DBI is the standardized API for Perl
Easy to port Perl scripts from one database
to another
DBI and ODBC
ODBC (Open DataBase Connectivity)
Consortium of vendors in early 1990s
SQL Access Group
October 1992 & 1993, draft standard:
‘Call Level Interface’ (‘CLI’ - an API)
Never really adopted
Microsoft ‘embraced and extended’ it
to create ODBC
DBI and ODBC
‘DBPerl’ designed as database interface
specifically for Perl4
September 1992 (i.e. pre-ODBC)
Just before release, Perl5 announced
with OO facilities
DBPerl modified to support OO and
loosely modelled on CLI standard
This became DBI
DBI and ODBC
ODBC
Standard SQL syntax
Standard error codes
DBI
Dodged this issue!
Check $DBI::err or
$DBI::errstr (DBI
provides methods for
standard errors, but
drivers don’t use them)
Many attributes and
options to tweak
underlying driver
Very limited control
Meta-data on
database structure
Tables and types
only
DBI and ODBC
There is an ODBC driver for DBI
DBI can be used to access any ODBC
database
The structure of DBI
DBI Architecture
DBI
Oracle
driver
mySQL
driver
PostgreSQL
driver
Sybase
driver
DBI Architecture
Multi-layer design
Perl script
DBI
Database
Independent
DBD
Database API
RDBMS
Database
Dependent
DBI Architecture
Oracle
DBD::Oracle
DBD::Oracle
Perl
Perl
Script
Script
Perl
DBI
Script
DBD::mysql
DBD::Oracle
mySQL
DBD::Pg
DBD::Oracle
PostgreSQL
DBI Architecture
@drivers = DBI->available_drivers();
Returns a list of installed (DBD) drivers
DBI Architecture
DBD::Oracle
Driver Handle
Database
Handle
Statement
Handle
DBD::pg
Driver Handle
Database
Handle
Database
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
Statement
Handle
DBI Architecture
Driver Handles
References loaded driver(s)
One per driver
Not normally referenced in programs
Standard variable name $drh
DBI Architecture
Database Handles
Created by connecting to a database
$dbh = DBI->connect($datasource, ... );
References a database via a driver
Can have many per database (e.g.
accessing different user accounts)
Can access multiple databases
Standard variable name $dbh
DBI Architecture
Statement Handles
Created by ‘preparing’ some SQL
$sth = $dbh->prepare($sql);
Can have many per database handle
(e.g. multiple queries)
Standard variable name $sth
DBI Architecture
DBD::pg
Driver Handle
Database
Handle
$dbh = DBI->connect($datasource, ... );
Statement
Handle
$sth = $dbh->prepare($sql);
SQL Preparation
Perl script
prepare()
DBI and Driver
Pass statement
to database engine
Database
Parse
Statement
if valid
$sth
execute()
fetchrow_array()
Encapsulate as DBI
statement handle
Pass statement
to database engine
Maintain cursor
into results
Encapsulate
Statement
Execute
Statement
Writing Perl/DBI scripts
Accessing DBI from Perl
 Must have the DBI package and appropriate
DBD package installed
 DBD::Oracle, DBD::Pg, DBD::mysql, etc.
use DBI;
 Create a ‘handle’ to access the database:
$dbh = DBI->connect($datasource, $username, $password);
The DBD module
and d/b to be used
Data sources
$dbh = DBI->connect($datasource, $username, $password);
$dbname
= “mydatabase”;
$dbserver = “dbserver.cryst.bbk.ac.uk”;
$dbport
= 5432;
$datasource = “dbi:Oracle:$dbname”;
$datasource = “dbi:mysql:database=$dbname;host=$dbserver”;
$datasource = “dbi:Pg:dbname=$dbname;host=$dbserver;port=$dbport”;
Format varies with
database module
Optional; supported by
some databases.
Default: local machine
and default port.
Username and password
$dbh = DBI->connect($datasource, $username, $password);
 $username and $password also optional
 Only needed if you normally need a
username/password to connect to the
database.
 Remember CGI scripts run as a special webserver user.
 Generally, ‘nobody’ or ‘apache’.
 Database must allow access by this user
 or specify a different username/password
SQL commands with no
return value
 SQL commands other that SELECT don’t
return values
 may return success/failure flag
 number of entries in the database affected
 For example:
 creating a table
 inserting a row
 modifying a row
SQL commands with no
return value
 e.g. insert a row into a table:
INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’)
 From Perl/DBI:
$sql = “INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’)”;
$dbh->do($sql);
SQL commands that return
a single row
 Sometimes, can guarantee that a database
query will return only one row
 or you are only interested in the first row
$sql = “SELECT * FROM idac WHERE ac = ‘P00698’”;
@values = $dbh->selectrow_array($sql);
 Columns placed in an array
 Could also have been placed in a list:
($id, $ac) = $dbh->selectrow_array($sql);
SQL commands that return
a multiple rows
 Most SELECT statements will return many rows
 Three stages must be performed:
 preparing the SQL
 executing it
 extracting the results
SQL commands that return
a multiple rows
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
while(($id, $ac) = $sth->fetchrow_array)
{
print “ID: $id AC: $ac\n”;
}
}
(Can also obtain array or hash reference)
NB: statement handle / fetchrow_array
rather than db handle / selectrow_array
SQL commands that return
a multiple rows
 If you need to stop early you can do:
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
for($i=0; $i<10; $i++)
{
if(($id, $ac) = $sth->fetchrow_array)
{
print “ID: $id AC: $ac\n”;
}
}
$sth->finish;
}
SQL commands that return
a multiple rows
 A utility method is also available to print a
complete result set:
$sql = “SELECT * FROM idac”;
$sth = $dbh->prepare($sql);
if($sth->execute)
{
$nrows = $sth->dump_results;
}
(Mostly useful for debugging)
Repeated SQL calls
 Often want to repeat essentially the same
query, but with some different value being
checked.
 For example:
foreach $ac (‘P00698’, ‘P00703’)
{
$sql = “SELECT * FROM idac WHERE ac = ‘$ac’”;
@values = $dbh->selectrow_array($sql);
print “@values\n”;
}
(using special option for 1-row returns)
Repeated SQL calls
 Could also be do:
foreach $ac (‘P00698’, ‘P00703’)
{
$sql = “SELECT * FROM idac WHERE ac = ‘$ac’”;
$sth = $dbh->prepare($sql);
$sth->execute;
while(@values = $sth->fetchrow_array)
{
print “@values\n”;
}
}
i.e. don’t use special option for 1-row returns
$dbh->selectrow_array($sql)
Repeated SQL calls
 Increase in performance by ‘binding’ a variable:
$sql = “SELECT * FROM idac WHERE ac = ?”;
$sth = $dbh->prepare($sql);
foreach $ac (‘P00698’, ‘P00703’)
{
$sth->bind_param(1, $ac);
$sth->execute;
while(@values = $sth->fetchrow_array)
{
print “@values\n”;
Parameter
Variable
}
Number
to bind
}
Repeated SQL calls
NOTE:
 Performance increase depends on database
and driver
 Although strings normally enclosed in single
inverted commas, the bound variable is not
quoted.
 If you have a number which you need to be
treated as a string, then you do:
$sth->bind_param(1, 42, SQL_VARCHAR);
Summary
 DBI provides a standard API
 It does not standardize the SQL
 DBI is an older standard than ODBC
 They can be used together and they are both
evolving
 Basic 3-step process:
 prepare / execute / fetch
 Shortcut calls for no return or 1-row return
 Many other functions available
Download