Introduction to mod_perl
Issac Goldstand
Mirimar Networks
www.mirimar.net
[email protected]
mod_what?
• Apache module which exposes the entire
Apache API via Perl
• Allows us to write handlers and filters
• mod_perl is for Apache 1.3, mod_perl2 is
for Apache 2.x
CGI on Steroids™
• Classic known use of mod_perl is to simply
cache compiled Perl code in memory
• Don’t put this down: it’s been known to be up to
100 times faster than CGI scripts with mod_cgi
• (In mod_perl2, this is accomplished by using
ModPerl::Registry to handle all CGI scripts)
• But… That’s barely scratching the surface of
mod_perl’s abilities
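As a rough illustration of why caching compiled code pays off, the plain-Perl sketch below compiles a script body once and reuses the resulting code reference on later calls. It is a conceptual toy, not how ModPerl::Registry is actually implemented; `run_cached`, `%compiled` and the `hello.pl` name are invented for the example.

```perl
use strict;
use warnings;

my %compiled;            # script name => compiled code reference
my $compile_count = 0;   # how many times we actually had to compile

# Compile the source on first use, then reuse the cached code reference.
sub run_cached {
    my ($name, $source) = @_;
    unless ($compiled{$name}) {
        $compile_count++;
        $compiled{$name} = eval "sub { $source }"
            or die "compile failed for $name: $@";
    }
    return $compiled{$name}->();
}

my $script = 'return "Hello from a cached script";';
my @output = map { run_cached('hello.pl', $script) } 1 .. 3;
print "$output[0] (compiled $compile_count time(s) for 3 requests)\n";
```

Three simulated requests trigger a single compile; with plain CGI each request would pay for a new process and a new compile.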
What’s your handle?
• Apache defines a request chain which
each request must pass through
• Along this chain are various hooks
• A module defines a handler which
intercepts the request at a specific hook
and decides how to handle the request
I/O Filters
• Perl modules exist to capture STDOUT
stream
– Limited to STDOUT only
– Not the easiest interfaces in the world…
– Can only capture output from other Perl code
• Apache 2.0 introduces I/O filters for both
incoming requests and outgoing
responses.
Protocol Handlers
• Apache 2.0 is designed to be a more
generic server which can handle protocols
other than HTTP
• Simple implementation of SMTP exists in
Perl (Apache::SMTP)
• You can write your own custom protocols
• This is somewhat beyond the scope of this
lecture
PostReadRequest
• Hook that comes immediately after
request is read by Apache and headers
are processed
• Useful for preprocessing that needs to be
done once per request
• Example: Set special Perl variable $^T
($BASETIME) for use with -M, -A and -C
filetests
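A minimal sketch of the $^T example, assuming mod_perl 2 and a module name (MyApache2::TimeReset) chosen only for illustration. It runs inside Apache, not as a standalone script.

```perl
#file:MyApache2/TimeReset.pm (illustrative module name)
package MyApache2::TimeReset;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;
    # $^T is otherwise fixed when the interpreter starts, so -M, -A
    # and -C would measure file ages relative to server startup.
    $^T = $r->request_time;
    return Apache2::Const::OK;
}
1;

# httpd.conf (server or virtual host scope):
#   PerlPostReadRequestHandler MyApache2::TimeReset
```

With this in place, filetests in later phases of the same request report ages relative to the moment the request arrived.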
Trans(lation)
• Used to manipulate (or “translate”) the
requested URI
• Many “standard” modules, such as mod_alias
and mod_rewrite use this heavily
• Useful to rewrite a “magic” URI into a more
standard query-string
• Example:
http://news.com/stories/27/02/2006/005.html
becomes
http://news.com/cgi-bin/story?day=27&month=02&year=2006&story=005
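The rewrite in the example can be sketched as a small plain-Perl function. The name `translate_story_uri` and the exact pattern are assumptions made for illustration; inside a PerlTransHandler the two return values would typically be handed to `$r->uri` and `$r->args`, with the handler returning `Apache2::Const::DECLINED` so that Apache still maps the rewritten URI.

```perl
use strict;
use warnings;

# Map /stories/DD/MM/YYYY/NNN.html to a script path plus query string.
# Returns an empty list when the URI does not match the pattern.
sub translate_story_uri {
    my ($uri) = @_;
    my ($day, $month, $year, $story) =
        $uri =~ m{^/stories/(\d{2})/(\d{2})/(\d{4})/(\d+)\.html$}
        or return;
    return ('/cgi-bin/story',
            "day=$day&month=$month&year=$year&story=$story");
}

my ($path, $args) = translate_story_uri('/stories/27/02/2006/005.html');
print "$path?$args\n";
```

Keeping the pattern matching in its own function makes the translation rule easy to check without a running server.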
MapToStorage
• Used to map final URI to a physical file
• In Apache 1.3, this was part of Trans phase
• Normally, default behavior is what you’ll want
here – mod_alias will usually do anything fancy
you could want
• Useful for shaving some time off the request by
bypassing Apache’s attempt to find a physical
file if you have a custom response handler
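The shortcut in the last bullet might look like the sketch below, assuming every request served this way is answered by a custom response handler and never needs a file on disk. The module name is illustrative.

```perl
#file:MyApache2/NoTranslation.pm (illustrative module name)
package MyApache2::NoTranslation;
use strict;
use warnings;
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;
    # Returning OK tells Apache the mapping is done, so the default
    # directory walk (and its stat() calls) is skipped.
    return Apache2::Const::OK;
}
1;

# httpd.conf (server or virtual host scope):
#   PerlMapToStorageHandler MyApache2::NoTranslation
```

Because the directory walk is skipped, <Directory> sections and .htaccess files are not applied to these requests, so this only suits servers that do not rely on them.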
HeaderParser
• Similar to PostReadRequest, except now
we have mapped the request to its final
<Location>, <Directory>, <Files>, etc
containers
• Useful to determine if we need to do
anything special with the requests before
processing the heavy-lifting parts of the
request phase
Init
• Special “shortcut” handler that points to either
PostReadRequest or HeaderParser, depending
on its context
• In <Location> or other blocks that refer to a
specific URI, physical path or filename, it will be
HeaderParser
• Anywhere else, it will be PostReadRequest
• Regardless, it is guaranteed to be the first
handler called for its given context
Access
• First of the “AAA” series
• Used for access restrictions
– IP Based
– Time/Date based
– Other global and “external” factors not
connected to the specific user
• Useful if we want to deny access to all
users based on certain criteria
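A time-based restriction could be sketched as follows. The module name and the 09:00 to 17:00 window are assumptions for illustration; the handler needs mod_perl 2 and runs only inside Apache.

```perl
#file:MyApache2/OfficeHours.pm (illustrative module name)
package MyApache2::OfficeHours;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Const -compile => qw(OK FORBIDDEN);

sub handler {
    my $r = shift;
    # Hour of the day (0..23) at which the request arrived,
    # in the server's local time zone.
    my $hour = (localtime $r->request_time)[2];
    return ($hour >= 9 && $hour < 17)
        ? Apache2::Const::OK
        : Apache2::Const::FORBIDDEN;
}
1;

# httpd.conf, inside the <Location> to protect:
#   PerlAccessHandler MyApache2::OfficeHours
```

The decision depends only on the clock, not on who is asking, which is what separates this phase from the two that follow.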
Authentication (Authen)
• Second AAA handler
• Used to verify the credentials presented by a
user
– Perform login
– Validate existing session/ticket/cookie
• Only performed when the requested resource is
marked as password protected – meaning we’ll
need AuthName, AuthType and at least one
Require directive (or their custom equivalents)
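A minimal Basic-authentication sketch, assuming mod_perl 2. The module name and the hard-coded %password_for table are purely illustrative; a real handler would check a database, session store or similar.

```perl
#file:MyApache2/DemoAuthen.pm (illustrative module name)
package MyApache2::DemoAuthen;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Access ();
use Apache2::Const -compile => qw(OK HTTP_UNAUTHORIZED);

# Hypothetical credentials, for illustration only.
my %password_for = (demo => 'secret');

sub handler {
    my $r = shift;
    # Parses the Authorization header; a non-OK status means the
    # client sent no usable Basic credentials yet.
    my ($status, $password) = $r->get_basic_auth_pw;
    return $status unless $status == Apache2::Const::OK;

    my $user = $r->user;
    return Apache2::Const::OK
        if defined $user
        && exists $password_for{$user}
        && $password_for{$user} eq $password;

    # Ask the browser to prompt again.
    $r->note_basic_auth_failure;
    return Apache2::Const::HTTP_UNAUTHORIZED;
}
1;

# httpd.conf:
#   <Location /private>
#       AuthType Basic
#       AuthName "Demo"
#       Require valid-user
#       PerlAuthenHandler MyApache2::DemoAuthen
#   </Location>
```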
Authorization (Authz)
• Final AAA phase
• Decides whether the credentials validated
in the Authentication phase can access the
requested resource
• Closely linked to Authen phase, and will
only run after a successful Authen handler
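An authorization check could be sketched like this, assuming the Authen phase has already set `$r->user`. The module name, the /company/ URI layout and the %allowed table are invented for the example. Later Apache releases (2.4) reorganised authorization, so treat this as a sketch of the 2.0/2.2-era flow the slides describe.

```perl
#file:MyApache2/SectionAuthz.pm (illustrative module name)
package MyApache2::SectionAuthz;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Access ();
use Apache2::Const -compile => qw(OK HTTP_UNAUTHORIZED);

# Hypothetical policy: URI section => users allowed into it.
my %allowed = (
    admin  => [qw(alice)],
    report => [qw(alice bob)],
);

sub handler {
    my $r = shift;
    my $user = $r->user;   # set during the Authen phase
    if (defined $user) {
        my ($section) = $r->uri =~ m{^/company/(\w+)/};
        # Sections without a policy entry are open to any valid user.
        return Apache2::Const::OK
            unless defined $section && exists $allowed{$section};
        return Apache2::Const::OK
            if grep { $_ eq $user } @{ $allowed{$section} };
    }
    $r->note_basic_auth_failure;
    return Apache2::Const::HTTP_UNAUTHORIZED;
}
1;

# httpd.conf, alongside the Authen setup for the same <Location>:
#   PerlAuthzHandler MyApache2::SectionAuthz
```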
Type
• Used to set the proper MIME type
(Content-Type) for the response
• Typically, also sets other type information,
such as content language
• Tends not to be overly used, as it’s an “all
or nothing” scenario, and things can still
be changed in the Fixup phase
Fixup
• Last chance to change things before the
response is generated
• A good chance to populate the environment with
variables (as mod_env does, and as mod_ssl did in
Apache 1)
• Useful for passing information in an arbitrary
manner to the response handler (which may be
an external process, such as with mod_cgi, or in
a non-Apache-specific language, such as
Registry scripts, PHP, Ruby, Python, etc)
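Passing information to the response handler through the environment might look like the sketch below. The module name and the SITE_SECTION variable are hypothetical; `subprocess_env` is the request method for the per-request environment table.

```perl
#file:MyApache2/FixupEnv.pm (illustrative module name)
package MyApache2::FixupEnv;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;
    # First path component of the URI, e.g. "news" for /news/today.html
    my ($section) = $r->uri =~ m{^/([^/]+)/};
    # Hypothetical variable; a CGI script would see it as $ENV{SITE_SECTION}.
    $r->subprocess_env(SITE_SECTION => defined $section ? $section : 'root');
    return Apache2::Const::OK;
}
1;

# httpd.conf:
#   PerlFixupHandler MyApache2::FixupEnv
```

A mod_perl response handler can read the same value back with `$r->subprocess_env('SITE_SECTION')`.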
Response
• The most popularly used handler phase
• Arguably the most important phase
• Used to generate and return the response
to the client
• We’ll use this handler when (time
permitting) we show an example
Log
• Just because we’ve satisfied the client doesn’t
mean our work is done!
• Used to log the information known about the
request, user and response
• Executed regardless of what happens in the
previous handlers
• Can be used to add custom information to the
logs (ads displayed, cookie values, etc)
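A small logging sketch, assuming mod_perl 2 with Apache2::Log loaded. The module name and message format are illustrative, and the line goes to the server's error log rather than the access log.

```perl
#file:MyApache2/LogSummary.pm (illustrative module name)
package MyApache2::LogSummary;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Log ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;
    # By now the response has been sent, so status and size are known.
    $r->log->notice(sprintf '%s %s -> %d (%d bytes)',
        $r->method, $r->uri, $r->status, $r->bytes_sent);
    return Apache2::Const::OK;
}
1;

# httpd.conf:
#   PerlLogHandler MyApache2::LogSummary
```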
Cleanup
• The last phase
• Actually, exists only in mod_perl and not in
Apache itself
• Happens after the client is disconnected, but
before request object is destroyed
• Used to clean up anything after the request
(temp files, locks, etc)
• While the client won’t have to wait for this,
Apache will not re-queue the Perl interpreter
until it finishes!
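Temp-file cleanup could be sketched as below. Instead of a separately configured PerlCleanupHandler, this version registers a callback on the request pool with `cleanup_register`, which also runs once the request is finished and lets the file name be passed as an argument. The module name and file contents are invented for the example.

```perl
#file:MyApache2/TempReport.pm (illustrative module name)
package MyApache2::TempReport;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();
use APR::Pool ();
use File::Temp ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;
    my ($fh, $path) = File::Temp::tempfile();
    print {$fh} "scratch data\n";
    close $fh or die "close $path: $!";

    # Call cleanup($path) when the request pool is destroyed.
    $r->pool->cleanup_register(\&cleanup, $path);

    $r->content_type('text/plain');
    $r->print("Report built using a temporary file\n");
    return Apache2::Const::OK;
}

sub cleanup {
    my $path = shift;
    unlink $path if -e $path;
    return Apache2::Const::OK;
}
1;
```

Keep the callback short: as the last bullet says, the interpreter stays busy until it returns.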
Example: Hello World
• The moment we’ve all been waiting for!
#file:MyApache2/Hello.pm
package MyApache2::Hello;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::Const -compile => 'OK';
sub handler {
    my $r = shift;
    $r->content_type('text/plain');
    $r->print('Hello, Apache2/mod_perl World!');
    return Apache2::Const::OK;
}
1;
Preparing httpd.conf
# Don't forget to LoadModule perl_module modules/mod_perl.so
PerlModule MyApache2::Hello
<Location /hello>
    SetHandler modperl
    PerlResponseHandler MyApache2::Hello
</Location>
Response
Hello, Apache2/mod_perl World!
I/O Filters
• Used to make transformations to the
request and response without modifying
the “core” handler code
• Multiple filters can be used in conjunction
with one another
• Example: Content compression,
obfuscation, machine translation (e.g.,
English -> German)
Buckets and Bucket Brigades
• In the filter implementation, chunks of data,
called buckets, are passed between the
various filters
• The pipeline of buckets is called a bucket
brigade
Disclaimer
• I don’t like the picture on the next slide
• It’s the picture used on the mod_perl
website (and the most popular and only
useful picture in Google Images when
searching for bucket brigades)
• I personally find it to be somewhat
inaccurate, but since lots of other people
seem to find it helpful, we’ll look (or at
least glance) at it anyway
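My humble version
[Diagram: data flows from the request handler, through an OUTPUT FILTER, to the network]
Modifying data
[Diagram: "DATA" buckets arrive from the request handler; the OUTPUT FILTER passes "ADAT" buckets on to the network]
Adding data
[Diagram: "DATA" buckets arrive from the request handler; the OUTPUT FILTER passes "ADAT" buckets plus a "NEW" bucket on to the network]
Removing data
[Diagram: same labels as the previous slide (DATA, OUTPUT FILTER, NEW, ADAT); the change that illustrated removal is not recoverable from the transcript]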
Bucket brigade code sample
#file:MyApache2/InputRequestFilterLC.pm
package MyApache2::InputRequestFilterLC;
use strict;
use warnings;
use base qw(Apache2::Filter);
use Apache2::Connection ();
use APR::Brigade ();
use APR::Bucket ();
use Apache2::Const -compile => 'OK';
use APR::Const -compile => ':common';
sub handler : FilterRequestHandler {
    my ($f, $bb, $mode, $block, $readbytes) = @_;
    my $c = $f->c;
    my $bb_ctx = APR::Brigade->new($c->pool, $c->bucket_alloc);
    my $rv = $f->next->get_brigade($bb_ctx, $mode, $block, $readbytes);
    return $rv unless $rv == APR::Const::SUCCESS;
    while (!$bb_ctx->is_empty) {
        my $b = $bb_ctx->first;
        if ($b->is_eos) {
            $bb->insert_tail($b);
            last;
        }
        my $len = $b->read(my $data);
        $b = APR::Bucket->new($bb->bucket_alloc, lc $data) if $len;
        $b->remove;
        $bb->insert_tail($b);
    }
    Apache2::Const::OK;
}
1;
Looks fun, eh?
• Filters are expected to manage their own
state
• Filters are expected to take care of the
buckets and brigades they are connected
to
• Filters are expected to obey “special”
buckets, like flush or EOS (end of stream)
mod_perl to the rescue!
• mod_perl provides an alternate “stream
oriented” filter scheme, for the weak of
heart
• In this scheme, two methods, read() and
print() do the bulk of the work
• mod_perl manipulates the bucket brigades
for us behind the scenes
Stream oriented filter sample
#file:MyApache2/FilterReverse1.pm
package MyApache2::FilterReverse1;
use strict;
use warnings;
use base qw(Apache2::Filter);
use Apache2::Const -compile => qw(OK);
use constant BUFF_LEN => 1024;
sub handler : FilterRequestHandler {
    my $f = shift;
    while ($f->read(my $buffer, BUFF_LEN)) {
        for (split "\n", $buffer) {
            $f->print(scalar reverse $_);
            $f->print("\n");
        }
    }
    Apache2::Const::OK;
}
1;
Setup in httpd.conf
# Don't forget to LoadModule perl_module modules/mod_perl.so
PerlModule MyApache2::Hello
PerlModule MyApache2::FilterReverse1
<Location /hello>
    SetHandler modperl
    PerlResponseHandler MyApache2::Hello
    PerlOutputFilterHandler MyApache2::FilterReverse1
</Location>
Response
!dlroW lrep_dom/2ehcapA ,olleH
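The sample filter splits each 1024-byte read on newlines, so a line that straddles two reads, or two filter invocations, would be reversed in two pieces. A variant that keeps the unfinished fragment in the filter context is sketched below; this is also an instance of a filter managing its own state. The module name is illustrative, and the code assumes mod_perl 2's stream-oriented filter API (`read`, `print`, `ctx`, `seen_eos`).

```perl
#file:MyApache2/FilterReverse2.pm (illustrative module name)
package MyApache2::FilterReverse2;
use strict;
use warnings;
use base qw(Apache2::Filter);
use Apache2::Const -compile => qw(OK);
use constant BUFF_LEN => 1024;

sub handler : FilterRequestHandler {
    my $f = shift;
    # Fragment left over from the previous invocation, if any.
    my $leftover = $f->ctx;
    while ($f->read(my $buffer, BUFF_LEN)) {
        $buffer = $leftover . $buffer if defined $leftover;
        $leftover = undef;
        while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
            # No line ending seen yet: hold the fragment for later.
            $leftover = $1, last unless $2;
            $f->print(scalar(reverse $1), $2);
        }
    }
    if ($f->seen_eos) {
        # End of stream: flush whatever is still pending.
        $f->print(scalar reverse $leftover) if defined $leftover;
    }
    else {
        # More data will follow in a later invocation.
        $f->ctx($leftover) if defined $leftover;
    }
    return Apache2::Const::OK;
}
1;

# httpd.conf, in place of FilterReverse1:
#   PerlOutputFilterHandler MyApache2::FilterReverse2
```

Unlike the simpler version, this one also preserves the original line endings instead of always emitting "\n".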
Summary
• Apache 2.0 provides a very rich API
• Handlers to customize various phases of the
connection and request cycles
• I/O filters for on-the-fly data modification, both in
and out of Apache
• mod_perl exposes this API for us, and in many
ways tries to simplify it and make it feel
somewhat more “natural” for Perl programmers
For more information…
• perl.apache.org
• [email protected]
• Several books authored by mod_perl’s
authors, published by O’Reilly (Some of which
may be auctioned tomorrow!)
Thank You!
For more information:
Issac Goldstand
[email protected]
http://www.beamartyr.net/
http://www.mirimar.net/