Introduction to mod_perl
Issac Goldstand
Mirimar Networks
www.mirimar.net
[email protected]
mod_what?
• Apache module which exposes the entire
Apache API via Perl
• Allows us to write handlers and filters
• mod_perl is for Apache 1.3, mod_perl2 is
for Apache 2.x
CGI on Steroids™
• Classic known use of mod_perl is to simply
cache compiled Perl code in memory
• Don’t put this down: it’s been known to be up to
100 times faster than CGI scripts with mod_cgi
• (In mod_perl2, this is accomplished by using
ModPerl::Registry to handle all CGI scripts)
• But… That’s barely scratching the surface of
mod_perl’s abilities
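A minimal sketch of what "cached in memory" means in practice. The script name `hello.pl` is hypothetical, and the directives in the comments are the usual way to route a directory through ModPerl::Registry; adjust them to your own setup.

```perl
#!/usr/bin/perl
# file: hello.pl (hypothetical), served from a location configured with:
#   SetHandler perl-script
#   PerlResponseHandler ModPerl::Registry
#   PerlOptions +ParseHeaders
#   Options +ExecCGI
use strict;
use warnings;

# Under mod_cgi every request starts a fresh process, so this is always 1.
# Under ModPerl::Registry the compiled script stays inside the Apache child,
# so the package variable keeps counting for as long as that child lives.
our $hits;
$hits++;

print "Content-type: text/plain\n\n";
print "Request number $hits handled by process $$\n";
```

The same persistence that makes Registry fast is also why scripts relying on "fresh" global state on every run may need a review before being moved over.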
What’s your handle?
• Apache defines a request chain which
each request must pass through
• Along this chain are various hooks
• A module defines a handler which
intercepts the request at a specific hook
and decides how to handle the request
I/O Filters
• Perl modules exist to capture STDOUT
stream
– Limited to STDOUT only
– Not the easiest interfaces in the world…
– Can only capture output from other Perl code
• Apache 2.0 introduces I/O filters for both
incoming requests and outgoing
responses.
Protocol Handlers
• Apache 2.0 is designed to be a more
generic server which can handle protocols
other than HTTP
• Simple implementation of SMTP exists in
Perl (Apache::SMTP)
• You can write your own custom protocols
• This is somewhat beyond the scope of this
lecture
PostReadRequest
• Hook that comes immediately after
request is read by Apache and headers
are processed
• Useful for preprocessing that needs to be
done once per request
• Example: Set the special Perl variable $^T
($BASETIME) for use with the -M, -A and -C
filetests
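A sketch of the $^T example, assuming a hypothetical module name `MyApache2::TimeReset`. It would be enabled at server level with `PerlPostReadRequestHandler MyApache2::TimeReset`.

```perl
# file: MyApache2/TimeReset.pm (hypothetical module name)
package MyApache2::TimeReset;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;

    # $^T normally holds the time the Perl interpreter started, which in a
    # long-lived Apache child can be hours ago. Reset it to the time this
    # request arrived so -M, -A and -C measure file ages relative to now.
    $^T = $r->request_time;

    return Apache2::Const::OK;
}

1;
```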
Trans(lation)
• Used to manipulate (or “translate”) the
requested URI
• Many “standard” modules, such as mod_alias
and mod_rewrite use this heavily
• Useful to rewrite a “magic” URI into a more
standard query-string
• Example:
http://news.com/stories/27/02/2006/005.html
becomes
http://news.com/cgi-bin/story?day=27&month=02&year=2006&story=005
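The rewrite above could be sketched as a translation handler like the one below. The module name `MyApache2::StoryURI`, the exact URI pattern and the target path `/cgi-bin/story` are assumptions made for illustration. It would be enabled with `PerlTransHandler MyApache2::StoryURI`.

```perl
# file: MyApache2/StoryURI.pm (hypothetical module name)
package MyApache2::StoryURI;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::Const -compile => 'DECLINED';

sub handler {
    my $r = shift;

    # Match /stories/DD/MM/YYYY/NNN.html and capture the four parts.
    if (my ($day, $month, $year, $story) =
            $r->uri =~ m{^/stories/(\d{2})/(\d{2})/(\d{4})/(\d+)\.html$}) {
        # Point the request at the real script and build its query string.
        $r->uri('/cgi-bin/story');
        $r->args("day=$day&month=$month&year=$year&story=$story");
    }

    # DECLINED hands the (possibly rewritten) URI on to Apache's own
    # translation, which maps it to a file as usual.
    return Apache2::Const::DECLINED;
}

1;
```

Returning DECLINED rather than OK is the design choice here: the handler only changes the URI and leaves the actual URI-to-file mapping to the modules that already do it.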
MapToStorage
• Used to map final URI to a physical file
• In Apache 1.3, this was part of Trans phase
• Normally, default behavior is what you’ll want
here – mod_alias will usually do anything fancy
you could want
• Useful for shaving some time off the request by
bypassing Apache’s attempt to find a physical
file if you have a custom response handler
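A minimal sketch of the "shave some time off" case, assuming a hypothetical module name `MyApache2::NoTranslation`. It would be enabled with `PerlMapToStorageHandler MyApache2::NoTranslation` for URIs served entirely by a custom response handler.

```perl
# file: MyApache2/NoTranslation.pm (hypothetical module name)
package MyApache2::NoTranslation;

use strict;
use warnings;

use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;

    # Returning OK tells Apache the mapping is already done, so the default
    # handler for this phase (and the filesystem lookups it performs) is
    # skipped for this request.
    return Apache2::Const::OK;
}

1;
```

Since the default directory and file walk is skipped, configuration that lives in `<Directory>` or `<Files>` sections should not be expected to apply to these requests; `<Location>` sections are the safer place for it.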
HeaderParser
• Similar to PostReadRequest, except now
we have mapped the request to its final
<Location>, <Directory>, <Files>, etc.
containers
• Useful to determine if we need to do
anything special with the requests before
processing the heavy-lifting parts of the
request phase
Init
• Special “shortcut” handler that points to either
PostReadRequest or HeaderParser, depending
on its context
• In <Location> or other blocks that refer to a
specific URI, physical path or filename, it will be
HeaderParser
• Anywhere else, it will be PostReadRequest
• Regardless, it is guaranteed to be the first
handler called for its given context
Access
• First of the “AAA” series
• Used for access restrictions
– IP Based
– Time/Date based
– Other global, and “external” factors not
connected to the specific user
• Useful if we want to deny access to all
users based on certain criteria
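A sketch of a time-based restriction, one of the criteria listed above. The module name `MyApache2::OfficeHours` and the opening hours are made up for illustration. It would be enabled with `PerlAccessHandler MyApache2::OfficeHours`.

```perl
# file: MyApache2/OfficeHours.pm (hypothetical module name)
package MyApache2::OfficeHours;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::Const -compile => qw(OK FORBIDDEN);

# Hypothetical policy: the resource is reachable from 08:00 to 17:59,
# server local time.
use constant OPEN_HOUR  => 8;
use constant CLOSE_HOUR => 18;

sub handler {
    my $r = shift;

    # Element 2 of localtime's list is the hour (0..23).
    my $hour = (localtime($r->request_time))[2];

    return Apache2::Const::FORBIDDEN
        if $hour < OPEN_HOUR || $hour >= CLOSE_HOUR;

    return Apache2::Const::OK;
}

1;
```

Note that nothing here looks at who the user is; that is what distinguishes this phase from the two that follow.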
Authentication (Authen)
• Second AAA handler
• Used to verify the credentials presented by a
user
– Perform login
– Validate existing session/ticket/cookie
• Only performed when the requested resource is
marked as password protected – meaning we’ll
need AuthName, AuthType and at least one
Require directive (or their custom equivalents)
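A sketch of a Basic authentication handler. The module name `MyApache2::DemoAuthen` and the hard-coded credential table are hypothetical; a real handler would consult a database, a session store or similar.

```perl
# file: MyApache2/DemoAuthen.pm (hypothetical module name)
# Assumed configuration:
#   <Location /private>
#       AuthType Basic
#       AuthName "Demo"
#       Require valid-user
#       PerlAuthenHandler MyApache2::DemoAuthen
#   </Location>
package MyApache2::DemoAuthen;

use strict;
use warnings;

use Apache2::Access ();
use Apache2::RequestRec ();
use Apache2::Const -compile => qw(OK HTTP_UNAUTHORIZED);

# Hypothetical credential store, for illustration only.
my %password_for = (alice => 'wonderland');

sub handler {
    my $r = shift;

    # Decode the Authorization header; a non-OK status means the client
    # has not sent credentials yet and must be challenged.
    my ($status, $password) = $r->get_basic_auth_pw;
    return $status unless $status == Apache2::Const::OK;

    my $user = $r->user;
    return Apache2::Const::OK
        if defined $user
        && defined $password_for{$user}
        && $password_for{$user} eq $password;

    # Wrong credentials: ask the browser to prompt again.
    $r->note_basic_auth_failure;
    return Apache2::Const::HTTP_UNAUTHORIZED;
}

1;
```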
Authorization (Authz)
• Final AAA phase
• Decides whether the credentials validated
in the Authentication phase can access the
requested resource
• Closely linked to Authen phase, and will
only run after a successful Authen handler
Type
• Used to set the proper MIME type
(Content-Type) for the response
• Typically, also sets other type information,
such as content language
• Tends not to be overly used, as it’s an “all
or nothing” scenario, and things can still
be changed in the Fixup phase
Fixup
• Last chance to change things before the
response is generated
• A good chance to populate the environment with
variables (as mod_env does, and as mod_ssl does in
Apache 1)
• Useful for passing information in an arbitrary
manner to the response handler (which may be
an external process, such as with mod_cgi, or in
a non-Apache-specific language, such as
Registry scripts, PHP, Ruby, Python, etc.)
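A sketch of a fixup handler that passes values along through the subprocess environment. The module name `MyApache2::FixupEnv` and the variable names are hypothetical. It would be enabled with `PerlFixupHandler MyApache2::FixupEnv`.

```perl
# file: MyApache2/FixupEnv.pm (hypothetical module name)
package MyApache2::FixupEnv;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;

    # Entries in the subprocess environment table are what CGI-style
    # response handlers typically receive as environment variables.
    $r->subprocess_env(MYAPP_MODE         => 'demo');
    $r->subprocess_env(MYAPP_REQUEST_TIME => $r->request_time);

    return Apache2::Const::OK;
}

1;
```

The response handler needs no knowledge of mod_perl to use this: a CGI program or Registry script would usually just read the values from its environment.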
Response
• The most popularly used handler phase
• Arguably the most important phase
• Used to generate and return the response
to the client
• We’ll use this handler when (time
permitting) we show an example
Log
• Just because we’ve satisfied the client doesn’t
mean our work is done!
• Used to log the information known about the
request, user and response
• Executed regardless of what happens in the
previous handlers
• Can be used to add custom information to the
logs (ads displayed, cookie values, etc)
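A sketch of a log-phase handler, assuming a hypothetical module name `MyApache2::LogSummary`. It writes one summary line per request to the error log; a real handler might write to its own file or a database instead. It would be enabled with `PerlLogHandler MyApache2::LogSummary`.

```perl
# file: MyApache2/LogSummary.pm (hypothetical module name)
package MyApache2::LogSummary;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::Log ();
use Apache2::Const -compile => 'OK';

sub handler {
    my $r = shift;

    # By now the response has been sent, so the final status and byte
    # count are available alongside the request details.
    $r->log->notice(sprintf '%s %s -> status %d, %d bytes sent',
        $r->method, $r->uri, $r->status, $r->bytes_sent);

    return Apache2::Const::OK;
}

1;
```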
Cleanup
• The last phase
• Actually, exists only in mod_perl and not in
Apache itself
• Happens after the client is disconnected, but
before request object is destroyed
• Used to clean up anything after the request
(temp files, locks, etc)
• While the client won’t have to wait for this,
Apache will not re-queue the Perl interpreter
until it finishes!
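A sketch of a response handler that registers its own cleanup, assuming a hypothetical module name `MyApache2::TempFileDemo`. The temporary file stands in for whatever per-request resource needs to be released.

```perl
# file: MyApache2/TempFileDemo.pm (hypothetical module name)
package MyApache2::TempFileDemo;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::RequestUtil ();
use Apache2::Const -compile => 'OK';
use File::Temp ();

sub handler {
    my $r = shift;

    # Create a scratch file that lives only for this request.
    my ($fh, $path) = File::Temp::tempfile();
    print {$fh} "scratch data\n";
    close $fh;

    # Queue the removal for the cleanup phase, so the client does not
    # wait for it.
    $r->push_handlers(PerlCleanupHandler => sub {
        unlink $path;
        return Apache2::Const::OK;
    });

    $r->content_type('text/plain');
    $r->print("Done; the temp file is removed after the response.\n");

    return Apache2::Const::OK;
}

1;
```

Keep the cleanup short: as noted above, the interpreter stays busy until it returns.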
Example: Hello World
• The moment we’ve all been waiting for!
#file:MyApache2/Hello.pm
package MyApache2::Hello;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::Const -compile => 'OK';
sub handler {
    my $r = shift;
    $r->content_type('text/plain');
    $r->print('Hello, Apache2/mod_perl World!');
    return Apache2::Const::OK;
}
1;
Preparing httpd.conf
# Don’t forget to LoadModule perl_module modules/mod_perl.so
PerlModule MyApache2::Hello
<Location /hello>
    SetHandler modperl
    PerlResponseHandler MyApache2::Hello
</Location>
Response
Hello, Apache2/mod_perl World!
I/O Filters
• Used to make transformations to the
request and response without modifying
the “core” handler code
• Multiple filters can be used in conjunction
with one another
• Example: Content compression,
obfuscation, machine translation (eg,
English -> German)
Buckets and Bucket Brigades
• In the filter implementation, chunks of data, called
buckets, are passed between the various filters
• The pipeline of buckets is called a bucket brigade
Disclaimer
• I don’t like the picture on the next slide
• It’s the picture used on the mod_perl
website (and the most popular and only
useful picture in google images when
searching for bucket brigades)
• I personally find it to be somewhat
inaccurate, but since lots of other people
seem to find it helpful, we’ll look (or at
least glance) at it anyway
My humble version
[Diagram labels: From request handler, OUTPUT FILTER, To network]
Modifying data
[Diagram labels: From request handler (DATA buckets), OUTPUT FILTER, To network (ADAT)]
Adding data
[Diagram labels: From request handler (DATA buckets), OUTPUT FILTER, To network (NEW, ADAT)]
Removing data
[Diagram labels: From request handler (DATA buckets), OUTPUT FILTER, To network (NEW, ADAT)]
Bucket brigade code sample
#file:MyApache2/InputRequestFilterLC.pm
package MyApache2::InputRequestFilterLC;
use strict;
use warnings;
use base qw(Apache2::Filter);
use Apache2::Connection ();
use APR::Brigade ();
use APR::Bucket ();
use Apache2::Const -compile => 'OK';
use APR::Const -compile => ':common';
sub handler : FilterRequestHandler {
    my ($f, $bb, $mode, $block, $readbytes) = @_;
    my $c = $f->c;
    # Pull the next part of the request body from the upstream filter
    # into a temporary brigade
    my $bb_ctx = APR::Brigade->new($c->pool, $c->bucket_alloc);
    my $rv = $f->next->get_brigade($bb_ctx, $mode, $block, $readbytes);
    return $rv unless $rv == APR::Const::SUCCESS;
    while (!$bb_ctx->is_empty) {
        my $b = $bb_ctx->first;
        # Pass the end-of-stream bucket through and stop
        if ($b->is_eos) {
            $bb->insert_tail($b);
            last;
        }
        # Buckets that carry data are replaced by a lowercased copy
        my $len = $b->read(my $data);
        $b = APR::Bucket->new($bb->bucket_alloc, lc $data) if $len;
        $b->remove;
        $bb->insert_tail($b);
    }
    Apache2::Const::OK;
}
1;
Looks fun, eh?
• Filters are expected to manage their own
state
• Filters are expected to take care of the
buckets and brigades they are connected
to
• Filters are expected to obey “special”
buckets, like flush or EOS (end of stream)
mod_perl to the rescue!
• mod_perl provides an alternate “stream
oriented” filter scheme, for the weak of
heart
• In this scheme, two methods, read() and
print() do the bulk of the work
• mod_perl manipulates the bucket brigades
for us behind the scenes
Stream oriented filter sample
#file:MyApache2/FilterReverse1.pm
package MyApache2::FilterReverse1;
use strict;
use warnings;
use base qw(Apache2::Filter);
use Apache2::Const -compile => qw(OK);
use constant BUFF_LEN => 1024;
sub handler : FilterRequestHandler {
    my $f = shift;
    while ($f->read(my $buffer, BUFF_LEN)) {
        for (split "\n", $buffer) {
            $f->print(scalar reverse $_);
            $f->print("\n");
        }
    }
    Apache2::Const::OK;
}
1;
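One caveat of this sample: read() hands the filter at most BUFF_LEN bytes per call, so a line that straddles two reads is reversed in two pieces. The plain-Perl sketch below needs no Apache; `reverse_chunks` is a hypothetical helper that mimics the loop above on an in-memory string, with a tiny buffer to make the effect visible.

```perl
use strict;
use warnings;

# Mimic the filter loop: consume $input in pieces of at most $buff_len
# bytes and reverse every line found in each piece.
sub reverse_chunks {
    my ($input, $buff_len) = @_;
    my $out = '';
    my $pos = 0;
    while ($pos < length($input)) {
        my $buffer = substr($input, $pos, $buff_len);
        $pos += $buff_len;
        for (split "\n", $buffer) {
            $out .= scalar reverse $_;
            $out .= "\n";
        }
    }
    return $out;
}

# Fits in one read: the whole line is reversed.
print reverse_chunks("Hello, Apache2/mod_perl World!", 1024);
# prints "!dlroW lrep_dom/2ehcapA ,olleH"

# Does not fit in one read: each piece is reversed separately.
print reverse_chunks("abcdef", 4);
# prints "dcba" and "fe" on separate lines
```

For short responses like Hello World this never shows up, which is why the simple version is fine for a first example.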
Setup in httpd.conf
# Don’t forget to LoadModule perl_module modules/mod_perl.so
PerlModule MyApache2::Hello
PerlModule MyApache2::FilterReverse1
<Location /hello>
    SetHandler modperl
    PerlResponseHandler MyApache2::Hello
    PerlOutputFilterHandler MyApache2::FilterReverse1
</Location>
Response
!dlroW lrep_dom/2ehcapA ,olleH
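Because filters are expected to manage their own state, a more robust variant can carry an unfinished line from one invocation to the next in the filter context. The sketch below assumes a hypothetical module name `MyApache2::FilterReverse2`, enabled with `PerlOutputFilterHandler MyApache2::FilterReverse2`; it is one possible way to do it, not the only one.

```perl
# file: MyApache2/FilterReverse2.pm (hypothetical module name)
package MyApache2::FilterReverse2;

use strict;
use warnings;

use base qw(Apache2::Filter);

use Apache2::Const -compile => qw(OK);

use constant BUFF_LEN => 1024;

sub handler {
    my $f = shift;

    # Pick up any partial line saved by the previous invocation.
    my $leftover = $f->ctx;

    while ($f->read(my $buffer, BUFF_LEN)) {
        $buffer = $leftover . $buffer if defined $leftover;
        $leftover = undef;

        # Emit each complete line reversed, keeping its line ending.
        while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
            # No line ending yet: hold the fragment back for later.
            $leftover = $1, last unless $2;
            $f->print(scalar(reverse $1), $2);
        }
    }

    if ($f->seen_eos) {
        # End of stream: flush whatever is still pending.
        $f->print(scalar reverse $leftover) if defined $leftover;
    }
    else {
        # More data will follow: remember the fragment.
        $f->ctx($leftover) if defined $leftover;
    }

    return Apache2::Const::OK;
}

1;
```

Unlike the first version, this one preserves the original line endings instead of always appending "\n".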
Summary
• Apache 2.0 provides a very rich API
• Handlers to customize various phases of the
connection and request cycles
• I/O filters for on-the-fly data modification, both in
and out of Apache
• mod_perl exposes this API for us, and in many
ways tries to simplify it and make it feel
somewhat more “natural” for Perl programmers
For more information…
• perl.apache.org
• [email protected]
• Several books authored by mod_perl’s
authors, published by O’Reilly (some of which
may be auctioned tomorrow!)
Thank You!
For more information:
Issac Goldstand
[email protected]
http://www.beamartyr.net/
http://www.mirimar.net/