NAME

mod_perl_traps - common/known mod_perl traps

DESCRIPTION

In the CGI environment, the server starts a single external process (Perl interpreter) per HTTP request, which runs a single script in that process space. When the request is over, the process goes away, everything is cleaned up, and a fresh script is started for the next request. mod_perl brings Perl inside of the HTTP server not only for speedup of CGI scripts, but also for access to server functionality that CGI scripts do not and/or cannot have. Now that we're inside the server, each process will likely handle more than one Perl script and keep it "compiled" in memory for longer than a single HTTP request. This new location and longer lifetime of Perl execution brings with it some common traps.

This document is here to tell you what they are and how to prevent them. The descriptions here are short; please consult the mod_perl FAQ for more detail. If you trip over something not documented here, please send a message to the mod_perl list.

Migrating from CGI

Apache::Registry

undefined subroutine &Apache::Registry::handler

Interaction with certain modules causes the shortcut configuration to break. If you see this message, change your configuration from this:

 <Location /perl>
 PerlHandler Apache::Registry
 ...
 </Location>

To this:

 PerlModule Apache::Registry
 <Location /perl>
 PerlHandler Apache::Registry::handler
 ...
 </Location>

Using CGI.pm and CGI::*

Perl Modules and Extensions

Clashes with other Apache C modules

mod_auth_dbm

If you are a user of mod_auth_dbm or mod_auth_db, you may need to edit Perl's Config module. When Perl is configured, it attempts to find libraries for ndbm, gdbm, db, etc., for the *DBM*_File modules. By default, these libraries are linked with Perl and remembered by the Config module. When mod_perl is configured with Apache, the ExtUtils::Embed module returns these libraries to be linked with httpd so Perl extensions will work under mod_perl. However, the order in which these libraries are stored in Config.pm may confuse mod_auth_db*. If mod_auth_db* does not work with mod_perl, take a look at this order with the following command:

 % perl -V:libs

If -lgdbm or -ldb appears before -lndbm, as in this example:

 libs='-lnet -lnsl_s -lgdbm -lndbm -ldb -ldld -lm -lc -lndir -lcrypt';

Edit Config.pm and move -lgdbm and -ldb to the end of the list. Here's how to find Config.pm:

 % perl -MConfig -e 'print "$Config{archlibexp}/Config.pm\n"'
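The order check described above can also be scripted. The following is only a sketch: the helper name libs_before_ndbm is hypothetical and is not part of mod_perl or of the Config module. It reads the same libs value that perl -V:libs prints and reports whether -lgdbm or -ldb is listed ahead of -lndbm.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper (not part of any module): given a libs string,
# return those of -lgdbm / -ldb that are listed before -lndbm.
sub libs_before_ndbm {
    my ($libs) = @_;
    my @libs = split ' ', $libs;

    # Index of -lndbm in the list, if it is present at all.
    my ($ndbm_at) = grep { $libs[$_] eq '-lndbm' } 0 .. $#libs;
    return () unless defined $ndbm_at;

    # Only the entries in front of -lndbm matter for this check.
    return grep { $_ eq '-lgdbm' || $_ eq '-ldb' } @libs[0 .. $ndbm_at - 1];
}

my @early = libs_before_ndbm($Config{libs});
print @early
    ? "Listed before -lndbm: @early\n"
    : "No -lgdbm or -ldb found before -lndbm\n";
```

For the libs line shown in the example above, the helper returns -lgdbm, which is the entry to move to the end of the list.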

Another solution for building Apache/mod_perl+mod_auth_dbm under Solaris is to remove the DBM and NDBM "emulation" from libgdbm.a. Solaris already provides its own DBM and NDBM, so there's no reason to build GDBM with them (for us anyway).

In our Makefile for GDBM, we changed

  OBJS = $(DBM_OF) $(NDBM_OF) $(GDBM_OF)

to

  OBJS = $(GDBM_OF)

Rebuild libgdbm, then Apache/mod_perl.

REGULAR EXPRESSIONS

COMPILED REGULAR EXPRESSIONS

When a regular expression contains an interpolated Perl variable, and it is known that the variable (or variables) will not vary during the execution of the program, a standard optimization technique consists of adding the o modifier to the regexp pattern. This directs the compiler to build the internal table once, for the entire lifetime of the script, rather than every time the pattern is executed. Consider:

        my $pat = '^foo$'; # likely to be input from an HTML form field
        foreach( @list ) {
                print if /$pat/o;
        }

This is usually a big win in loops over lists, or when using grep or map.

In long-lived mod_perl scripts, however, this can pose a problem if the variable changes according to the invocation. The first invocation of a fresh httpd child will compile the table and perform the search correctly. However, all subsequent uses by the httpd child will continue to match the original pattern, regardless of the current contents of the Perl variables the pattern is dependent on. Your script will appear broken.
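The effect can be reproduced outside the server. The following is a minimal sketch, not mod_perl code: the subroutine grep_with_o is a hypothetical stand-in for a script body that one httpd child runs for two successive requests, each with a different pattern.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a script body that stays compiled in memory
# and is run once per request by the same httpd child.
sub grep_with_o {
    my ($pat, @list) = @_;
    # With the o modifier, $pat is interpolated and compiled only the
    # first time this match operator is executed.
    return grep { /$pat/o } @list;
}

my @first  = grep_with_o('^foo$', qw(foo bar));   # "first request"
my @second = grep_with_o('^bar$', qw(foo bar));   # "second request"

print "first:  @first\n";
print "second: @second\n";
```

The second call is expected to return foo rather than bar, because the match operator still holds the pattern compiled during the first call.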

There are two solutions to this problem.

The first is to use eval q//, to force the code to be evaluated each time. Just make sure that the eval block covers the entire loop of processing, and not just the pattern match itself.

The above code fragment would be rewritten as:

        my $pat = '^foo$';
        eval q{
                foreach( @list ) {
                        print if /$pat/o;
                }
        };

Just saying

        eval q{ print if /$pat/o; };

is going to be a horribly expensive proposition.
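As a sketch of the first solution under the same assumptions (the subroutine grep_with_eval is a hypothetical stand-in for a script body run once per request), the whole loop is placed inside the evaluated string, so the match operator is compiled afresh on every call and the o modifier only holds for the duration of that call.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a script body run once per request.
sub grep_with_eval {
    my ($pat, @list) = @_;
    my @hits;
    # The string is recompiled on every call, so /$pat/o picks up the
    # current value of $pat; within one call it is still compiled once.
    eval q{
        foreach (@list) {
            push @hits, $_ if /$pat/o;
        }
        1;
    } or die $@;
    return @hits;
}

my @first  = grep_with_eval('^foo$', qw(foo bar));
my @second = grep_with_eval('^bar$', qw(foo bar));

print "first:  @first\n";
print "second: @second\n";
```

Here the second call is expected to return bar, at the price of one string compilation per call rather than one per list element.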

Use this approach if you require more than one pattern match operator in a given section of code. If the section contains only one operator (be it an m// or s///), you can rely on the property of the null pattern, which reuses the last successfully matched pattern. This leads to the second solution, which also eliminates the use of eval.

The above code fragment becomes:

        my $pat = '^foo$';
        "foo" =~ /$pat/; # dummy match against text known to match $pat (MUST NOT FAIL!)
        foreach( @list ) {
                print if //;
        }

The only gotcha is that the dummy match that boots the regular expression engine must absolutely, positively succeed, otherwise the pattern will not be cached, and the // will match everything. If you can't count on fixed text to ensure the match succeeds, you have two possibilities.

If you can guarantee that the pattern variable contains no meta-characters (things like *, +, ^, $...), you can use the dummy match:

        "$pat" =~ /\Q$pat\E/; # guaranteed if no meta-characters present

If there is a possibility that the pattern can contain meta-characters, you should search for the pattern or the unsearchable \377 character as follows:

        "\377" =~ /$pat|^[\377]$/; # guaranteed if meta-characters present
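Putting the second solution together, again as a sketch with a hypothetical subroutine name (grep_with_null_pattern) standing in for a script body run once per request: the dummy match is compiled on every call without the o modifier, and the loop then reuses it through the null pattern.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a script body run once per request.
sub grep_with_null_pattern {
    my ($pat, @list) = @_;

    # Dummy match: succeeds either because $pat itself matches or through
    # the \377 alternative, so the pattern is always remembered.
    "\377" =~ /$pat|^[\377]$/ or die "dummy match failed";

    my @hits;
    foreach (@list) {
        # The null pattern reuses the last successfully matched pattern.
        push @hits, $_ if //;
    }
    return @hits;
}

my @first  = grep_with_null_pattern('^foo$', qw(foo bar));
my @second = grep_with_null_pattern('^bar$', qw(foo bar));

print "first:  @first\n";
print "second: @second\n";
```

Note that the remembered pattern includes the \377 alternative, so a list element consisting solely of that character would also be reported.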

References

        The Camel Book, 2nd edition, p. 538 (p. 356 in the 1st edition).

AUTHORS

Doug MacEachern, with contributions from Jens Heunemann <heunemann2@janet.de>, David Landgren <david@landgren.net>, Mark Mills <mark@ntr.net>, Randal Schwartz <merlyn@stonehenge.com> and Ask Bjoern Hansen <ask@develooper.com>