NAME
SYNOPSIS
DESCRIPTION
- Additional methods
- Types of server
SEE ALSO
BUGS
AUTHOR
HISTORY

NAME

NetServer::Generic - generic TCP/IP server class

SYNOPSIS

  use NetServer::Generic;

  my $server_cb = sub {
      my ($s) = shift;
      print STDOUT "Echo server: type bye to quit, exit ",
                   "to kill the server.\n";
      while (defined (my $tmp = <STDIN>)) {
          return if ($tmp =~ /^bye/i);
          $s->quit() if ($tmp =~ /^exit/i);
          print STDOUT "You said:>$tmp\n";
      }
  };
  my ($foo) = NetServer::Generic->new;
  $foo->port(9000);
  $foo->callback($server_cb);
  $foo->mode("forking");
  print "Starting server\n";
  $foo->run();

DESCRIPTION

NetServer::Generic provides a (very) simple server daemon for TCP/IP processes. It is intended to free the programmer from having to think too hard about networking issues so that they can concentrate on doing something useful.

The NetServer::Generic object accepts the following methods, which configure various aspects of the new server:

port

The port to listen on.

hostname

The local address to bind to. If no address is specified, listens for any connection on the designated port.

listen

Queue size for listen.

proto

Protocol to listen on (defaults to tcp).

timeout

Timeout value (see IO::Socket::INET)

allowed

List of IP addresses or hostnames that are explicitly allowed to connect to the server. If empty, the default policy is to allow connections from anyone not in the 'forbidden' list.

NOTE: IP addresses or hostnames may be specified as perl regular expressions; for example 154\.153\.4\..* matches any IP address beginning with '154.153.4.'; .*antipope\.org matches any hostname in the antipope.org domain.

forbidden

List of IP addresses or hostnames that are refused permission to connect to the server. If empty, the default policy is to refuse connections from anyone not in the 'allowed' list (unless the allowed list is empty, in which case anyone may connect).

callback

Coderef to a subroutine which handles incoming connections (called with one parameter -- a NetServer::Generic object which can be used to shut down the session).

mode

Can be one of forking, select, select_fast, client, threaded, or prefork.

By default, forking mode is selected.

If forking mode is selected, the server handles requests by forking a child process to service them. If select mode is selected, the server uses the IO::Select class to implement a simple non-forking server.

The select-based server may block on i/o on a heavily-loaded system. If you need to do non-blocking i/o you should look at NetServer::FastSelect.

The client mode is special; it indicates that rather than sitting around waiting for an incoming connection, the server is itself a TCP/IP client. In client mode, hostname is the remote host to connect to and port is the remote port to open. The callback routine is used, as elsewhere, but it should be written as for a client -- i.e. it should issue a request or command, then read.

An additional method exists for client mode: trigger. trigger expects a coderef as a parameter. This coderef is executed before the client-mode server spawns a child; if it returns a non-zero value the child is forked and opens a client connection to the target host, otherwise the server exits. The trigger method may be used to sleep for a random interval then return 1 (so that repeated clients are spawned at random intervals), or fork several children (on a one-time-only basis) then work as above (so that several clients poke at the target server on a random basis).

The default trigger method returns 1 immediately the first time it is called, then returns 0 -- this means that the client makes a single connection to the target host, invokes the callback routine, then exits. (See the test examples which come with this module for examples of how to use client mode.)

Note that client mode relies on the fork() system call.
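As a rough illustration of the trigger contract described above, here is a sketch in plain Perl that does not touch the network. The names make_once_trigger, make_random_trigger and count_spawns are made up for this example and are not part of NetServer::Generic; count_spawns merely stands in for the client-mode loop, which in the real module forks a child each time the trigger returns a non-zero value.

```perl
use strict;
use warnings;

# Hypothetical factory mirroring the documented default behaviour: the
# returned trigger yields 1 on its first call and 0 on every later call.
sub make_once_trigger {
    my $fired = 0;
    return sub { return $fired++ ? 0 : 1 };
}

# Hypothetical factory: permit up to $limit client connections, sleeping for
# a random whole number of seconds (0 .. $max_delay) before each one.
sub make_random_trigger {
    my ($limit, $max_delay) = @_;
    my $count = 0;
    return sub {
        return 0 if $count >= $limit;
        $count++;
        sleep int(rand($max_delay + 1)) if $max_delay > 0;
        return 1;
    };
}

# Stand-in for the client-mode loop: keep "spawning" while the trigger
# returns a true value, and report how many spawns took place.
sub count_spawns {
    my ($trigger) = @_;
    my $spawned = 0;
    $spawned++ while $trigger->();
    return $spawned;
}

print count_spawns(make_once_trigger()), "\n";        # prints 1
print count_spawns(make_random_trigger(3, 0)), "\n";  # prints 3
```

With the real module, a coderef built this way would presumably be handed over with something like $server->trigger(make_random_trigger(3, 5)) before calling run().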

The threaded mode indicates that multithreading will be used to service requests. This feature requires Perl 5.005 or higher and a native threads library to run, so it's not 100% portable. Moreover, it's unreliable! Don't use this mode unless you're prepared to do some debugging.

The prefork mode indicates that the server will bind to the designated port, then fork repeatedly up to $start_servers times (where start_servers is a scalar parameter to NetServer::Generic). Each child then enters a select-based loop (i.e. run_select), but exits after handling $server_lifespan transactions (where server_lifespan is another parameter to NetServer::Generic). Every time a child handles a transaction it writes its PID and generation number down a pipe to the parent process, with a message when it exits. The parent keeps track of how many servers are in use and fires up extra children (up to $max_servers) if the number in use leaves fewer than $min_spare_servers free. See the example preforked-shttpd for a minimal HTTP 0.9 server implemented using the prefork mode.
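The parent's bookkeeping in prefork mode can be pictured with the little sketch below. It is not the module's actual code: children_to_spawn is a hypothetical helper, and the arithmetic is only one plausible reading of the rule stated above (top the pool up to min_spare_servers idle children, never exceeding max_servers in total).

```perl
use strict;
use warnings;

# Hypothetical helper: given the current pool state, how many extra
# children should the parent start?
sub children_to_spawn {
    my (%arg) = @_;
    my $spare = $arg{running} - $arg{busy};            # idle children
    return 0 if $spare >= $arg{min_spare_servers};     # enough spares already

    my $wanted   = $arg{min_spare_servers} - $spare;   # shortfall of spares
    my $headroom = $arg{max_servers} - $arg{running};  # room left under the cap
    $headroom = 0 if $headroom < 0;
    return $wanted < $headroom ? $wanted : $headroom;
}

# Five children running, four of them busy, two spares wanted.
print children_to_spawn(
    running           => 5,
    busy              => 4,
    min_spare_servers => 2,
    max_servers       => 20,
), "\n";
```

In this example one child is idle and two spares are wanted, so the helper asks for one more child; had the pool already been at max_servers it would ask for none.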

Of the methods listed above, callback is the most important; it specifies a reference to a subroutine which effectively does whatever the server does.

A callback subroutine is a normal Perl subroutine. It is invoked with STDIN and STDOUT attached to an IO::Socket::INET object, so that reads from STDIN get information from the client, and writes to STDOUT send information to the client. Note that both STDIN and STDOUT are unbuffered. In addition, a NetServer::Generic object is passed as an argument (but the callback is free to ignore it).

Your server reads and writes data via the socket as if it is the standard input and standard output filehandles; for example:

  while (defined (my $tmp = <STDIN>)) {    # read a line from the socket
      print STDOUT "You said: $tmp\n";     # print something to the socket
  }

(See IO::Handle and IO::Socket for more information on this.)

If you're not familiar with sockets, don't get too fresh and try to close or seek on STDIN or STDOUT; just treat them like a file.

The server object is not strictly necessary in the callback, but comes in handy: you can shut down the server completely by calling the quit() method.

When writing a callback subroutine, remember to define some condition under which you return!
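Because a callback only ever talks to STDIN and STDOUT, it can be tried out without a network at all. The sketch below assumes nothing about the module's internals: $echo_cb is an illustrative callback with an explicit exit condition, and run_offline is a hypothetical test helper that temporarily points STDIN and STDOUT at in-memory buffers. The server object is replaced by undef, which is fine only because this particular callback ignores it.

```perl
use strict;
use warnings;

# Illustrative callback: echo each line back until the client says "bye".
my $echo_cb = sub {
    my ($server) = @_;                  # unused in this example
    while (defined(my $line = <STDIN>)) {
        $line =~ s/\r?\n\z//;           # strip LF or CRLF line endings
        return if $line =~ /^bye/i;     # exit condition: end this session
        print STDOUT "You said: $line\n";
    }
    return;                             # client closed the connection
};

# Hypothetical helper: run a callback against canned input and collect
# whatever it writes, using in-memory filehandles instead of a socket.
sub run_offline {
    my ($cb, $input) = @_;
    my $output = '';
    {
        local *STDIN;
        local *STDOUT;
        open(STDIN,  '<', \$input)  or die "cannot reopen STDIN: $!";
        open(STDOUT, '>', \$output) or die "cannot reopen STDOUT: $!";
        $cb->(undef);
        close(STDOUT);
    }
    return $output;
}

my $reply = run_offline($echo_cb, "hello\r\nworld\nbye\nignored\n");
print $reply;
```

The callback returns as soon as it sees "bye", so the final line of input is never echoed.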

Here's a slightly more complex server example:

 # minimal http server (HTTP/0.9):
 # this is a REALLY minimal HTTP server. It only understands GET
 # requests, does no header parsing whatsoever, and doesn't understand
 # relative addresses! Nor does it understand CGI scripts. And it ain't
 # suitable as a replacement for Apache (at least, not any time soon :).
 # The base directory for the server and the default
 # file name are defined in url_to_file(), which maps URLs to
 # absolute pathnames. The server code itself is defined in the
 # closure $http, which shows how simple it is to write a server
 # using this module.

 sub url_to_file($) {
   # for a given URL, turn it into an absolute pathname
   my ($u) = shift ;  # incoming URL fragment from GET request
   my ($f) = "";      # file pathname to return
   my ($htbase) = "/usr/local/etc/httpd/docs/";
   my ($htdefault) = "index.html";
   $u =~ s/\r$//;     # drop a trailing CR; the caller has already chomped the LF
   if ($u eq "/") {
       $f = $htbase . $htdefault;
       return $f;
   } else {
       if ($u =~ m|^/.+|) {
           $f = $htbase;  chop $f;
           $f .= $u;
       } elsif ($u =~ m|[^/]+|) {
           $f = $htbase . $u;
       }
       if ($u =~ m|.+/$|) {
           $f .= $htdefault;
       }
       if ($f =~ /\.\./) {
           my (@path) = split("/", $f);
           my ($buff, $acc) = ("", "");
           shift @path;
           while ($buff = shift @path) {
               my ($tmp) = shift @path;
               if ($tmp ne '..') {
                   unshift @path, $tmp;
                   $acc .= "/$buff";
               }
           }
           $f = $acc;
       }
   }
   return $f;
 }

 use IO::File;

 my ($http) = sub {
     my ($fh) = shift ;
     while (defined (my $tmp = <STDIN>)) {
         chomp $tmp;
         if ($tmp =~ /^GET\s+(.*)$/i) {
             my $getfile = $1;
             $getfile = url_to_file($getfile);
             print STDERR "Sending $getfile\n";
             my ($in) = IO::File->new();
             if ($in->open($getfile, "r")) {
                 print STDOUT "Content-type: text/html\n\n";
                 while (defined (my $line = <$in>)) {
                     print STDOUT $line;
                 }
                 $in->close();
             } else {
                 print STDOUT "404: File not found\n\n";
             }
         }
         return 0;    # one request per connection
     }
 };

 # main program starts here

 use NetServer::Generic;

 my (%config) =  ("port"     => 9000, 
                  "callback" => $http, 
                  "hostname" => "public.antipope.org");

 my ($allowed) = ['.*antipope\.org', 
                  '.*localhost.*'];

 my ($forbidden) = [ '194\.205\.10\.2'];

 my ($foo) = NetServer::Generic->new(%config); # create new http server bound to
                                               # port 9000 of public.antipope.org
 $foo->allowed($allowed);         # who is allowed to connect to us
 $foo->forbidden($forbidden);     # who is refused access
 print "Starting http server on port 9000\n";
 $foo->run();                     
 exit 0;
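To make the allowed/forbidden policy used in this example a little more concrete, here is a stand-alone sketch of how such lists might be evaluated. It is emphatically not the module's own checking code: may_connect is a hypothetical helper, and two points are assumptions on my part, namely that each pattern must match the whole hostname or IP address, and that a match in the forbidden list wins when a peer appears in both lists.

```perl
use strict;
use warnings;

# Hypothetical helper: may a peer, identified by hostname and IP address,
# connect? Patterns are Perl regular expressions, as in the lists above.
sub may_connect {
    my ($allowed, $forbidden, @peer) = @_;   # @peer = (hostname, ip)

    # Assumption: a match in the forbidden list always wins.
    for my $pattern (@$forbidden) {
        return 0 if grep { /^$pattern$/ } @peer;
    }

    # An empty allowed list admits everyone who was not forbidden.
    return 1 unless @$allowed;

    for my $pattern (@$allowed) {
        return 1 if grep { /^$pattern$/ } @peer;
    }
    return 0;                                # not on the allowed list
}

my $allowed   = ['.*antipope\.org', '.*localhost.*'];
my $forbidden = ['194\.205\.10\.2'];

for my $peer (['public.antipope.org', '10.0.0.1'],
              ['evil.antipope.org',   '194.205.10.2'],
              ['example.com',         '192.0.2.7']) {
    my $verdict = may_connect($allowed, $forbidden, @$peer) ? "accept" : "refuse";
    print "$peer->[0] ($peer->[1]): $verdict\n";
}
```

The first peer is accepted by hostname, the second is refused because its IP address is forbidden, and the third is refused because the allowed list is non-empty and nothing in it matches.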

Additional methods

NetServer::Generic provides several extra methods.

peer()

The peer() method returns a reference to a two-element list containing the hostname and IP address of the host at the other end of the socket. If called before a connection has been received, its value will be undefined. (Don't try to assign values via peer unless you want to confuse the allowed/forbidden checking code!)

quit()

The quit() method attempts to shut down a server. If running as a forking service, it does so by sending SIGTERM (kill -15) to the parent process. If running as a select-based service it returns from run().

start_servers()

In prefork mode, specifies how many child servers to start up.

max_servers()

In prefork mode, specifies the maximum number of children to spawn under load.

min_spare_servers()

In prefork mode, specifies a number of spare (inactive) child servers; if we drop below this level (due to load), the parent will spawn additional children (up to a maximum of max_servers) until we go back over min_spare_servers.

server_lifespan()

In prefork server mode, child servers run as select servers. After server_lifespan connections they will commit suicide and be replaced by the parent. If server_lifespan is set to 1, children will effectively run once then exit (like a forking server). For purposes of insanity, a lifespan of 0 is treated like a lifespan of 1.

servername()

In the prefork server, unless you explicitly tell the server to bind to a named host, it will accept all incoming connections. Within a child server, you may need to know what local IP address an incoming connection was intended for. The servername() method can be invoked within the child server's callback and returns a two-element arrayref containing the port and IP address that the connection came in on. For example, in the callback:

  my $callback = sub {
    my $server = shift;
    my ($server_port, $server_addr) = @{ $server->servername() };
    print "Connection on $server_addr:$server_port\n";
  };

Types of server

A full discussion of internet servers is well beyond the scope of this man page. Beginners may want to start with a source like Beginning Linux Programming (which provides a simple, lucid discussion); more advanced readers may find Stevens' Advanced Programming in the UNIX Environment useful.

In general, on non-threaded systems, a forking server is slightly less efficient than a select-based server (and uses up lots of PIDs). On the other hand, a select-based server is not a good solution to high workloads or time-consuming processes such as providing an NNTP news feed to an online newsreader.

A major issue with the select-based server code in this release is that the IO::Select based server cannot know that a socket is ready until some data is received over it. (It calls can_read() to detect sockets waiting to be read from.) Thus, it is not suitable for writing servers which emit status information without first reading a request.

BUGS

There are two bugs lurking in NetServer::Generic. Or maybe they're design flaws. I don't have time to fix them right now, but maybe you'd like to contribute an hour or two and get your name in the credits?

Bug the first:

NetServer::Generic attempts to make it easy to write a server by letting the programmer concentrate on reading from STDIN and writing to STDOUT. However, this form of i/o is line oriented. NetServer::Generic relies on the buffering and i/o capabilities provided by Perl and IO::Socket respectively. It doesn't buffer its own input.

This means that in principle a malicious attacker (or just a badly-written client program) can write a stream of bytes to a NetServer::Generic application and, as long as those bytes don't include a "\n", Perl will keep gobbling it up until it runs out of virtual memory.

This can be fixed by replacing the globbed IO::Socket::INET that is attached to STDIN with something else -- probably an object that presents itself as an IO::Stringy but that does its own buffering, so that it will return either a line, or some sort of error message in $! if it sees something indigestible in its input stream. (If anyone wants to contribute a patch that fixes this, please feel free; this is an open source project, after all ...)
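Until such a patch exists, a callback can defend itself by refusing to read arbitrarily long lines. The sketch below is a workaround, not the fix proposed above: read_bounded_line is a hypothetical helper, it reports an over-long line by dying rather than via $!, and it reads a character at a time with getc for clarity rather than speed.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from $fh, but give up once $max_len
# bytes have arrived without a newline. Returns the line (including its
# "\n"), undef at end of input, or dies if the limit is exceeded.
sub read_bounded_line {
    my ($fh, $max_len) = @_;
    my $line = '';
    while (defined(my $char = getc($fh))) {
        $line .= $char;
        return $line if $char eq "\n";
        die "line exceeds $max_len bytes\n" if length($line) >= $max_len;
    }
    return length($line) ? $line : undef;   # unterminated last line, or EOF
}

# Demonstrate with an in-memory handle: one short line, then an over-long one.
my $data = "short\n" . ("x" x 100) . "\n";
open(my $in, '<', \$data) or die "cannot open in-memory handle: $!";

my $first  = read_bounded_line($in, 16);
my $second = eval { read_bounded_line($in, 16) };
my $error  = $@;

print "first line: $first";
print "second read failed: $error";
```

Inside a real callback the same helper would be called with \*STDIN in place of $in, and the callback would return (dropping the connection) when the read dies.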

Bug the second:

The select-based server was originally written because I wanted to share state information between some forking servers and I couldn't use System V shared memory (the application had to be portable to a flavour of UNIX that didn't support it).

It works okay, up to a point, but under heavy load on Linux it can run into major problems. Partly this may be attributable to deficiencies in the way Linux handles the select() system call (or so Stephen Tweedie keeps telling me), but the result is that the select-based server tends to drop some connections when it's under stress: if two connections come in while it's serving another, the first may never get processed before a timeout occurs.

A somewhat worse problem is that IO::Select doesn't do buffered (line-oriented) input; it just checks to see if one or more bytes are waiting to be read from one of the file handles it's got hold of. It is possible for a couple of bytes to come in (but not a whole line), so that the select-based server merrily tries to process a transaction and blocks until the rest of the input arrives -- thus ensuring that the server is bottlenecked by the speed of the slowest client connection.

Suggestion: if you need to serve lots of connections using select(), look at the eventserver module instead. If you're a bit more ambitious, the defect in NetServer::Generic is fixable by writing a module with a similar API to IO::Select, but which provides buffering for the file handles under its control and which only returns something in response to can_read() when one of the buffers has a complete line of input waiting.
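The heart of the buffering module suggested above is a per-handle buffer that can tell "some bytes have arrived" apart from "a whole line has arrived". Here is a minimal sketch of just that piece. LineBuffer and its methods are hypothetical names, not an existing module, and the wrapper that would feed it from sysread() and answer can_read() is left out.

```perl
use strict;
use warnings;

# Hypothetical per-connection buffer; not part of NetServer::Generic.
package LineBuffer;

sub new {
    my ($class) = @_;
    return bless { buf => '' }, $class;
}

# Append whatever bytes a read on the socket happened to deliver.
sub feed {
    my ($self, $bytes) = @_;
    $self->{buf} .= $bytes;
    return $self;
}

# True only when a whole "\n"-terminated line is waiting.
sub has_line {
    my ($self) = @_;
    return index($self->{buf}, "\n") >= 0 ? 1 : 0;
}

# Remove and return the first complete line, or undef if there is none yet.
sub next_line {
    my ($self) = @_;
    my $end = index($self->{buf}, "\n");
    return undef if $end < 0;
    return substr($self->{buf}, 0, $end + 1, '');
}

package main;

my $buf = LineBuffer->new;
$buf->feed("GE");                     # a couple of bytes, not a full line
print "ready after two bytes? ", $buf->has_line, "\n";
$buf->feed("T /index.html\nHo");      # rest of the line plus a new fragment
print "ready now? ", $buf->has_line, "\n";
print "line: ", $buf->next_line;
```

A select loop built on this would only dispatch a connection to the callback once has_line is true, so a slow client dribbling in bytes could no longer stall everyone else.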

AUTHOR

Charlie Stross (charle@antipope.org). With thanks for bugfixes and patches to Marius Kjeldahl marius@ace.funcom.com, Art Sackett asackett@artsackett.com, Claudio Garcia cgarcia@dbitech.com, Claudio Calvelli lunatic@assurdo.com, Martin Waite Martin.Waite@montgomery134.freeserve.co.uk. Debian package contributed by Jon Middleton, jjm@ixtab.org.uk.

HISTORY

Version 0.1

Based on the simple forking server in Chapter 10 of "Advanced Perl Programming" by Sriram Srinivasan, with a modular wrapper to make it easy to use and configure, and a rudimentary access control system.

Version 0.2

Added the peer() method to provide peer information.

Bugfix to ok_to_serve from Marius Kjeldahl marius@ace.funcom.com.

Added select-based server code, mode method to switch between forking and selection server modes.

Updated test code (should do something now!)

Added example: fortune server and client code.

Supports NetServer::SMTP (and, internally, NetServer::vTID).

Version 0.3

Fixed test failure.

Version 1.0

Added alpha-ish prefork server mode.

Added alpha-ish multithreaded mode (UNSTABLE)

Modified IP address filtering to cope with regexps (suggested by Art Sackett asackett@artsackett.com)

Modified select() server to do non-blocking writes via a non-blocking-socket class tied to STDIN/STDOUT

Option to log new connection peer addresses via STDERR

Extra test scripts

Updated documentation

Version 1.01

Fix so it works on installations with no threading support (duh). Tested on Solaris, too.

Version 1.02

Bugfixes to the preforked mode (thanks to Art Sackett for detecting them). Bugfix to ok_to_serve() (thanks to Claudio Garcia, cgarcia@dbitech.com). Some notes on the two known bugs (related to buffering).

Version 1.03

Signal handling code was fixed to avoid leaving zombie processes (thanks to Luis Munoz, lem@cantv.net)