NAME

Mail::DeliveryStatus::BounceParser - Perl extension to analyze bounce messages

SYNOPSIS

  use Mail::DeliveryStatus::BounceParser;
  # new() accepts a filehandle (\*STDIN or $fh), the entire message as a
  # single string, or a reference to an array of lines.
  my $bounce = eval { Mail::DeliveryStatus::BounceParser->new(\*STDIN) };
  if ($@) { } # couldn't parse.
  my       @addresses = $bounce->addresses;       # email address strings
  my         @reports = $bounce->reports;         # Mail::Header objects
  my $orig_message_id = $bounce->orig_message_id; # "<20030212182720.GI16472@vex.pobox.com>" string
  my $orig_message    = $bounce->orig_message;    # Mail::Internet object

ABSTRACT

Mail::DeliveryStatus::BounceParser analyzes RFC822 bounce messages and returns a structured description of the addresses that bounced and the reason they bounced; it also returns information about the original returned message including the Message-ID. It works best with RFC1892 delivery reports, but will gamely attempt to understand any bounce message no matter what MTA generated it.

DESCRIPTION

I wrote this for the Listbox v2 project; good mailing list managers handle bounce messages so listowners don't have to. The best mailing list managers figure out exactly what is going on with each subscriber so the appropriate action can be taken.

->new()

OPTIONS. If you pass BounceParser->new(..., {log=>sub { ... }}), that sub will be used as a logging callback.

NON-BOUNCES. If the message is recognizably a vacation autoresponse, or is a report of a transient nonfatal error, or a spam or virus autoresponse, you'll still get back a $bounce, but its $bounce->is_bounce() will return false.

Some apparent bounces are not really bounces. For example, when Hotmail responds with 554 Transaction Failed, that just means Hotmail was overloaded at the time, so the user isn't actually bouncing. To include such non-bounces in the reports, pass the option {report_non_bounces=>1}.

->reports()

Each $report returned by $bounce->reports() is a Mail::Header object with a few modifications. It includes the bouncing email address and the reason for the bounce.

Consider an RFC1892 error report of the form

 Reporting-MTA: dns; hydrant.pobox.com
 Arrival-Date: Fri,  4 Oct 2002 16:49:32 -0400 (EDT)
 Final-Recipient: rfc822; bogus3@dumbo.pobox.com
 Action: failed
 Status: 5.0.0
 Diagnostic-Code: X-Postfix; host dumbo.pobox.com[208.210.125.24] said: 550
  <bogus3@dumbo.pobox.com>: Nonexistent Mailbox

Each "header" above is available through the usual get() mechanism.

  print $report->get('reporting_mta');   # 'some.host.com'
  print $report->get('arrival-date');    # 'Fri,  4 Oct 2002 16:49:32 -0400 (EDT)'
  print $report->get('final-recipient'); # 'rfc822; bogus3@dumbo.pobox.com'
  print $report->get('action');          # "failed"
  print $report->get('status');          # "5.0.0"
  print $report->get('diagnostic-code'); # X-Postfix; ...
  # BounceParser also inserts a few interpretations of its own:
  print $report->get('email');           # 'bogus3@dumbo.pobox.com'
  print $report->get('std_reason');      # 'user_unknown'
  print $report->get('reason');          # host [199.248.185.2] said: 550 5.1.1 unknown or illegal user: somebody@uss.com
  print $report->get('host');            # dumbo.pobox.com
  print $report->get('smtp_code');       # 550
  print $report->get('raw') ||           # the original unstructured text
        $report->as_string;              # the original   structured text

Probably the two most useful fields are "email" and "std_reason", the standardized reason. At this time BounceParser returns the following standardized reasons:

  user_unknown
  over_quota
  domain_error
  unknown
  no_problemo

(no_problemo will only appear if you set {report_non_bounces=>1})

If the bounce message is not structured according to RFC1892, BounceParser will still try to return as much information as it can; in particular, you can count on "email" and "std_reason" to be present.
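
As a minimal sketch of how an application might act on "std_reason", the following maps each standardized reason to an action. The action names and the action_for() helper are hypothetical and not part of BounceParser; only the reason strings come from the list above. Whitespace is trimmed defensively, on the assumption that a header value may carry a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical policy table: the action names are invented for this sketch.
# Only the std_reason keys come from the BounceParser documentation.
my %ACTION_FOR = (
    user_unknown => 'unsubscribe',
    domain_error => 'unsubscribe',
    over_quota   => 'retry_later',
    unknown      => 'flag_for_review',
    no_problemo  => 'ignore',
);

# Map a std_reason string to an action. Missing or unrecognized values are
# sent to review rather than acted on automatically.
sub action_for {
    my ($std_reason) = @_;
    $std_reason = '' unless defined $std_reason;
    $std_reason =~ s/^\s+|\s+$//g;    # assumption: values may end in a newline
    return exists $ACTION_FOR{$std_reason}
        ? $ACTION_FOR{$std_reason}
        : 'flag_for_review';
}

# Usage with a parsed bounce (not executed here):
#   for my $report ($bounce->reports) {
#       my $action = action_for($report->get('std_reason'));
#       ...
#   }
```

Defaulting to review for anything unexpected keeps a new or misparsed reason from silently unsubscribing someone.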

->addresses()

Returns a list of the addresses which appear to be bouncing. Each member of the list is an email address string of the form 'foo@bar.com'.

->orig_message_id()

If possible, returns the message-id of the original message as a string.

->orig_message()

If the original message was included in the bounce, it'll be available here as a message/rfc822 MIME entity.

  my $orig_message    = $bounce->orig_message;

->orig_header()

If only the original headers were returned, in a text/rfc822-headers chunk, they'll be available here as a Mail::Header object.

->orig_text()

If the bounce message was poorly structured, the above two methods won't return anything --- instead, you get back a block of text that may or may not approximate the original message. No guarantees. Good luck.

CAVEATS

Bounce messages are generally meant to be read by humans, not computers. A poorly formatted bounce message may fool BounceParser into spreading its net too widely and returning email addresses that didn't actually bounce. Before you do anything with the email addresses you get back, confirm that it makes sense that they might be bouncing --- for example, it doesn't make sense for the sender of the original message to show up in the addresses list, but it could if the bounce message is sufficiently misformatted.
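
One way to do that sanity check, sketched under the assumption that your application keeps a list of the addresses it actually sent to. The plausible_bounces() helper is hypothetical and not part of BounceParser. It compares whole addresses case-insensitively, which is a simplification.

```perl
use strict;
use warnings;

# Keep only the bounced addresses that appear in the list of recipients we
# actually mailed. Both arguments are array references.
sub plausible_bounces {
    my ($bounced, $recipients) = @_;
    my %sent_to = map { (lc($_) => 1) } @$recipients;
    return grep { $sent_to{ lc $_ } } @$bounced;
}

# Usage with a parsed bounce (not executed here):
#   my @confirmed = plausible_bounces([ $bounce->addresses ], \@recipients);
```

Anything filtered out this way, such as the original sender's own address, is probably worth logging for a human to look at.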

FREE-FLOATING ANXIETY

Some bizarre MTAs construct bounce messages using the original headers of the original message. If your application relies on the assumption that all Message-IDs are unique, you need to watch out for these MTAs and program defensively; before doing anything with the Message-ID of a bounce message, first confirm that you haven't already seen it; if you have, change it to something else that you make up on the spot, such as "<antibogus-TIMESTAMP-PID-COUNT@LOCALHOST>".
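
A minimal sketch of that defensive check follows. The unique_message_id() helper is hypothetical, and the in-memory %seen_message_id hash is an assumption made for brevity; a real application would keep this record somewhere persistent.

```perl
use strict;
use warnings;
use Sys::Hostname qw(hostname);

my %seen_message_id;    # in-memory only; persist this in a real application
my $count = 0;

# Return the Message-ID unchanged the first time it is seen. On a repeat
# (or if it is undefined), return a made-up replacement of the form
# <antibogus-TIMESTAMP-PID-COUNT@LOCALHOST>.
sub unique_message_id {
    my ($message_id) = @_;
    if (defined $message_id && !$seen_message_id{$message_id}++) {
        return $message_id;
    }
    my $host = eval { hostname() } || 'localhost';
    my $replacement;
    do {
        $replacement = sprintf '<antibogus-%d-%d-%d@%s>',
            time, $$, ++$count, $host;
    } while ($seen_message_id{$replacement}++);
    return $replacement;
}
```

The replacement is itself recorded as seen, so a later message that happens to carry the same made-up ID gets a fresh one too.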

BUGS

I don't think I left any in. If you find any, I must have forgotten to take them out. Patches welcome.

BounceParser assumes a sanely constructed bounce message. Input from the real world may cause BounceParser to barf and die horribly when we violate one of MIME::Entity's assumptions; this is why you should always call it inside an eval { }.

TODO

Provide some translation of the SMTP and DSN error codes into English. Review RFC1891 and RFC1893.
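
Until then, an application can do a rough translation itself. The sketch below is not part of BounceParser; the describe_status() helper is hypothetical. It turns the class and subject digits of a DSN status code such as "5.1.1" into the category names used by RFC1893, and ignores the third (detail) digit.

```perl
use strict;
use warnings;

# Class (first digit) and subject (second digit) names from RFC1893.
my %CLASS = (
    2 => 'success',
    4 => 'persistent transient failure',
    5 => 'permanent failure',
);
my %SUBJECT = (
    0 => 'other or undefined status',
    1 => 'addressing status',
    2 => 'mailbox status',
    3 => 'mail system status',
    4 => 'network and routing status',
    5 => 'mail delivery protocol status',
    6 => 'message content or media status',
    7 => 'security or policy status',
);

# Return a short English description of a status code, or nothing if the
# string does not look like a class.subject.detail triple.
sub describe_status {
    my ($status) = @_;
    return unless defined $status
        && $status =~ /^\s*([245])\.(\d{1,3})\.(\d{1,3})\s*$/;
    my ($class, $subject) = ($1, $2 + 0);
    my $subject_text = exists $SUBJECT{$subject}
        ? $SUBJECT{$subject}
        : 'unrecognized subject';
    return "$CLASS{$class}: $subject_text";
}

# Usage with a report (not executed here):
#   my $english = describe_status($report->get('status'));
```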

KNOWN TO WORK WITH

We understand bounce messages generated by the following MTAs / organizations:

 postfix
 sendmail
 qmail
 Exim
 IMS
 Morgan Stanley (ms.com) and emory.edu
 AOL's AirMail sender-blocking
 Novell Groupwise

SEE ALSO

  Used by http://v2.listbox.com/ --- if you like
  BounceParser and you know it, consider Listbox for your
  mailing list needs!
  Ironically, BounceParser has no mailing list or web site
  at this time.
  See RFC1892, the Multipart/Report Content Type.

RANDOM OBSERVATION

Schwern's modules have the Alexandre Dumas property.

AUTHOR

Meng Weng Wong, <mengwong+bounceparser@pobox.com>

COPYRIGHT AND LICENSE

  Copyright (C) 2003 IC Group, Inc.
  pobox.com permanent email forwarding with spam filtering
  listbox.com mailing list services for announcements and discussion
  Meng Weng Wong <freeside>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

WITH A SHOUT OUT TO

  coraline, Fletch, TorgoX, mjd, a-mused, Masque, gbarr,
  sungo, dngor, and all the other hoopy froods on #perl