
qmail-localfilter - qmail spam filtering perl scripts


Version: 1.55
Updated: 2006-09-04
Author: John Fitzgibbon
fitz@jfitz.com
http://www.jfitz.com


Introduction

qmail-localfilter is a perl-based spam-blocking/mail-filtering tool, originally derived from SpamAssassin. It was written to work with qmail, but can probably be adapted to other mail systems. The default rule-set is biased in favor of English-language mail systems.

One of the goals of the program is to simplify maintenance. All the settings, lists, rules and code can be found in two perl scripts. There are four additional files that can be created to customize settings or to add custom rules. If you understand basic perl, you should not have too much trouble adding simple rules.

qmail-localfilter does not "tag" mail - it rejects it with an SMTP 5xx code, (based on the filter's exit code). The reasons why a mail was rejected are logged to the mail log. There are also options to save rejected mail on the mailserver, either in a system-wide directory, or to user-specific directories. Saving spam to individual user directories offers the possibility of making the rejected mail available to users, (assuming they use something like an IMAP interface, based on Maildir-style mail folders).

For high-throughput systems, there is an option to finish execution of the tests early if sufficient evidence is gathered to identify the mail as spam. On my own mailserver, (a 1GHz Pentium), even with full analysis of all mail, most messages are filtered in about 1 second.

The package consists of 2 scripts:
  • qmail-localfilter.pl - the mail filtering script, (logic, rules, filters).
  • qmail-localfilter-settings.pl - site-specific settings, (variables, lists).
To download the latest production release, please see the Production Release History below. For upcoming release dates, or to download the latest development release, please see the Release Schedule. The code is distributed under a BSD-style license.

Requirements/Installation

To work with qmail, you'll need the following set-up in place first:
  • qmail with the QMAILQUEUE patch, (http://www.qmail.org/)
  • qmail-qfilter, (http://untroubled.org/qmail-qfilter/)
  • Perl, with Sys, Getopt, HTML, MIME, IO, Net, Expect and File packages, (uses Sys::Syslog, Getopt::Std, MIME::QuotedPrint, MIME::Base64, HTML::Entities, IO::Socket, IO::File, Expect, Net::DNS and File::Temp modules)
Once you have qmail-qfilter installed and working, and have tested qmail-localfilter.pl from the command line, copy qmail-localfilter.pl and qmail-localfilter-settings.pl into /var/qmail/bin, and set qmail-qfilter to call qmail-localfilter.pl as one of its filter programs.

You should also create a "qmail-localfilter-custom-settings.pl" file in /var/qmail/bin. You can include any system-wide custom settings in this file - see qmail-localfilter-settings.pl for a full description of the available settings.

More detailed installation and customization instructions can be found in the comments at the start of qmail-localfilter.pl.

Upgrading

If you keep all your custom settings and rules in the specified "custom" files, you should be able to "drop-in" a new version of qmail-localfilter without "breaking" the existing set-up.

When downloading a new version, watch carefully for new Perl dependencies - it's a good idea to run the new version from the command line first to make sure that all required Perl modules are in place. You should also check the settings file for any new settings that need customizing.
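One way to do that dependency check before installing is a small stand-alone script that tries to load each module in turn. The sketch below is not part of the qmail-localfilter distribution. The module list is copied from the Requirements section for version 1.55 and may need updating for later releases, and the helper name "module_available" is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Module list copied from the Requirements section (version 1.55).
# Check the comments in a newer release for additions.
my @modules = qw(
    Sys::Syslog Getopt::Std MIME::QuotedPrint MIME::Base64 HTML::Entities
    IO::Socket IO::File Expect Net::DNS File::Temp
);

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
    my ($name) = @_;
    return eval "require $name; 1" ? 1 : 0;
}

my @missing = grep { !module_available($_) } @modules;

if (@missing) {
    print "Missing module: $_\n" for @missing;
} else {
    print "All listed modules are installed.\n";
}
```

Any module reported as missing would need to be installed, (from CPAN or your distribution's packages), before the new version is put into service.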

Release Schedule

I aim to put out an updated release a few times a year. Intermediate releases may be made available in the event of a major bug, or an outbreak of spam or a virus that requires new filters. The next regular release is scheduled for April 2007.

If you don't want to wait until the next release, you can download the current development snapshot of the code here:

qmail-localfilter-1.56d.tgz

Often, the development scripts will have rules to handle new spam that isn't dealt with in a production release yet. These scripts represent the version of the code that I am running on my production mailserver right now, so they should be stable. However, as with any development release, use at your own risk!

Production Release History

2006-09-04: qmail-localfilter-1.55.tgz New rules
2006-04-17: qmail-localfilter-1.54.tgz New rules, virus checks
2005-12-27: qmail-localfilter-1.53.tgz New rules, better table checks
2005-09-22: qmail-localfilter-1.52.tgz New rules, fix slow rule on certain perl versions
2005-09-12: qmail-localfilter-1.51.tgz New rules, better table, div and header checks
2005-05-24: qmail-localfilter-1.50.tgz New rules, virus checks
2005-02-10: qmail-localfilter-1.49.tgz New rules, handles new yahoo mailserver naming
2004-12-10: qmail-localfilter-1.48.tgz New rules
2004-10-05: qmail-localfilter-1.47.tgz New rules
2004-09-10: qmail-localfilter-1.46.tgz New rules, virus checks
2004-07-25: qmail-localfilter-1.45.tgz New rules, virus checks, user-specific spam folders
2004-07-02: qmail-localfilter-1.44.tgz New rules, virus checks
2004-06-03: qmail-localfilter-1.43.tgz New rules, virus checks, better address checks
2004-05-04: qmail-localfilter-1.42.tgz New rules, better virus checks (reduces saved spam)
2004-04-30: qmail-localfilter-1.41.tgz New rules, options to save spam
(Note: As of v1.41, perl "File::Temp" and "IO::File" modules are required.)
2004-04-14: qmail-localfilter-1.40.tgz New rules, better content type decoding
2004-04-08: qmail-localfilter-1.39.tgz New rules
2004-03-09: qmail-localfilter-1.38.tgz New rules, virus check, better timeout, user settings
2004-03-01: qmail-localfilter-1.37.tgz New rules, virus check
2004-02-24: qmail-localfilter-1.36.tgz New rules, virus checks, better attachment handling
2004-02-17: qmail-localfilter-1.35.tgz New rules, virus check, better attachment handling
2004-01-30: qmail-localfilter-1.34.tgz Better MIME, HTML, bounce, attachment handling
2004-01-29: qmail-localfilter-1.33.tgz New rules, bug fix in returned mail tests
2004-01-28: qmail-localfilter-1.32.tgz New rules, bug fix, better virus bounce filters
2004-01-27: qmail-localfilter-1.31.tgz New rules, better filters for forged/virus bounces
2004-01-26: qmail-localfilter-1.30.tgz New rules, MyDoom virus test
2004-01-19: qmail-localfilter-1.29.tgz New rules, bagle virus test, logging for non-spam
2004-01-16: qmail-localfilter-1.28.tgz New rules, internal mail test, custom settings file
2004-01-12: qmail-localfilter-1.27.tgz New rules, better support for custom rules
2003-12-29: qmail-localfilter-1.26.tgz New rules, fixed HTTP link checking timeout bugs
(Note: As of v1.26, perl "Net" module is required for experimental MX checking.)
2003-11-22: qmail-localfilter-1.25.tgz New rules, added HTTP link checking
(Note: As of v1.25, perl "Expect" and "IO" modules are required for link checking.)
2003-11-09: qmail-localfilter-1.24.tgz New rules, better header decoding
2003-11-06: qmail-localfilter-1.23.tgz New rules
2003-09-09: qmail-localfilter-1.22.tgz New rules, added rule IDs and flexible scores
2003-08-29: qmail-localfilter-1.21.tgz New rules, better domain/server tests
2003-08-21: qmail-localfilter-1.20.tgz New rules
2003-08-19: qmail-localfilter-1.19.tgz New rules, better header decoding
2003-07-30: qmail-localfilter-1.18.tgz New rules, fixed text decoding, tests
2003-07-15: qmail-localfilter-1.17.tgz New rules, fixed address decoding bug
2003-06-26: qmail-localfilter-1.16.tgz Added code to scan start of attachments
2003-06-25: qmail-localfilter-1.15.tgz New rules
2003-06-11: qmail-localfilter-1.14.tgz New rules, better text "normalization"
2003-06-02: qmail-localfilter-1.13.tgz New rules, better attachment handling
2003-05-30: qmail-localfilter-1.12.tgz New rules, better encoded mail handling
2003-05-27: qmail-localfilter-1.11.tgz New rules
2003-05-23: qmail-localfilter-1.1.tgz Initial release


Questions and Answers

Q: Can I save spam so I can review it later?

As of version 1.41 there is an option to save spam. See "$save_spam" and related settings in qmail-localfilter-settings.pl for details. Also, as of version 1.45 there is an option to save spam to user-specific directories, instead of saving to a single system-wide default directory. The user-specific directories can be "Maildir-style" mail folders, which means that it is possible to drop spam directly into user-specific spam folders. Note that the options to save spam can only be set by the mailserver administrator - individual users cannot turn this option on if the administrator has not enabled it.

Q: Can I save spam without saving spam related to virus [Insert Virus Name Here]?

The option "$save_spam_exclude_list" can be used to prevent spam that meets certain criteria from being saved. "$save_spam_exclude_list" generally has a "sane" default value that will exclude some of the most obvious, (and annoying), viruses. You may find the need to modify this list, particularly if you are dealing with a new virus and have written your own custom rule to detect it.

Q: How do I create user-defined whitelists, blacklists, etc.?

As of version 1.38, qmail-localfilter supports user-defined settings files. The mail administrator must first make sure that the following settings are correct in the system-wide configuration file, qmail-localfilter-settings.pl, (default values are shown):

$allow_user_configuration = 1;
$users_home_path = "/home";
%mail_alias_list = ();

When "$allow_user_configuration = 1", individual users can create custom settings and rules files in a directory called ".qmail-localfilter". For example:

mkdir $HOME/.qmail-localfilter
pico $HOME/.qmail-localfilter/qmail-localfilter-custom-settings.pl

In the file $HOME/.qmail-localfilter/qmail-localfilter-custom-settings.pl you can place your custom whitelists, blacklists, etc. Please see the file qmail-localfilter-settings.pl for details on the format of these lists. Note that if you want to extend a system-wide list, (rather than replace it), you can simply add the existing list at the start of the new one. For example:

$whitelist = "
$whitelist
myfriend\@somedomain.com
*\@friendlydomain.com
";
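This works because perl evaluates the right-hand side of the assignment first, so the old value of "$whitelist" is interpolated into the new string before the variable is overwritten. The stand-alone sketch below illustrates the effect. The starting entry is a made-up stand-in for whatever the system-wide settings file defines, and splitting on newlines is only an assumption made for display purposes, not a description of how qmail-localfilter parses its lists.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for a value set earlier by the system-wide settings file.
# (Hypothetical entry, for illustration only.)
our $whitelist = "
postmaster\@example.org
";

# Same pattern as the example above: the old list is interpolated
# into the new string, then the new entries follow it.
$whitelist = "
$whitelist
myfriend\@somedomain.com
*\@friendlydomain.com
";

# Show the non-blank lines of the combined list.
my @entries = grep { /\S/ } split /\n/, $whitelist;
print "$_\n" for @entries;
```

Leaving out the "$whitelist" line inside the quotes would replace the system-wide list instead of extending it.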

In addition to modifying the custom settings file, if you're comfortable with perl you can also add custom rules to the following files, (in $HOME/.qmail-localfilter):

qmail-localfilter-custom-body.pl
qmail-localfilter-custom-hdr.pl
qmail-localfilter-custom-raw.pl

Please see the comments at the start of qmail-localfilter.pl for details on how to write rules in these files.

Q: Can I use this as a client-side filter?

I believe that if you silently reject spam you should be able to use this program as a client-side qmail filter. Likewise, it should be possible to adapt this code for other (non-qmail) client filtering programs, provided they can recognise, and act upon, return codes. That said, I don't use the program for client-side filtering so I'm not the person to ask if you have trouble getting this to work, (sorry!).

Q: How do I silently reject spam?

Here are some notes taken directly from the qmail-qfilter site, (the relevant piece is the line about exit code 99):

Notes on writing a filter program:
- If you want to block an email, exit from the filter with code 31. This will cause qmail-qfilter to exit with the same error code, and qmail-smtpd (for example) to send an error code to the client.
- If you want to silently drop an email, exit with code 99.

The qmail-localfilter setting to change is "$spam_exit_code".

I've never used exit code 99 myself, but I presume it does what it says.
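To make the effect of the setting concrete, here is a minimal sketch of how a filter's exit status could follow from "$spam_exit_code". It is not the code from qmail-localfilter.pl. The helper "exit_code_for" and the constant names are invented for this example. The values 31 and 99 come from the qmail-qfilter notes quoted above, and 0 is the usual "success" status for mail that should pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant {
    EXIT_ACCEPT => 0,     # success: the mail continues on to the queue
    EXIT_REJECT => 31,    # block: the SMTP client is sent an error code
    EXIT_DROP   => 99,    # silently drop the mail
};

# Hypothetical helper: choose the exit status for one message.
# $spam_exit_code plays the role of the setting named above.
sub exit_code_for {
    my ($is_spam, $spam_exit_code) = @_;
    return $is_spam ? $spam_exit_code : EXIT_ACCEPT;
}

# Compare the two choices side by side.
for my $spam_exit_code (EXIT_REJECT, EXIT_DROP) {
    printf "spam_exit_code=%d: spam exits with %d, clean mail exits with %d\n",
        $spam_exit_code,
        exit_code_for(1, $spam_exit_code),
        exit_code_for(0, $spam_exit_code);
}
```

In a real installation the only change needed should be a line such as "$spam_exit_code = 99;" in qmail-localfilter-custom-settings.pl.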

Q: Should I silently reject spam?

Before you rush off to implement silent rejection, you should be sure that this is what you really want to do. The usual reason quoted for silently blocking spam is a fear that otherwise you run the risk of generating bounce messages to forged "return path" addresses.

If you control your own mailserver, (i.e. you are not accepting mail via an ISP's relay), and you are exiting with code 31 before the mail is queued, YOU are NOT generating a bounce - your MTA is rejecting the mail and telling the upstream server that the mail will not be delivered. That upstream server may, (or may not), then generate its own bounce message back to the sender. If the upstream server is a relaying mail server, (yeuch), it MAY send the bounce back along the (possibly fake) return path in the mail. If the upstream server is the originating server, then the bounce will most likely go to the real sender, not to some forged return path.

The bottom line is that YOU will never be responsible for sending bounces to random forged return paths. The only way a bounce to a forged address will occur is if the message is being relayed through a badly-configured open relay, in which case the bounce will come from the postmaster of the open relay, (who then deserves to be dragged out in the street and shot). This is why I mentioned "controlling your own mailserver" - if you relay (legitimately) through an ISP, they may be dumb enough to generate bad bounces, (in fact they may not have enough information to do anything else).

The good thing about having your server reject the message (with an SMTP 550 code, not a bounce) is twofold:
  • Spammers may eventually take you off their lists. Most spam comes directly from spammers' own machines, or machines they have hijacked, so they do register the 550's - and it is wasting their time and bandwidth too.
  • If you block a message from a legitimate sender, their own mailserver will send them a failure notification, (which is probably better than having it look like you are ignoring your friends).
None of this would apply if you were filtering after the mail has been queued. In this case you really would be generating the bounce, and the best information that could be used to send the bounce would be the return-path. In this situation, you probably would do best to swallow the mail and say no more.

Q: Can I filter forwarded mail?

To begin with, add the address you are forwarding from to your whitelist. Then, in your settings file, set:

$stop_on_spam = 0;
$spam_whitelist_bypass = 0;

The first setting means every mail will be processed to completion, regardless of whether it is spam or not. The second setting means that an address appearing in the whitelist won't cause the filtering to stop. Together, these settings mean every mail will go through all the filters with no shortcuts. (In case you're wondering whether these settings are safe or not, I use these settings myself because I like to see the extra information, and the additional processing overhead is rarely that much.)

These settings won't change the fact that the forwarded address will get a big negative score because we've added it to the whitelist, so we need to add a custom header rule to offset this:

if ($hdr_to =~ /forwardeduser\@somedomain\.com/)
{ spam("c00001",100.0, "Whitelist override: To: " . $&); }

(Replace the "forwardeduser\@somedomain\.com" bit with the appropriate address.) This rule takes advantage of the fact that the headers on forwarded mail are not changed. If your forwarder changes the headers, this approach may not work.
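The sketch below runs that rule outside the filter to show how the override is meant to cancel the whitelist score. Everything around the rule is a stand-in: the "spam" routine here only adds up scores and prints the reason, the whitelist score of -100 and the rule ID "w00001" are assumptions for illustration, and the "To:" header is invented. The real routine and scores live in qmail-localfilter.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $total = 0;

# Stand-in for the filter's own spam() routine: add the score to a
# running total and print the reason. The real one does more.
sub spam {
    my ($rule_id, $score, $reason) = @_;
    $total += $score;
    print "$rule_id ($score): $reason\n";
}

# Invented "To:" header of a forwarded mail.
my $hdr_to = 'Forwarded User <forwardeduser@somedomain.com>';

# Assumed score for a whitelisted address (illustration only).
spam("w00001", -100.0, "Whitelisted address");

# The custom header rule from the example above. $& holds the text
# matched by the regular expression.
if ($hdr_to =~ /forwardeduser\@somedomain\.com/)
{ spam("c00001", 100.0, "Whitelist override: To: " . $&); }

print "Total score: $total\n";
```

With these assumed numbers the two scores cancel, which is the starting point for the fine-tuning described next.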

Because most forwarded mail will not be from/to any of your normal mail domains, you may find that you need to adjust the score by less than 100, because the mail will already get a score for being from/to unknown domains. If you want to see what kind of score the forwarded mail is getting, set:

$log_passed_score = 1;

This will log mail that passes the tests to maillog. By checking the maillog you should be able to adjust the override score so that a "vanilla", non-spam, forwarded mail gets an overall score of zero.