Email Anti-Spam & Anti-Virus at the UW

Updated 04/12/2004

Introduction

Computing & Communications provides the central email infrastructure for the University of Washington.  C&C IT Infrastructure (ITI) designs, manages, and maintains the component pieces, which include incoming and outgoing mail exchangers (relays), IMAP servers for user inboxes, a web-mail client (WebPine), and the web-based Email Delivery Manager (EDM).  End-user support is provided by C&C Client Services.

For anti-virus and anti-spam, C&C has deployed the PureMessage product from Sophos (originally ActiveState).  PureMessage is a flexible rules-based system that also includes policy enforcement functionality.

EDM allows users of the central mail servers to filter incoming messages at delivery time, and in particular, to set a threshold for spam scores, above which a message is placed in a designated spam folder and eventually auto-deleted.

Today

Architecture

Messages inbound to the vast majority of the University's email users are handled by a cluster of Linux systems running sendmail and PureMessage from Sophos.  There are currently 9 systems in that cluster. 

Outbound messages are handled by multiple clusters of Linux systems running either sendmail only or sendmail and PureMessage for virus scanning only.  The sendmail-only clusters are for use by desktop mail clients; they relay each message to the cluster running sendmail and PureMessage.

The current design calls for growing the inbound cluster as message volume reaches 125,000-150,000 messages per server per day.  That translates to adding one system to the cluster approximately every 3 months, which isn't viable as a long-term solution (see Future below).
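The growth rule above can be checked with a little arithmetic.  The following is a minimal sketch, not our actual planning tool: it takes the inbound volume from Traffic Volumes below (nearly 950,000 messages a day) and the current cluster size of 9, and it assumes, purely for illustration, that load is spread evenly across the cluster.  The function names are hypothetical.

```python
import math


def per_server_load(daily_messages: int, servers: int) -> float:
    """Average messages per server per day, assuming an even spread (an assumption)."""
    return daily_messages / servers


def servers_needed(daily_messages: int, per_server_limit: int) -> int:
    """Smallest cluster size that keeps every server at or under the limit."""
    return math.ceil(daily_messages / per_server_limit)


def volume_at_limit(servers: int, per_server_limit: int) -> int:
    """Daily volume at which the given cluster reaches the per-server limit."""
    return servers * per_server_limit


if __name__ == "__main__":
    inbound = 950_000   # inbound messages per day (April 2004 figure)
    cluster = 9         # systems currently in the inbound cluster
    print(f"average load: {per_server_load(inbound, cluster):,.0f} per server per day")
    for limit in (125_000, 150_000):
        print(f"limit {limit:,}: {cluster} servers reach it at "
              f"{volume_at_limit(cluster, limit):,} messages per day")
```

Under these assumptions the current cluster averages roughly 105,600 messages per server per day and reaches the lower growth trigger at about 1.1 million inbound messages a day.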

Mail Handling Policies

The prevalence of viruses that forge header and envelope info has led us to silently discard all virus-laden messages.  We are also silently blocking messages containing attachments with names that match a set of well-known viruses.  This is becoming a mail handling best practice and should be adopted by all entities.

In October 2003, we began blocking all messages from domains we identify as sending us primarily spam.  The sheer volume of spam that we were handling made it impractical to continue with our "tag and pass" approach and forced us to make decisions about the types of messages that the University would accept.  More information on our blocking can be found at http://www.washington.edu/computing/email/spamblock.html.  The blocking is implemented via PureMessage's internal blacklist functionality.

For spam from non-blocked domains, we add only an X-Header with the score information.  We do not do any subject rewriting or user-accessible quarantining for high-scoring messages.  Once the message has been processed and a score assigned, it's sent on to its final destination.  Provided the destination is one of our Deskmail servers, users have the ability, via EDM, to filter messages to a junk-mail folder.  For departmental and non-UW forwarding destinations, filtering is left as an exercise for the user and/or departmental computing support staff.
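For departmental staff who need to do their own filtering, the "tag and pass" flow can be sketched roughly as follows.  This is an illustration only and not how PureMessage or EDM are implemented: the header name X-Spam-Score, the numeric score format, and the folder names are all assumptions, so check the headers on a real delivered message before relying on any of them.

```python
from email.message import EmailMessage

# Hypothetical header name; the real X-Header added by the relays may differ.
SCORE_HEADER = "X-Spam-Score"


def tag_message(msg: EmailMessage, score: float) -> EmailMessage:
    """Relay side: record the score in a header; subject and body are untouched."""
    del msg[SCORE_HEADER]            # drop any value supplied by the sender
    msg[SCORE_HEADER] = f"{score:.1f}"
    return msg


def choose_folder(msg: EmailMessage, threshold: float,
                  spam_folder: str = "junk-mail") -> str:
    """Delivery side: file the message by comparing its score to a user threshold."""
    try:
        score = float(msg.get(SCORE_HEADER))
    except (TypeError, ValueError):  # header missing or not a number
        return "INBOX"
    return spam_folder if score > threshold else "INBOX"


if __name__ == "__main__":
    msg = EmailMessage()
    msg["Subject"] = "Quarterly report"
    msg.set_content("See attached.")
    tag_message(msg, 72.5)
    print(choose_folder(msg, threshold=50))
```

Keeping the scoring step separate from the filing step is what lets each user pick a different threshold without the relays having to know about it.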

In March 2004, to combat the increasing prevalence of new viruses, we began silently discarding messages that contained attachments with certain extensions, and for certain other extensions, we discard the attachments.  More details on this can be found on the Attachment Block Information page.
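The two-tier extension policy can be pictured with a short sketch.  It is not the production rule set: the extension lists below are placeholders chosen for illustration (the actual lists are on the Attachment Block Information page), and the function only reports a decision rather than modifying the message.

```python
import os
from email.message import EmailMessage

# Placeholder lists, for illustration only; the real lists are published separately.
DISCARD_MESSAGE_EXTS = {".pif", ".scr"}   # whole message is silently discarded
STRIP_ATTACHMENT_EXTS = {".exe", ".bat"}  # attachment is dropped, message delivered


def classify(msg: EmailMessage) -> tuple[str, list[str]]:
    """Return ("discard", []) or ("deliver", [attachment names to strip])."""
    to_strip = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        ext = os.path.splitext(name)[1].lower()  # match case-insensitively
        if ext in DISCARD_MESSAGE_EXTS:
            return ("discard", [])
        if ext in STRIP_ATTACHMENT_EXTS:
            to_strip.append(name)
    return ("deliver", to_strip)


if __name__ == "__main__":
    msg = EmailMessage()
    msg.set_content("Installer attached.")
    msg.add_attachment(b"MZ", maintype="application",
                       subtype="octet-stream", filename="setup.exe")
    print(classify(msg))
```

Checking the discard list first means a single blocked extension condemns the whole message, regardless of what else is attached.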

Traffic Volumes

As of April 2004, the UW's central mail relays process nearly 950,000 inbound and 300,000 outbound messages a day.  On average, more than 1.5 million viruses a month are removed, 40% of all messages have a spam score greater than 50%, and more than 2.2 million messages from known spammers are rejected outright.  More detailed statistics can be found in the UW Virus Defense Log.

PureMessage Config

To date, we've made no significant changes to the PureMessage configuration from the defaults and have developed no local spam rules.

A Bit of History

Email anti-virus efforts at the UW started with NAI/McAfee's WebShield product and protected only Nebula users.  In 2000, we expanded our anti-virus implementation to cover the centrally supported mail clusters.

In mid-2002, we began looking at possible solutions for countering the growing menace of Unsolicited Bulk Email (UBE), more commonly known as spam.  After looking at various possibilities, we decided to go with ActiveState's PureMessage (then PerlMx).  PureMessage was made attractive by the combination of cost, support, extensibility, ease of integration with our existing environment, and flexibility.

As we looked at PureMessage, we decided that it would be a good time to reevaluate our anti-virus implementation as well.  Our chief desires, to simplify the architecture by reducing the number of hosts required and to combine anti-spam and anti-virus on the same platform (RedHat only rather than RedHat and Windows), led us to consider PureMessage for anti-virus as well.

Our initial anti-virus testing with PureMessage used the VFind engine.  It was rapidly apparent that the VFind engine was too slow to meet our needs.  With PureMessage 3.0, ActiveState began offering the UVScan engine from McAfee as well.  The UVScan engine, while slower than what we had experienced with WebShield, was sufficient for our needs.

PureMessage 3.0 was moved into production on January 1, 2003, for both anti-spam and anti-virus.  During the summer of 2003, we upgraded to PureMessage 4.0.0.

Prior to the upgrade to PureMessage 4.0.0, we replaced the message body and attachments of virus-laden messages with standard text informing the recipient that the message had been removed due to a virus.  Sender info was available from the original message header, which was left intact.

Prior to late October 2003, a central design constraint had been that users must have control over the filtering process.  Unfortunately, given the continued increase in mail volumes and the steadily increasing percentage of that volume being spam, we concluded that we could no longer simply "tag and pass" email, and we started blocking mail from sources that sent us nothing or nearly nothing but spam.

Future

We are constantly evaluating our implementation for improvements, particularly in ways to decrease the amount of spam that we accept.