Saturday, September 11, 2004

SpamAssassin 3.0: The new seed

As I said in my previous post, spam is considered one of the most bandwidth-consuming activities on the Internet, because so much junk email is sent by unknown senders and lands in people's inboxes without ever being asked for. People get annoyed by it, and system administrators are kept busy fighting junk email while separating it from the legitimate email that should reach the user's inbox.

The old way to deal with spam was to delete and ignore it, but that no longer works, since spammers keep refining their methods for proving that your email address is valid so that they can send you more junk mail. Sometimes they include a URL that supposedly confirms you do not want to be subscribed any more, but this is just a trick. Visiting the URL confirms that your email address is valid and... boom, your address soon becomes a spammer's target.

Mail filtering was invented to keep junk email out of the user's inbox by examining the words in the subject and body of each message. One of the best mail filters is SpamAssassin, an open source project that is widely used on mail servers around the world. SpamAssassin works by assigning points to each email; the higher the score, the more likely the message is to be treated as spam. SpamAssassin also has a learning mechanism that improves its filtering by adapting to the characteristics of spam. This helps system administrators keep SpamAssassin's record of spam characteristics up to date.
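The scoring idea described above can be sketched in a few lines of Python. This is only an illustration: the rule names, patterns, scores, and threshold are made up for the example and are not SpamAssassin's real rules or defaults.

```python
import re

# Hypothetical rules: (name, pattern, score). None of these come from SpamAssassin.
RULES = [
    ("MENTIONS_VIAGRA", re.compile(r"viagra", re.IGNORECASE), 2.5),
    ("CLICK_HERE", re.compile(r"click here", re.IGNORECASE), 1.5),
    ("ALL_CAPS_SUBJECT", re.compile(r"^Subject: [^a-z\n]+$", re.MULTILINE), 1.2),
]

THRESHOLD = 5.0  # assumed cut-off for this sketch


def score_message(text):
    """Return the total score and the names of the rules that matched."""
    hits = [(name, score) for name, pattern, score in RULES if pattern.search(text)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(text, threshold=THRESHOLD):
    """A message counts as spam once its total score reaches the threshold."""
    total, _ = score_message(text)
    return total >= threshold
```

A message that trips all three rules ends up above the threshold, while an ordinary message scores nothing and passes through.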

In the next one or two years, the SpamAssassin Project will release its newest version, 3.0. Some of the upcoming features are (taken from

New Rules
Naturally, SpamAssassin 3.0.0 includes many new static rules, and changes the definitions and scores of several old ones to reflect the changing nature of spam. For example, many rules focused on pharmaceutical spam are now included--drugs seem to have caught up with mortgages and pornography in the distribution of spam.

Updated API
The SpamAssassin Perl API has been extensively rewritten; software that invokes SA from Perl, such as proxies or mail server filters, will require recoding. The two popular filter applications discussed in the SpamAssassin book, MIMEDefang and amavisd-new, have already been updated to support SA 3. Mailers that integrate with SpamAssassin by invoking the spamc program or communicating directly with the spamd daemon don't need to be rewritten.
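For mailers that take the spamc route, the integration can be as small as piping the message to the client program. The sketch below assumes spamc is installed and on the PATH, that spamd is reachable with its default settings, and that `spamc -c` prints a single `score/threshold` line; the function names are my own.

```python
import subprocess


def parse_check_output(output):
    """Parse the 'score/threshold' line printed by `spamc -c`."""
    score_text, threshold_text = output.strip().split("/", 1)
    return float(score_text), float(threshold_text)


def check_with_spamc(raw_message):
    """Pipe a raw message (bytes) to spamc and return (score, threshold).

    Assumes spamc is on PATH and can reach a running spamd.
    """
    result = subprocess.run(
        ["spamc", "-c"],
        input=raw_message,
        stdout=subprocess.PIPE,
        check=False,  # spamc -c exits with status 1 when the message is spam
    )
    return parse_check_output(result.stdout.decode("ascii"))
```

Because the Perl API is never touched, a wrapper like this should not care whether the daemon behind it is the old or the new version.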

More significantly, SA 3 now supports plugins--new modules of code that extend SA's capabilities. SA 3 is distributed with four working plugins:
* RelayCountry
* Hashcash
* Sender Policy Framework

SQL and LDAP Support
With earlier versions of SpamAssassin, maintaining per-user configuration was difficult in virtual hosting environments when users did not have shell accounts on the mail server. SA 3 greatly eases this difficulty by allowing per-user preferences, Bayesian data, and auto-whitelists to be stored in an SQL database, rather than in files in users' home directories. Example SQL tables are included for both MySQL and PostgreSQL servers. Alternatively, per-user preferences can be stored in an LDAP database, which is a boon to sites that have LDAP-driven mail setups. Available preferences are more extensive, as well; for example, new directives allow email from or to given addresses to opt out of the Bayesian classifier. A pharmacist whose mail is on a host that uses a site-wide Bayesian database can now avoid having their legitimate email classified as spam because SA has learned from other users' mail that "sildenafil" is a spam token.
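To get a feel for what per-user preferences in a database look like, here is a small sketch using SQLite from Python. The table layout (username, preference, value) is a simplified assumption and not necessarily the schema shipped with SA 3, and the `@GLOBAL` name for site-wide defaults is likewise just the convention used in this example.

```python
import sqlite3

GLOBAL_USER = "@GLOBAL"  # assumed name for site-wide defaults in this sketch


def make_db():
    """Create an in-memory database with a simplified preference table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userpref ("
        " username TEXT NOT NULL,"
        " preference TEXT NOT NULL,"
        " value TEXT NOT NULL)"
    )
    return conn


def set_pref(conn, username, preference, value):
    conn.execute(
        "INSERT INTO userpref (username, preference, value) VALUES (?, ?, ?)",
        (username, preference, value),
    )


def get_prefs(conn, username):
    """Return the user's preferences layered over the site-wide defaults."""
    prefs = {}
    # Load the defaults first so that per-user rows overwrite them.
    for user in (GLOBAL_USER, username):
        rows = conn.execute(
            "SELECT preference, value FROM userpref WHERE username = ?", (user,)
        )
        prefs.update(dict(rows.fetchall()))
    return prefs
```

The point is that a virtual user needs no shell account or home directory: a row in a table is enough to give them their own settings.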

Internal Networks
SpamAssassin has always distinguished trusted networks from untrusted networks, and does not perform DNS-based blacklist testing on relays on trusted networks. SA 3 introduces the new concept of internal networks, which are not only trusted, but assumed to be under your direct control. By separating trusted and internal networks, SA 3 can do a better job at detecting spam originating directly from dialup hosts but still exempt trusted sites from blacklists.
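The difference between internal, trusted, and untrusted relays can be pictured with a short Python sketch. The address ranges below are invented for the example, and the logic only illustrates the idea in the paragraph above; it is not how SpamAssassin itself is implemented.

```python
import ipaddress

# Hypothetical address ranges, chosen only for this example.
INTERNAL_NETWORKS = [ipaddress.ip_network("192.168.1.0/24")]
# Internal networks are trusted too, so the trusted set includes them.
TRUSTED_NETWORKS = INTERNAL_NETWORKS + [ipaddress.ip_network("10.20.0.0/16")]


def classify_relay(ip_text):
    """Classify a relay address as 'internal', 'trusted', or 'untrusted'."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in INTERNAL_NETWORKS):
        return "internal"
    if any(ip in net for net in TRUSTED_NETWORKS):
        return "trusted"
    return "untrusted"


def should_check_blacklists(ip_text):
    """Only relays outside the trusted set are looked up in DNS blacklists."""
    return classify_relay(ip_text) == "untrusted"
```

With this split, a partner's mail server can be trusted (and skipped in blacklist checks) without being treated as one of your own machines.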

SpamAssassin 3.0.0 is now an Apache Software Foundation project, and is released under the Apache Software License instead of the GNU General Public License or Perl Artistic License. As these licenses all fit the open source and free software definitions, most end users won't notice the difference. If you plan to redistribute SA 3 as part of a larger system, however, you should be aware that the Free Software Foundation does not consider the Apache Software License to be GPL-compatible, so you may run into difficulty if you wish to combine SA 3 with GPL code.

The Upgrade Experience
Although the upgrade process is fairly straightforward, you can expect a few gotchas with a new major release. Some command-line options have been deprecated or renamed. The extremely common required_hits and rewrite_subject configuration directives have been renamed to required_score and rewrite_header (the latter capable of rewriting more than just the Subject header). Integration with Mail::Audit is no longer supported, and older Bayes database files may require special care to update to the new version.
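If you have many configuration files to go through, a small script can do the mechanical part of the upgrade. The sketch below is my own and makes some assumptions: it only renames required_hits to required_score, and it merely flags rewrite_subject lines for manual review, since the replacement directive takes different arguments.

```python
import re

RENAMED = {"required_hits": "required_score"}  # safe one-to-one rename
NEEDS_REVIEW = {"rewrite_subject"}  # replacement takes different arguments


def migrate_config(text):
    """Return (new_text, warnings) for a SpamAssassin-style config string."""
    new_lines, warnings = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        # Comment lines start with '#', which \w does not match, so they are skipped.
        match = re.match(r"^(\s*)(\w+)(.*)$", line)
        if match:
            indent, directive, rest = match.groups()
            if directive in RENAMED:
                line = indent + RENAMED[directive] + rest
            elif directive in NEEDS_REVIEW:
                warnings.append(f"line {number}: '{directive}' needs manual review")
        new_lines.append(line)
    return "\n".join(new_lines), warnings
```

Run it on a copy of your configuration first and read the warnings before putting the result into production.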

Let's hope that this release will reduce the amount of spam reaching our inboxes :)
