Open-source software refers to software programs for which the code is publicly available, and open to modification by anyone who cares to take a hack. For a problem like spam-fighting, these factors may turn out to be a huge advantage.

The key to SpamAssassin is its "rules." It looks at an e-mail and applies various rules to figure out whether it is likely to be spam. For example, there might be a rule that says "If the text 'Make Money Fast' appears, score ten points." (A total score of five points or higher earns an e-mail the spam label.) Rules range from simple ones, like the presence of specific text, to more complicated, geeky principles that analyze how and from where the message was sent.
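
For the curious, the scoring idea can be sketched in a few lines of Python (SpamAssassin itself is written in Perl). This is only an illustration, not the project's code. The "Make Money Fast" rule and the five-point threshold come from the description above; the rule names, the second rule and its score are invented for the example.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str            # hypothetical label, not a real SpamAssassin rule name
    pattern: re.Pattern  # text to look for in the message
    score: float         # points added when the pattern is found


# Illustrative rules only; the second rule and its score are assumptions.
RULES = [
    Rule("MAKE_MONEY_FAST", re.compile(r"make money fast", re.IGNORECASE), 10.0),
    Rule("CLICK_HERE", re.compile(r"click here", re.IGNORECASE), 2.0),
]

SPAM_THRESHOLD = 5.0  # "five points or higher earns an e-mail the spam label"


def score_message(text: str) -> float:
    """Add up the points of every rule whose pattern appears in the text."""
    return sum(rule.score for rule in RULES if rule.pattern.search(text))


def is_spam(text: str) -> bool:
    """Label the message as spam once its total reaches the threshold."""
    return score_message(text) >= SPAM_THRESHOLD
```

With these made-up rules, a message containing "Make Money Fast" crosses the threshold on its own, while one that merely says "click here" collects two points and is let through.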

New kinds of spam are being created all the time, of course, and coming up with new rules is a never-ending business. That's where having a large, distributed base of potential rule-makers is crucial. Say a hacker starts getting a lot of spam from Korean addresses; he writes a rule that identifies Korean spam, and adds it to SpamAssassin. Presto!

"The great thing about SpamAssassin is that it's so open," says Matt Sergeant, a key participant in the project. "Anyone can help by contributing rules, or spams that got missed, or helping us out with the main core bits of code. This gives us a huge advantage over some commercial alternatives such as Brightmail. The best part is that the people who work with SpamAssassin -- sysadmins -- are the ones who hate spam the most, and so are adamant to stop it. These people are tigers!"

"The open-source system really does help to generate a wide variety of rules," says Hughes. "It's the so-called million eyeballs principle ... lots and lots of people working together to solve a common problem."

Hughes notes that it's not just the rule-making that benefits from the process. SpamAssassin is designed to be modular: It's very easy to add a piece with specific functionality that takes advantage of whatever spam-fighting mechanism has been devised anywhere on the Net.

"Most other spam filters," says Hughes, "will do one thing. They'll search for text strings, or they will go look up things in the Realtime Blackhole List, or they'll calculate some kind of cryptographic checksum to see if it matches anything already known to be spam. But SpamAssassin tries to do everything. It's very easy to create a module that will connect to any kind of spam identification method that exists."

The Internet economy is littered with the corpses of companies that attempted to make a profit from open-source or free software. In 2002, it feels almost a little self-delusional to start enthusing about the commercial possibilities of code that anyone can hack on. But SpamAssassin may flourish because existing companies want to take advantage of its filtering engine for their own proprietary products, and in the meantime, are happy to return the favor by giving back some (though not all) improvements.

Matt Sergeant works for a company called MessageLabs that specializes in e-mail security. He says he was asked to develop an anti-spam detection engine to fit into the overall product.

"[In Oct. 2001] I searched around to see what was out there," says Sergeant, "and already back then SpamAssassin was the best open-source anti-spam solution bar none. So I took the code, and integrated it with our e-mail engine ... We're now seeing really great results with the combination of SpamAssasssin and my extensions."

Likewise, Hughes says he plans to keep plowing effort back into improving the open-source version of SpamAssassin, while holding onto the extensions that make it work with, say, Outlook, as the proprietary property of Deersoft. (SpamAssassin is licensed under the "Artistic License" devised by Larry Wall for the Perl scripting language, which essentially means that anybody can do pretty much anything they want with the code.)

It's a classic model for open-source software development: a group of parties coming together to collaborate on a common code-base that is of benefit to all -- not just to the individuals who want to fight spam, but also to companies with specific products that incorporate some aspect of spam-fighting.

But will it work in the long run? With the return of the World Birthday Web and the emergence of SpamAssassin, is hope finally on the horizon for those afflicted by junk e-mail? Or will the spammers just find a way around the latest technology, as they have found their way around every previous obstacle? Already, I've observed that the amount of spam escaping SpamAssassin's clutches is rising. Sure, there's a newer version of the software that we need to install, but a sysadmin only has so many hours in the day.

Justin Mason, one of the original leaders of the SpamAssassin project, believes that the problem of spam is "never going to be addressable by pure technology. The spammers are human too, and will always put plenty of effort into defeating whatever filters are out there. As a result, it'll always be an 'arms race' between spammers and the filter developers and users, requiring frequent updating of filters ... With enough work from the anti-spam community (and sysadmins using anti-spam tools!) -- which seems to be forthcoming enough -- we can keep ahead and make their lives a whole lot harder. Fundamentally, though, no matter how hard we make it, I don't think we can 'defeat' spam."

Craig Hughes disagrees.

"I think it is [a winnable battle]," says Hughes, "and here's why. Ultimately spammers need to get a commercial message through that you will respond to. If they don't, there is no point in sending it. They could send you random gibberish, but they won't get any benefit. As long as there is a requirement that they need to make money or get a response I think we can construct ways of catching those messages, and distinguishing those messages.

"It's definitely an arms race -- our filters will get better, and spammers will get smarter. But the best spammer in the world has to beat us, and we only have to beat the average spammer. It's only a problem if too many messages get through, as long as we stay ahead of the average spammer, we're OK."

I want to believe Hughes, although I have a sneaking suspicion that a gallows or a guillotine might be the only technology that really has a hope of deterring spammers. And even the best filtering engine in the world does nothing to address the load that spam puts on the Internet's infrastructure -- the processing and bandwidth resources that it consumes. Dan Quinlan, a hacker hard at work combatting the rising flood of spam emanating from Korea, believes that only a three-pronged attack will work -- one that utilizes filtering, legislation and wholly new e-mail protocols that make spam more difficult.

But until Congress gets its act together, we're going to have to depend on the best the geeks can do. That's nothing to sniff at. Judging by SpamAssassin -- and the kind of will that says, by gum, I'm going to send out birthday greetings no matter how hard the spammers try to stop me -- the geeks aren't going down easy.
