Lee Tien of the Electronic Frontier Foundation and of the PUB (our local hangout) posed a challenge to me after I asked him for an un-paid internship at EFF. The challenge is to think of a way to increase the marginal cost for people who send spam emails. Here's the situation: since there is no marginal cost to sending spam (each additional email doesn't cost the sender anything in terms of cash or time), spammers can overwhelm the carrying capacity of the internet, slowing down legitimate information exchange by crowding the network with pseudo Viagra ads and pathetic porn come-ons (I mean, can't they at least be funny or creative?). Recently, AOL said they had to filter A BILLION pieces of spam in a single day!
Spam is a tough problem to solve. Some have suggested government involvement through legislation against spam. A recent proposal in California would require spammers to put "ADV:" in the subject field of all email advertisements. Another solution is to have people use spam filters at the receiving end to sort out the junk. There is a major problem with both solutions: neither one solves the "crowding out the network" problem that comes from spam's zero marginal cost. A labeled or filtered message has already traveled across the network by the time anyone deals with it.
My idea is simple. It includes a tiny bit of consumer activism and a centrally located database. Recipients would forward any message they consider to be spam to this database. The database would use a Bayesian filter to decide whether each forwarded message was indeed spam or legitimate mail. To build up the filter's initial confidence levels, volunteers would feed the system samples of legitimate mail and samples of spam. For more information on Bayesian filters, see Arnold Kling's recent explanation.
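To make the filtering step a little more concrete, here is a minimal sketch of a naive Bayes classifier of the kind such a database might use. It is only an illustration under assumptions of my own: the class name `NaiveBayesFilter`, the whitespace tokenizer, and the add-one smoothing are all hypothetical choices, not part of any existing system.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy naive Bayes spam filter (hypothetical sketch, not a real system)."""

    def __init__(self):
        # Word frequencies and message counts for each class.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one volunteer-labeled sample; label is 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated P(spam | text).

        Assumes at least one spam and one ham sample have been trained.
        """
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Start from the class prior, then add per-word log likelihoods.
            score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing so unseen words do not zero out the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap viagra buy now", "spam")
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("lunch at the pub tomorrow", "ham")
spam_filter.train("meeting notes attached", "ham")
```

With these four training samples, `spam_filter.spam_probability("buy cheap viagra")` comes out well above 0.5 and `spam_filter.spam_probability("lunch meeting tomorrow")` well below it. A real filter would need far more training data and a smarter tokenizer.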
The system would also weigh the number of messages originating from any particular email address. If the filter decides that the messages are indeed spam and enough recipients forward them to statistically reject the null hypothesis (null: not spam), the system will publicize the email address as a "spammer". Now, this database would not deal with enforcement. It would simply be a database of spammer email addresses. It would be up to individual ISPs to take action against spammers. It would work because the cost of spam crowding the networks falls directly on ISPs, so they have the incentive to enforce. In effect, you have a centrally located database that ISPs would use to remove spammers. There are three major advantages to such a system:
1. The more spam mail you send out, the more likely you are to land on the spammers list, which increases the marginal cost of each additional piece of spam you send. Caveat: consumers bear some reporting cost, which is balanced by the reduction of spam in their mailboxes. It will take time to see whether that cost/benefit trade-off creates a workable system.
2. Bayesian filtering and recipient reporting reduce the dictatorial nature of the current MAPS system. They replace an autocratic system with a democratic one. They also get the government out of the process (which many people would view as a good thing).
3. The system of enforcement would be robust and distributed. It would be in the interest of each ISP to remove people on the "spammers list", and therefore the costs and benefits are well assigned. ISPs cleaning up spammers is analogous to restaurants cleaning up after dirty customers: it's a cost that they should be willing to shoulder.
Okay, there's a fourth important point:
4. If a non-profit like the EFF created and maintained the system, it would add some balance to the electronic battle between the dark forces of Industry and the naive Consumer.
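For the curious, the "reject the null hypothesis" rule for listing an address could be sketched as a one-sided binomial test. This is just one possible reading of the idea, and every number in it is an assumption of mine: the function names, the filter's assumed false-positive rate of 5%, the 1% significance level, and the minimum of five reports are all hypothetical. It uses `math.comb`, so it needs Python 3.8 or later.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def should_list_sender(spam_flagged, total_reports,
                       false_positive_rate=0.05, alpha=0.01, min_reports=5):
    """Decide whether to publish a sender's address as a spammer.

    Null hypothesis: the sender is not a spammer, so the filter flags each
    forwarded message only by mistake, at the assumed false-positive rate.
    All default values here are illustrative assumptions.
    """
    if total_reports < min_reports:
        # Too few reports to say anything, however they were classified.
        return False
    # Chance of seeing this many flagged reports if the null were true.
    p_value = binomial_tail(spam_flagged, total_reports, false_positive_rate)
    return p_value < alpha
```

Under these assumptions, `should_list_sender(9, 10)` returns `True`, `should_list_sender(1, 10)` returns `False`, and `should_list_sender(3, 3)` returns `False` because there are too few reports. The more distinct reports that pile up against one address, the easier it is to cross the threshold, which is where the rising marginal cost comes from.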
Now the harder problem... how do I go about getting that un-paid internship?