[WEB4LIB] RE: re webmaster address on web site

Rich Kulawiec rsk at magpage.com
Fri Aug 9 21:35:50 EDT 2002


On Fri, Aug 09, 2002 at 10:07:35AM -0700, Keith Higgs wrote:
> After something just over a month of diligently processing
> unsubscribe/remove/etc... Thingies I am now down to about 80 per day
> from a high somewhere near 140. In that time I have also added myriad
> filtering rules to my email client to weed out repeat offenders.

I have some good news and some bad news.  (I spend a lot of time fighting
spam; I have for many years, including publishing what I think is the first
anti-spam program, "junkmail", on Usenet in the mid-1980s.  I sympathize
with your plight completely.)

The good news is that sophisticated tools exist to trap/filter/block/label
large chunks of spam.  I've put up a page that describes some of them here:

	http://www.firemountain.net/support/antispam.html

The first part is a general introduction; the list of tools and
other resources follows in the second part of the page.

In particular, you might want to look at SpamAssassin and Vipul's Razor;
you might also want to look at the SPEWS blacklist, whose effectiveness
is perhaps best measured by the steady parade of spammers who show up
in Usenet's news.abuse.net-admin.email whining about being blocked. ;-)

Some of the newer anti-spam tools (like SpamAssassin and Vipul's Razor)
use collaborative filtering and heuristics based on large accumulated
collections of spam.  The collaborative part is based on the observation
that if you and I and 500 other people all start getting the same message
at the same time from the same source, then (barring membership on common
mailing lists, which is accounted for) it's almost certainly spam.
The heuristic part is based on the observation that certain spamware
(aka "ratware") leaves characteristic signatures on messages, as
do certain spammers; additionally, certain text strings in combination
with certain other ones indicate a degree of "spamminess" in a message.
So do mismatches in timestamps, Message-IDs, originating IP and
claimed hostname, and so on.
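
To make the two ideas concrete, here is a minimal sketch in Python.  It
is not how SpamAssassin or Vipul's Razor are actually implemented: the
class, the rule names, the weights and the message fields are all
invented for illustration.  The first half counts how many distinct
recipients reported the same (normalized) message body; the second half
adds up weights for simple content and header checks.

```python
import hashlib
from collections import defaultdict


def fingerprint(body: str) -> str:
    """Hash a lowercased, whitespace-normalized body so that trivially
    different copies of one message map to the same key."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class SharedCatalogue:
    """Toy stand-in for a shared database of message fingerprints
    (hypothetical; real systems use far more robust signatures)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.recipients = defaultdict(set)

    def report(self, recipient: str, body: str) -> None:
        self.recipients[fingerprint(body)].add(recipient)

    def looks_bulk(self, body: str) -> bool:
        seen_by = self.recipients.get(fingerprint(body), set())
        return len(seen_by) >= self.threshold


# Hypothetical heuristic rules: (name, weight, test on a message dict).
RULES = [
    ("SUBJECT_ALL_CAPS", 1.5, lambda m: m["subject"].isupper()),
    ("REMOVE_ME_TEXT", 2.0, lambda m: "remove me" in m["body"].lower()),
    ("HOST_MISMATCH", 2.5, lambda m: m["claimed_host"] != m["resolved_host"]),
]


def score(message: dict):
    """Return the total weight and the names of the rules that fired."""
    hits = [(name, weight) for name, weight, test in RULES if test(message)]
    return sum(w for _, w in hits), [n for n, _ in hits]
```

A message would then be treated as spam when its total crosses some
threshold, or when enough independent recipients have reported it.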

These two strategies combine to provide two highly desirable features
in an anti-spam system: low false positive rate/high true negative rate.

One of the nice features of SpamAssassin is that it can be configured not
to block spam, but to pass it through with the addition of some lines
in the header like this:

	X-Spam-Status: No, hits=-4.2 required=5.0
		tests=IN_REP_TO,FROM_ENDS_IN_NUMS,UPPERCASE_25_50,PGP_SIGNATURE

which defers final decision-making to the end user's client, which in turn
means that users can tweak the threshold to what best suits them.
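
As a rough illustration of that client-side decision, the sketch below
reads the "hits=" value from an X-Spam-Status header in the format shown
above and compares it against a threshold chosen by the user.  The
function names and the default threshold are assumptions made for this
example, and it only understands the header layout quoted above.

```python
import re
from email import message_from_string

# Matches the "hits=<number>" field from the header format shown above.
HITS_RE = re.compile(r"hits=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message: str):
    """Return the score from X-Spam-Status, or None if it is absent."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Status")
    if header is None:
        return None
    match = HITS_RE.search(header)
    return float(match.group(1)) if match else None


def is_spam_for_me(raw_message: str, my_threshold: float = 5.0) -> bool:
    """Apply the user's own threshold instead of the server's."""
    value = spam_score(raw_message)
    return value is not None and value >= my_threshold
```

A cautious user might lower my_threshold to catch more spam at the cost
of more false positives; a user who cannot afford to lose mail would
raise it.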

Now the bad news, point by point. ;-)

1. Never, ever, ever, ever reply to a spammer no matter how enticing
and how legitimate the "remove me" part of the message looks.
Experiments have been conducted using tagged addresses (and other
means).  They indicate that, with exceedingly rare exceptions, you will
(a) confirm that your address is working and (b) confirm that it is being
read; therefore (c) you will get more spam from the same spammer, albeit
from other domains and addresses, and (d) you will raise the value
of your address on the lists which spammers sell to other spammers
(known-working addresses have higher resale value), thus (e) helping
put extra money in the spammer's pocket.

There may be a considerable delay before (c) kicks in: spammers are
clever enough to wait a while so that it's harder to figure out the
cause and effect relationship.
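
For readers unfamiliar with tagged addresses, here is a small sketch of
the idea.  It assumes a mail server that delivers "user+tag@domain" to
"user@domain" (a common convention, but not a universal one); the
function names and the example addresses are made up.  Each address is
handed to exactly one party, so the tag on any later spam shows where
the address leaked from.

```python
def tagged_address(local: str, domain: str, tag: str) -> str:
    """Build an address that is given out to exactly one party."""
    return f"{local}+{tag}@{domain}"


def tag_of(address: str):
    """Recover the tag from a recipient address, or None if untagged."""
    local = address.split("@", 1)[0]
    return local.partition("+")[2] or None
```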

2. Setting up your MTA (sendmail, postfix, etc.) with SpamAssassin and DNSBLs
like SPEWS and local blocking, etc., is definitely a non-trivial task.
However, there are an increasing number of how-to's and FAQs being written
to try to make it easier to deal with.  Note that nearly all of this presumes
you're running Unix/Linux on the mail server, mostly because the people
doing the development work tend to be doing that.  (There are tools for
other operating systems, though.)
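
To give a feel for what a DNSBL check involves, here is a minimal sketch
of the usual lookup convention: reverse the octets of the connecting
IPv4 address, append the blacklist's zone, and see whether the resulting
name resolves.  The zone name in the usage note is a placeholder, not a
real list, and a production MTA would do this through its own
configuration rather than a script like this.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Turn 192.0.2.99 + zone into 99.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the query name resolves, which conventionally means the
    address is listed.  Any lookup failure is treated as 'not listed'."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False
```

For example, is_listed("192.0.2.99", "dnsbl.example.org") would query
the name 99.2.0.192.dnsbl.example.org.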

3. Spammers are well aware of RFC 2142 (which requires some addresses and
recommends others) and you can be assured that the required addresses
mentioned in it will be spam targets whether or not you mention
them on a web page.  (It's worth noting that while it's rather
self-destructive for spammers to hit addresses of the form
"abuse@", they do it every day.)  However, you don't want to disable
those addresses because (among other ensuing bad things) you'll end up here:

	http://rfc-ignorant.org/

4. A number of ISPs have decided that they're perfectly happy to accept
spammers' money and continue to host them -- knowing full well that they're
using the ISP to send millions upon millions of spams.  Run "pink contract"
through your favorite search engine and you'll see what I mean.  It's
also worth noting that other ISPs, while not yet caught red-handed
with such a contract, simply stonewall complaints for years
at a time while continuing to cash spammers' checks.  (The UUNet abuse
desk auto-issued complaint ticket number 1,000,000 this year.)

A highly useful resource for tracking this -- in fact, it's used by the FTC
when tracking down scams -- is the Spamhaus Project; in particular, the
Register of Known Spam Operations, ROKSO.  It's here:

	http://www.spamhaus.org/rokso/

5. A number of supposedly-reputable companies have either spammed or
have hired spammers to do it for them.  For instance, the software company
DBase last month hired notorious pornography spammers WorldReach, and
then had the audacity to claim that their "research" into WorldReach
hadn't revealed any problems.

To read about DBase's adventures, go to Google Groups and search
for "dbase.com spam".  To find out in 30 seconds what DBase's "research"
on WorldReach failed to reveal, try searching for "worldreach spam".
To read about the growing use of spam by major companies, try searching
for "mainsleaze", which is the slang term used to describe this practice.

6. Good luck.  It's a tough fight trying to deflect the spam while
not impairing your ability to actually use email.  And (according
to several estimates) it's gotten much worse in 2002, with the figure
of a 600% increase in spam being the most oft-cited.  But it can be done.

---Rsk


