UNBlocked by Cyber Patrol

Jamie McCarthy jamie at mccarthy.org
Wed Dec 24 18:57:10 EST 1997


David Burt writes:

>The authors of "Blacklisted" have produced a list of 67 URLs that are
>said to be inappropriate blocks by the filtering product Cyber Patrol.
>Of these 67 blocks 58 were unblocked within 24 hours of notification,

I assume you haven't read

http://www.spectacle.org/cwp/problems.html
http://www.spectacle.org/cwp/preempts.html

The point is emphatically _not_ statistics, as anyone reading those
pages would understand.

>The authors claim they were only using a portion of the blacklist,

Yes, this is what I "claim" (on the fight-censorship mailing list).

And it's also "claimed" at <http://www.spectacle.org/cwp/gaysites.html>:
"There are over 50,000 entries in the Cyber Patrol database and most of
them have not been investigated."

>yet refuse to produce the list,

This canard again?  I would be sued down to my skivvies (and under the
just-passed law, I could go to prison).

I could try to arrange for the list to be anonymously mailed to David,
and then _he_ can produce it for us.  How about that?

>or to even estimate how large a portion they were using,

Nobody asked.

We looked through about 5,000 to 7,000 URLs with varying degrees of
care.  (Once we found our first hundred blocked sites, we stopped
looking as carefully and started skimming.  And, some volunteers did
a better job than others, though I'm not going to name names :-)

Lest David rework his figures using those numbers, let me repeat:  we
do not claim to have done an exhaustive search of any part of the
list.  Please don't make bad statistics worse by making assumptions.

>or to describe what methodology was used to review the list.

A lie.

David Burt asked about this at 8 AM Tuesday.  I talked about our
methodology at some length six hours later on the fight-censorship
mailing list, a post which was also Cc'd directly to David.  At 11 AM
the next day, he claims we "refuse...to describe" that methodology.

>Therefore, the 67 blocks cannot represented as a statistical sample
[snip]

That much is correct!  :-)  They cannot be represented as a
statistical sample.

Yet this is exactly what we see attempted below.

>Of these 67 blocks 58 were unblocked within 24 hours of notification

Slow down -- so you're saying there were 58 bad blocks found, including
one that wrongly blocked 1.4 million pages, and another two that wrongly
blocked over a hundred thousand pages?

>Of the remaining 9 blocks,

Ahem.  How about the 300 newsgroups that we pointed out were blocked?
Can you tell us, David, if soc.feminism is still blacklisted as
"SexActs" (a category that will be blocked by any public library that
uses the software)?  Is misc.health.injuries.rsi.moderated blocked as
SexActs as well?

Is rec.games.chess.analysis still blocked as gambling and "intolerance"
(maybe the words "black" and "white" appear too frequently)?

>4 are clearly selling pornography:
>
>1) http://www.instantaccess.com    Click on "visit the sites" button.

Huh?  No such button.

I scanned the site with some software I have access to, and was unable
to find anything that looked inappropriate amid hundreds of pages on
the site.  I found a _link_ to <http://www.xnetmag.com/>.  One link.

David Burt is smearing a website as "clearly selling pornography"
because, out of its hundreds of webpages, which actually sell software
games like flight simulators, it has _one link_ to an explicit site.

>2) http://www.drjack.com Advertises itself as "Dr Jack's Things is One Of
>Northern California's Hottest BBS's Specializing in Asian and  other Adult
>Images - Over 40,000 files online! 13 gigabytes"  
>http://www.drjack.com/djthing/

That's _one_ site on Dr. Jack's server.  The others include
<http://www.drjack.com/htmlview/>, some graphics viewing software which
received an excellent rating from TUCOWS and can be downloaded and
registered online (as long as you're not using Cyber Patrol, that is).

As we pointed out in http://www.spectacle.org/cwp/overbroad.html ,
there are varying levels of overbroad blocking.

Your constant apologias for software blocking other sites, even dozens
or thousands of other sites, based on allegedly explicit material on
different sites -- that makes our point for us.
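
To make the host-level versus directory-level distinction concrete, here
is a minimal sketch in Python.  It is only an illustration:  Cyber Patrol's
actual matching logic is not public, and the function names and rule
formats below are hypothetical.  The two URLs are the ones discussed above.

```python
from urllib.parse import urlparse

def blocked_by_host(url, blocked_hosts):
    """Host-level rule: every page on a listed host is blocked."""
    return urlparse(url).hostname in blocked_hosts

def blocked_by_directory(url, blocked_prefixes):
    """Directory-level rule: only pages under a listed (host, path prefix) pair are blocked."""
    parts = urlparse(url)
    return any(
        parts.hostname == host and parts.path.startswith(prefix)
        for host, prefix in blocked_prefixes
    )

# Hypothetical rule sets, invented for this illustration.
host_rules = {"www.drjack.com"}
dir_rules = [("www.drjack.com", "/djthing/")]

urls = [
    "http://www.drjack.com/djthing/",   # the adult BBS page
    "http://www.drjack.com/htmlview/",  # the graphics viewing software
]

for url in urls:
    print(url,
          "host rule:", blocked_by_host(url, host_rules),
          "directory rule:", blocked_by_directory(url, dir_rules))
```

Under the host rule both pages are blocked; under the directory rule only
the /djthing/ page is.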

>3) http://www.satisfaction.simplenet.com/    Appears to be a shut down
>porn site.

Based on what, exactly?  The word "satisfaction" too racy for you?

Or just the fact that Cyber Patrol lists it as porn?

This is one of the most insidious things about the software:  everything
it says is taken as gospel truth.  How many library patrons, faced with
a mysterious police-like badge forbidding them from seeing a site, will
go to the trouble (or have the nerve) to ask the librarian to allow them
access?

>4) http://phantom.datamg.com   Contains an "XXX network" at
> http://phantom.datamg.com/links.html

That's one link to a porn site on another network.  (Again, this is not
porn; there's no explicit material at the site.  David Burt is smearing
the website as "clearly selling pornography" because of a _link_.)

This is exactly what we discussed in our report at

   http://www.spectacle.org/cwp/overbroad.html

Is it fair to block the "Kids' Playground" at
<http://phantom.datamg.com/kids1.html> simply because there is _one_
link elsewhere on the site to material considered unsuitable?  Again,
you make the overblocking point for us.


Please note:

In short, what we have here is The Learning Company, _after_ being told
about 67 inappropriate web blocks in a very public manner, _continuing_
to block somewhere between four and nine of the sites at the IP level,
despite the fact that three of them have no explicit material on the
site, and that the fourth should be blocked at the directory level.

Now _this_ is a real news story.  Maybe we'll put up a webpage about it.


>Because the number of sites presented is one tenth of one percent of
>the total,

The number of _sites_ presented was well over 700,000.  In fact, the
number of blocked _webpages_ presented was approximately _seven_ tenths
of one percent OF THE WORLD WIDE WEB.  (There are about 200 million pages
on the web, 1.4 million of which were blocked at members.tripod.com in
every category Cyber Patrol has.)

Another way to put it is that, until today, Cyber Patrol was easily
blocking a hundred innocent webpages for each explicit one.  _Easily_.
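
For anyone who wants to check the share-of-the-web figure quoted above,
the arithmetic is short.  This sketch uses only the two numbers given in
this message (about 200 million pages on the web, 1.4 million blocked at
members.tripod.com); both are rough estimates, and the variable names are
just for illustration.

```python
# Rough figures taken from the message itself; both are estimates.
total_web_pages = 200_000_000      # "about 200 million pages on the web"
blocked_tripod_pages = 1_400_000   # pages blocked at members.tripod.com

# Fraction of the whole web covered by that single block.
share = blocked_tripod_pages / total_web_pages

print(f"{share:.1%} of the web")   # prints "0.7% of the web"
```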

Can you tell us what percentage of the entire web a product must wrongly
block before you'll cease recommending it?  One percent?  Two percent?
Ten percent?  Fifty percent?

What amount of overbroad blocking is acceptable?  One innocent page for
each ten guilty ones?  One for one?  Two for one?  A hundred for one?

>well within any reasonable persons margin of error,

If we tried this on David Burt's claims above, we'd get interesting
results.  He claimed that four sites were "clearly selling pornography."
Only one of them was (along with other, nonexplicit, material blocked
on the same site).  Is 75% mistakes a reasonable margin of error?

Do you see how ridiculous the "statistics" game is?

"Figures don't lie, but..."

>and the bad blocks were removed within 24 hours, Filtering Facts has
>no hesitations about continuing to recommend Cyber Patrol.

Just as Filtering Facts recommended Cyber Patrol two days ago, when
they were blocking over a _million_ web pages at members.tripod.com.

What's still lurking in the list?

The point, as we state repeatedly, is not the sites themselves or the
fact that they were unblocked as we knew they would be.  The point is
the inherent flaws in the system that force bad blocks to always be
present in censorware's secret blacklists.

And this point remains unaddressed.


--
 Jamie McCarthy                                     jamie at mccarthy.org
 http://www.absence.prismatix.com/jamie/



