[WEB4LIB] RE: 404s (was Questia, Google, bricks & mortar)

Walt_Crawford at notes.rlg.org
Thu Feb 1 18:12:45 EST 2001


>They found that 80% of all links were 404 within a few years of citation;
>50% within six months.  I think that may actually overstate the problem,
>as Webmasters inevitably reorganize sites without necessarily invalidating
>the underlying content.

Traditionally (as in, "I don't have a reference handy"), common wisdom is
that the half-life of a URL is 40 days (that is, after 40 days, half of a
random selection of URLs will yield 404s).

That suggests that undergrad papers do a whole lot better than the Web as a
whole.
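
As a rough check on that comparison, here is a minimal sketch of the
arithmetic. It assumes simple exponential decay with a fixed 40-day
half-life, which the folklore figure does not actually specify; the function
name and the 182-day stand-in for "six months" are illustrative choices.

```python
# Sketch only: fraction of URLs still live after a given number of days,
# ASSUMING simple exponential decay with a constant half-life.

def surviving_fraction(days, half_life_days=40.0):
    """Return the expected share of URLs still resolving after `days`."""
    return 0.5 ** (days / half_life_days)

six_months = 182  # roughly six months, in days (assumed)
web_alive = surviving_fraction(six_months)

print(f"Web as a whole (40-day half-life): {web_alive:.1%} alive after six months")
print("Cited links in the quoted study:   50% alive after six months")
```

Under that assumption only a few percent of random URLs would survive six
months, against the half of cited links reported in the quoted study.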

I don't think the paper necessarily overstates the problem: the problem is
that the reference no longer works. Even if the content survives a site
reorganization, getting to the source document can take a varying amount of
time (from 2 minutes to infinity).

(This struck me because I'm finally writing up an article based on my
"study" of free electronic scholarly journals that were around in 1995. Of
the 104 journals in the ARL directory that met my criteria, 57 had URLs. Of
those 57, 17 were still live and leading to the same journal in 2001: 30%.
My comment in the article is that 30% after six years is a _good_ figure,
even for such "stable" items as journals.)
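
For anyone who wants to put the journal figure in half-life terms, here is a
minimal sketch using the counts above (17 of 57 URLs live after six years).
The conversion assumes constant exponential decay, which is an assumption
made purely for illustration, and the helper name is hypothetical.

```python
import math

# Sketch only: convert "alive out of total after elapsed time" into an
# implied half-life, ASSUMING constant exponential decay.

def implied_half_life(alive, total, elapsed):
    """Return the half-life, in the same unit as `elapsed`."""
    return elapsed * math.log(2) / math.log(total / alive)

share = 17 / 57                                    # journals still live
half_life_years = implied_half_life(17, 57, 6.0)   # 1995 to 2001

print(f"Still live: {share:.0%}")
print(f"Implied half-life: {half_life_years:.1f} years")
```

On that reading the journal URLs decay over a span of years rather than
weeks, which is the sense in which 30% after six years is a good figure.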

(Along those same lines, the effect in Google of something like the CIC
Electronic Serials List disappearing or the Colorado Alliance serials list
disappearing is temporarily disastrous, for the 6-8 weeks before Google
refreshes its indexes: the first link in almost every case is dead. I speak
from hours of very recent experience...)

Walt Crawford


