number of documents in the world wide web?
Peter Graham, Rutgers University Libraries
psgraham at gandalf.rutgers.edu
Tue Oct 17 17:19:27 EDT 1995
Distinct Web pages: You have to remember that there is a small but noticeable
percentage of pages that Lycos et al. don't know about. My Web server, like
others I'm sure, allows me to prevent web crawlers from investigating my
pages. I do so, mainly because they clog up my log with multiple entries,
particularly
the local Rutgers crawler that tries to run through my machine every night.
I have identified at least 8 or 10 different crawlers that try to index my
server; I find it a nuisance.
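
As an illustration only: the post does not say which mechanism the server
uses to keep crawlers out. One widely used approach is a robots.txt file
under the Robots Exclusion convention, which well-behaved crawlers consult
before fetching pages. The sketch below uses Python's standard
urllib.robotparser module to show how a crawler would interpret such a file.
The file contents, the crawler name "ExampleCrawler", and the example.org URL
are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay out of the
# whole site. This is an assumed example, not the author's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleCrawler" and the URL are placeholders for illustration.
    allowed = crawler_may_fetch(
        ROBOTS_TXT, "ExampleCrawler", "http://example.org/pghome.html"
    )
    print("fetch allowed:", allowed)
```

Pages excluded this way never reach an index built by compliant crawlers,
so they would not show up in a count derived from that index. Compliance is
voluntary on the crawler's side; a server that wants to enforce the exclusion
would need its own access controls.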
I suspect others do too (not to mention security-conscious corporate or
governmental sites), so the 10M number is likely to be an underestimate by
some small but noticeable amount. --pg
Peter Graham psgraham at gandalf.rutgers.edu Rutgers University Libraries
169 College Ave., New Brunswick, NJ 08903 (908)445-5908; fax (908)445-5888
<URL:http://aultnis.rutgers.edu/pghome.html>