number of documents in the world wide web?

Peter Graham, Rutgers University Libraries psgraham at gandalf.rutgers.edu
Tue Oct 17 17:19:27 EDT 1995


Distinct Web pages:  You have to remember the small but noticeable
percentage of pages that Lycos et al. don't know about. My Web server, and
I'm sure others, allows me to prevent web crawlers from investigating my
pages.  I do so, mainly because they clog up my log with multiple entries,
particularly the local Rutgers crawler that tries to run through my machine
every night.  I have identified at least 8 or 10 different crawlers that try
to index my server; I find it a nuisance.
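
For anyone wondering what keeping crawlers out can look like in practice, here
is a minimal sketch. It assumes the robots exclusion convention (a robots.txt
file at the server root), which is only one possible mechanism and not
necessarily the one used on the server described above. The file contents, the
crawler name "ExampleCrawler", and the host www.example.edu are all
hypothetical. The convention is voluntary, so it only works for crawlers that
check the file before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every crawler to stay off the whole server.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical crawler name and URL, for illustration only.
    allowed = crawler_may_fetch(
        ROBOTS_TXT, "ExampleCrawler", "http://www.example.edu/pghome.html"
    )
    print("crawler may fetch page:", allowed)
```

A crawler that honors such a file never requests the excluded pages, so they
never reach its index, and any count built from that index misses them.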

I suspect others do too (not to mention security-conscious corporate or
governmental sites), so the 10M number is likely to be an underestimate by
some small but noticeable amount.  --pg

Peter Graham    psgraham at gandalf.rutgers.edu    Rutgers University Libraries
169 College Ave., New Brunswick, NJ 08903   (908)445-5908; fax (908)445-5888
              <URL:http://aultnis.rutgers.edu/pghome.html>

