Robots
tdowling at lib.washington.edu
Tue Sep 26 11:54:12 EDT 1995
http://info.webcrawler.com/mak/projects/robots/norobots.html
Note that the file you need is robots.txt, not robots.html.
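By way of illustration (this example is not from the page above, and the paths are hypothetical), a minimal robots.txt placed at the server root might read:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/

The asterisk applies the rules to all robots; each Disallow line names a path prefix that compliant robots should not retrieve. A robots.txt with an empty Disallow value allows the whole server to be harvested.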
Thomas Dowling
Networked Information Librarian, Public Services
University of Washington Libraries
tdowling at u.washington.edu
Note from: Mia Massicotte <MIAMASS at VAX2.CONCORDIA.CA>
Thu, 21 Sep 95 17:07:39 PDT
----------------------------------------
% Walter W. Giesbrecht (walterg at yorku.ca) asked:
%
% >On another note: my access log notes about a dozen attempts to
% >retrieve a file called robots.txt from the root directory of the
% >server. Such a file has never existed here, and my colleagues who
% >manage other servers on campus have noticed the same thing. Several
% >of them have come from query2.lycos.cs.cmu.edu, which might be a
% >Lycos search & index attempt; others are not obvious. Any ideas?
%
% If I recall, robots.html is a file you include on your server if you do not
% want your server to be hit by a robot. The robot looks for such a file; if it
% exists, the robot disregards your server for harvesting. There is an
% explanation somewhere on the net of how robots work, where this is covered.
% Sorry, I don't have the URL on hand; perhaps someone else does?
%
% Mia Massicotte, Systems Librarian
% Concordia University Library, Montreal, Quebec CANADA
% miamass at vax2.concordia.ca
More information about the Web4lib mailing list