[WEB4LIB] multiple webserver visits by unidentified robot

Nick Arnett listbot at mccmedia.com
Wed May 5 13:11:15 EDT 1999


At 09:41 AM 5/5/99 -0700, Grace Garbe wrote:
>My webserver logs show that an unidentified robot visited my server on 24 
>April and requested the robots.txt file a total of 24 times within a 
>twelve hour period.  The requests began at approximately 1:00 am and were 
>repeated every thirty minutes for the next 12 hours.  I have tried to find 
>out to whom the IP address (209.67.247.153) is registered using UNIX 
>nslookup and using D. Richard Dowdy's Internet Address 
>Utility (http://www.world-net.net/cgi/rdowdy/netutil).  Neither utility 
>can ID this address.  I am able to ping it and that's all.
>
>I have 3 questions:
>1.  Why would a robot exhibit this kind of activity?
>2.  Does anyone recognize this IP address?
>3.  I am going to edit my robots.txt file to attempt to exclude this 
>robot.  Can that be done just by using the IP address since I don't know 
>the robot's name?

Traceroute shows that the address is at Exodus, which is a huge web hosting 
service.  This means that it could belong to a large search engine, but it 
doesn't tell us which Exodus customer it is.  Checking robots.txt is usually 
the first thing a web spider does before retrieving pages, but you seem to 
imply that this is all it is doing, which seems odd.  Is it really not 
retrieving any other pages?
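
One way to check whether the robot asked for anything besides robots.txt 
is to tally every path requested from that address.  The following is only 
a rough Python sketch.  It assumes the server writes Common Log Format, and 
the function name and the sample lines are made up for illustration; they 
are not taken from the actual logs.

```python
import re
from collections import Counter

# Assumption: the server writes Common Log Format, for example
# 209.67.247.153 - - [24/Apr/1999:01:00:12 -0700] "GET /robots.txt HTTP/1.0" 200 68
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def paths_requested_by(lines, ip):
    """Count how often each path was requested by the given client address."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(1) == ip:
            counts[match.group(4)] += 1
    return counts


# Made-up sample lines, only to show the shape of the input.
sample = [
    '209.67.247.153 - - [24/Apr/1999:01:00:12 -0700] "GET /robots.txt HTTP/1.0" 200 68',
    '209.67.247.153 - - [24/Apr/1999:01:30:09 -0700] "GET /robots.txt HTTP/1.0" 200 68',
    '192.0.2.10 - - [24/Apr/1999:01:31:40 -0700] "GET /index.html HTTP/1.0" 200 5120',
]

for path, hits in paths_requested_by(sample, "209.67.247.153").most_common():
    print(hits, path)
```

To run it against a real log, pass an open file instead of the sample 
list, for example `paths_requested_by(open("access_log"), "209.67.247.153")`, 
where `access_log` stands in for wherever your server keeps its log.  If 
/robots.txt is the only path that comes back, the robot really is fetching 
nothing else.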

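Exclusion by IP address should be done in your Web server's configuration, 
not in the robots exclusion file.  Robots.txt identifies robots by the name 
they send in their User-agent header, not by address, and it only has any 
effect if the robot chooses to obey it.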
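
To illustrate why an IP address in robots.txt does nothing, here is a 
small Python sketch that uses the standard library's `urllib.robotparser` 
as a stand-in for a well-behaved robot.  The robot names `SomeBot` and 
`BadBot` are hypothetical, and a real robot's parser may differ in its 
details.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text):
    """Build a parser from robots.txt text, as a polite robot might."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# An attempt to exclude by address: the value is treated as a robot name.
by_address = parse_rules(
    "User-agent: 209.67.247.153\n"
    "Disallow: /\n"
)

# Exclusion by the name the robot announces (hypothetical name "BadBot").
by_name = parse_rules(
    "User-agent: BadBot\n"
    "Disallow: /\n"
)

# A robot calling itself SomeBot matches neither record, so it is allowed,
# no matter which address it connects from.
address_rule_blocks_somebot = not by_address.can_fetch("SomeBot/1.0", "/index.html")
name_rule_blocks_badbot = not by_name.can_fetch("BadBot/2.0", "/index.html")
name_rule_blocks_somebot = not by_name.can_fetch("SomeBot/1.0", "/index.html")

print("address rule blocks SomeBot:", address_rule_blocks_somebot)
print("name rule blocks BadBot:", name_rule_blocks_badbot)
print("name rule blocks SomeBot:", name_rule_blocks_somebot)
```

Only the second check should report a block.  The parser never sees the 
client's address at all, which is why address-based exclusion has to 
happen in the server.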

I've copied your message to the robots list, which I host 
(http://www.mccmedia.com/html/discussion.html), to see what else I can find 
out.  Most of the people who create web robots subscribe to that list.

Nick

