[WEB4LIB] multiple webserver visits by unidentified robot
Nick Arnett
listbot at mccmedia.com
Wed May 5 13:11:15 EDT 1999
At 09:41 AM 5/5/99 -0700, Grace Garbe wrote:
>My webserver logs show that an unidentified robot visited my server on 24
>April and requested the robots.txt file a total of 24 times within a
>twelve hour period. The requests began at approximately 1:00 am and were
>repeated every thirty minutes for the next 12 hours. I have tried to find
>out to whom the IP address is registered (209.67.247.153) using UNIX
>nslookup and using D. Richard Dowdy's Internet Address
>Utility (http://www.world-net.net/cgi/rdowdy/netutil). Neither utility
>can ID this address. I am able to ping it and that's all.
>
>I have 3 questions:
>1. Why would a robot exhibit this kind of activity?
>2. Does anyone recognize this IP address?
>3. I am going to edit my robots.txt file to attempt to exclude this
>robot. Can that be done just by using the IP address since I don't know
>the robot's name?
Traceroute shows that the address is at Exodus, which is a huge web hosting
service. This means that it could belong to a large search engine, but
it doesn't tell us which Exodus customer it is. Checking robots.txt is usually
the first thing a web spider does before retrieving pages, but you seem to
imply that this is all it is doing, which seems odd. Is it not retrieving
any other pages?
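
One way to check whether the robot fetched anything besides robots.txt is to tally the paths requested from that address in the access log. The following is a minimal sketch, assuming the server writes Common Log Format; the function name, the regular expression, and the sample lines are made up for illustration and are not taken from Grace's logs.

```python
import re
from collections import Counter

# Assumption: Common Log Format,
#   host ident authuser [date] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def paths_requested_by(lines, ip):
    """Count how often each path was requested by the given client address."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(1) == ip:
            counts[match.group(4)] += 1
    return counts


# Hypothetical sample lines, invented for this sketch (not real log data).
sample = [
    '209.67.247.153 - - [24/Apr/1999:01:00:02 -0700] "GET /robots.txt HTTP/1.0" 200 68',
    '209.67.247.153 - - [24/Apr/1999:01:30:01 -0700] "GET /robots.txt HTTP/1.0" 200 68',
    '192.0.2.10 - - [24/Apr/1999:01:31:12 -0700] "GET /index.html HTTP/1.0" 200 2048',
]

counts = paths_requested_by(sample, "209.67.247.153")
for path, hits in counts.most_common():
    print(path, hits)
```

If /robots.txt is the only path listed for that address, the robot really is doing nothing but polling the exclusion file.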
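Exclusion by IP address should be done in your Web server's access
configuration, not the robots exclusion file, which identifies robots by
name (the User-agent field) rather than by address.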
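
To illustrate why an IP address does not work in robots.txt, here is a small sketch using Python's standard urllib.robotparser module. The robot names (ExampleBot, UnknownBot) and the helper function are hypothetical, and real robots may interpret the file somewhat differently from this parser; the point is only that rules are matched against the name a robot announces.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules_text, user_agent, path):
    """Return True if the given rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, path)


# Rules keyed on a (hypothetical) robot name: this is what robots.txt expects.
rules_by_name = """\
User-agent: ExampleBot
Disallow: /
"""

# Rules keyed on an IP address: syntactically accepted, but the value is
# compared with the robot's announced name, not its network address.
rules_by_ip = """\
User-agent: 209.67.247.153
Disallow: /
"""

print(allowed(rules_by_name, "ExampleBot/1.0", "/index.html"))
print(allowed(rules_by_ip, "UnknownBot/1.0", "/index.html"))
```

The first call reports that the named robot is excluded. The second reports that the robot is still allowed, because no robot announces itself with that address as its name.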
I've copied your message to the robots list, which I host
(http://www.mccmedia.com/html/discussion.html), to see what else I can find
out. Most of the people who create web robots subscribe.
Nick