[Web4lib] htaccess to block multiple IP addresses

John Hubbard hubbardj at uwm.edu
Thu Jun 23 09:57:10 EDT 2005


Hi Robert,

The syntax looks fine. To test it, temporarily add your own IP address 
and confirm that you get a 403 Forbidden.
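
One simplification, which gets at your first question: as far as I know 
you don't need a separate <Limit> block per address. A single block will 
take any number of deny lines:

<Limit GET POST>
  order allow,deny
  allow from all
  deny from 213.239.236.18
  deny from 65.19.150.232
</Limit>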

Dropping the trailing octets from an address bans progressively larger 
ranges:

deny from 220
deny from 61.3.204

for example.
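
If memory serves, Apache 1.3's mod_access also accepts network/netmask 
pairs and CIDR notation, which let you cut ranges more precisely than 
whole octets:

deny from 61.3.204.0/255.255.255.0
deny from 61.3.0.0/16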

http://dmoz.org/Computers/Internet/Web_Design_and_Development/Authoring/FAQs,_Help,_and_Tutorials/Access_Control/
has some more reading if you're interested in doing anything fancier.

Other methods may be useful if you're having trouble with one specific 
bot. You can block crawlers by their (self-reported) HTTP User-Agent 
string, such as "Googlebot", using mod_rewrite or SetEnvIfNoCase; a 
sketch of the latter follows.
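
A minimal SetEnvIfNoCase example, assuming your Apache 1.3 has 
mod_setenvif and mod_access compiled in ("BadBot" is a stand-in for 
whatever agent string shows up in your logs):

# Tag any request whose User-Agent matches "BadBot" (case-insensitive)
SetEnvIfNoCase User-Agent "BadBot" bad_bot
<Limit GET POST>
  order allow,deny
  allow from all
  # Refuse requests carrying the bad_bot environment variable
  deny from env=bad_bot
</Limit>

Keep in mind that the nastiest robots simply lie about their User-Agent, 
so this only catches the rude-but-honest ones.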

Last week there was a posting that suggested putting traps in robots.txt 
to track "nasty" robots:
http://lists.webjunction.org/wjlists/web4lib/2005-June/037486.html
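
The trick, roughly: add a rule to robots.txt for a directory that 
nothing on your site actually links to (the path below is made up), 
then grep your access log for requests to it. Well-behaved robots will 
stay away, so any client that fetches that path has read robots.txt and 
ignored it, and is a candidate for your deny list:

User-agent: *
Disallow: /bot-trap/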

HTH,
- John

-- 
John Hubbard
Electronic Resources Librarian
University of Wisconsin-Milwaukee
414-229-6775




VanderHart, Robert wrote:
> I'm trying to block bots that don't obey the robots.txt directives.
> We're using Linux/Apache 1.3.  I was wondering if the following syntax
> is correct in an .htaccess file in our root directory:
> 
> <Limit GET POST>
>   order allow,deny
>   allow from all
>   deny from 213.239.236.18
> </Limit>
> 
> <Limit GET POST>
>   order allow,deny
>   allow from all
>   deny from 65.19.150.232
> </Limit>
> 
> 1) Is this the only way to block several IP addresses from different
> ranges?
> 
> 2) Is there a more efficient way to block nasty robots?
> 
> Robert Vander Hart
> Electronic Resources Librarian
> Lamar Soutter Library
> University of Massachusetts Medical School
> Worcester  MA  01655
> 
> Voice: 508-856-3290
> Email: Robert.VanderHart at umassmed.edu
> Web: http://library.umassmed.edu 
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/

