[Web4lib] htaccess to block multiple IP addresses
John Hubbard
hubbardj at uwm.edu
Thu Jun 23 09:57:10 EDT 2005
Hi Robert,
The syntax looks fine. You can add your own IP address to test it.
Dropping the trailing octets will block progressively larger address ranges:
deny from 220
deny from 61.3.204
for example.
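(The first line blocks everything in 220.*.*.*; the second blocks all
of 61.3.204.*.)

As for your first question: no, you don't need a separate <Limit> block
per address -- multiple deny lines can share one block. A minimal
sketch using the two addresses from your message:

<Limit GET POST>
order allow,deny
allow from all
deny from 213.239.236.18
deny from 65.19.150.232
</Limit>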
http://dmoz.org/Computers/Internet/Web_Design_and_Development/Authoring/FAQs,_Help,_and_Tutorials/Access_Control/
has some more reading if you're interested in doing anything fancier.
Other methods may be useful if you're having problems with a specific
bot. You can block crawlers by their (reported) HTTP User-Agent string
-- "Googlebot", for example -- using mod_rewrite's RewriteEngine or
mod_setenvif's SetEnvIfNoCase.
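For instance, a rough sketch with mod_setenvif ("BadBot" here is just a
placeholder for whatever agent string you want to match):

# Tag requests whose User-Agent contains "BadBot"
SetEnvIfNoCase User-Agent "BadBot" bad_bot
<Limit GET POST>
order allow,deny
allow from all
# Refuse anything tagged above
deny from env=bad_bot
</Limit>

Or the mod_rewrite equivalent:

RewriteEngine On
# [NC] makes the match case-insensitive; [F] returns 403 Forbidden
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F]

Keep in mind a bot can lie about its User-Agent, so this only catches
the honest-but-rude ones.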
Last week there was a posting that suggested putting traps in
robots.txt to track "nasty" robots:
http://lists.webjunction.org/wjlists/web4lib/2005-June/037486.html
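The basic idea, roughly sketched (the /bot-trap/ path is just a made-up
example): list a directory in robots.txt that no legitimate crawler
should ever visit --

User-agent: *
Disallow: /bot-trap/

-- then watch your access logs for clients requesting /bot-trap/.
Anything that shows up there read robots.txt and ignored it, and you
can add its IP to your deny list.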
HTH,
- John
--
John Hubbard
Electronic Resources Librarian
University of Wisconsin-Milwaukee
414-229-6775
VanderHart, Robert wrote:
> I'm trying to block bots that don't obey the robots.txt directives.
> We're using Linux/Apache 1.3. I was wondering if the following syntax
> is correct in an .htaccess file in our root directory:
>
> <Limit GET POST>
> order allow,deny
> allow from all
> deny from 213.239.236.18
> </Limit>
>
> <Limit GET POST>
> order allow,deny
> allow from all
> deny from 65.19.150.232
> </Limit>
>
> 1) Is this the only way to block several IP addresses from different
> ranges?
>
> 2) Is there a more efficient way to block nasty robots?
>
> Robert Vander Hart
> Electronic Resources Librarian
> Lamar Soutter Library
> University of Massachusetts Medical School
> Worcester MA 01655
>
> Voice: 508-856-3290
> Email: Robert.VanderHart at umassmed.edu
> Web: http://library.umassmed.edu
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/