Hiding draft pages from browsers, search engines
Web Publishers Virtual Library
arnett at alink.net
Sun Mar 22 11:11:18 EST 1998
At 06:53 AM 3/22/98 -0800, morganj at iupui.edu wrote:
>However, if there is an index or home file is there any way to
>force a browser to bypass it and list the files in the directory?
Generally not. Since this is a security issue, I'd hesitate to say it is
flat-out impossible, but via the Web, it probably is. If you're also
running ftp or gopher on the same files, though, this does not hold:
those servers may well list the directory.
>Secondly, can these "hidden" files be indexed by search engines?
If you mean search engines that are indexing via the Web, they won't find
those files unless there's a link to them. I am not aware of any robot
that even tries to retrieve directories unless there is an explicit link to
them. This means that even if you don't have a default page for the
directory, a search engine robot probably won't find the directory listing
unless there is a link explicitly pointing to it. However, a locally
running robot that uses the file system, rather than the Web, typically
would find them.
>However, can search engines be set to ignore the index
>and home files and index all files in a directory on a remote web server?
There's really no standard for including files and directories in robot
directives. There is only a de facto standard for excluding them
(robots.txt). See
http://info.webcrawler.com/mak/projects/robots/norobots.html
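As a rough illustration of the exclusion convention, here is a minimal sketch in Python of how a well-behaved robot might consult robots.txt before fetching a page. It uses the standard library's urllib.robotparser. The robots.txt contents, the /staff-email/ directory, the host library.example.edu, and the robot name ExampleBot are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks every robot to stay out of /staff-email/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /staff-email/
"""


def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the robots.txt above permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # A page inside the excluded directory, then one outside it.
    print(allowed("http://library.example.edu/staff-email/list.html"))
    print(allowed("http://library.example.edu/hours.html"))
```

Note that the check runs entirely on the robot's side. Nothing here stops a robot that simply skips the call.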
>This came to mind when I contemplated having a local email directory, and
>began to think about how to make it available to individuals within the
>library but not to email spammers or search engines.
You could use robots.txt to exclude the entire directory from
*well-behaved* robots. However, it's simple to create a robot that ignores
robots.txt, so this might not work. Many Web servers allow you to limit
access to a directory by IP address. That's probably the thing to do --
only allow your own institution's addresses to access the email directory.
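In practice this restriction is set in the Web server's own configuration, and the syntax varies by server. The underlying check, though, can be sketched in Python with the standard ipaddress module. The address blocks below are reserved documentation ranges standing in for an institution's real networks, and the name is_allowed is hypothetical.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges); substitute the
# institution's actual address blocks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside one of the allowed networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    print(is_allowed("192.0.2.17"))   # inside the first placeholder block
    print(is_allowed("203.0.113.5"))  # outside both blocks
```

Unlike robots.txt, this is enforced by the server, so it does not depend on the robot behaving well.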
Nick
More information about the Web4lib mailing list