[Web4lib] More fun with google
Karen Coyle
kcoyle at kcoyle.net
Sat Jun 18 17:55:29 EDT 2005
If nothing else, Google provides hours of entertainment as we try to
understand what it's really doing. I think of it as our own Da Vinci Code.
In any case, in reading through the Michigan agreement I came upon
section 4.5.2 on "Security...", where the contract states:
"Google shall implement technological measures (e.g. through use of
the robots.txt protocol) to restrict automated access to any portion of
the Google Digital Copy..."
First, does robots.txt actually *prevent* access? Couldn't someone
choose to ignore it? Has anyone ever taken someone to court for ignoring
a robots.txt command?
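
On the first question, the protocol is advisory: the server only publishes the file, and each client decides whether to consult it. Here is a minimal Python sketch of that, using the standard library's urllib.robotparser. The function name may_fetch, the honor_robots flag, the rules and the example.org URL are all hypothetical, made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
RULES = """\
User-agent: *
Disallow: /private
"""


def may_fetch(url: str, honor_robots: bool = True) -> bool:
    """Return whether this client will go ahead and request the URL.

    The check runs entirely on the client side. A crawler that passes
    honor_robots=False simply skips it; the robots.txt file itself does
    nothing to block the request.
    """
    if not honor_robots:
        return True
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch("*", url)


# A polite crawler declines; one that ignores the file proceeds anyway.
print(may_fetch("http://example.org/private/report"))
print(may_fetch("http://example.org/private/report", honor_robots=False))
```

In other words, anything that actually has to be kept out of reach needs a server-side control (authentication, rate limiting and the like) in addition to the file.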
Second, it is ironic that a company that made its fortune sucking up
the contents of other people's web sites has its own site almost
entirely covered by "disallows." What the contract states appears to
be Google's general practice:
(Google's robots.txt)
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml?
Disallow: /xhtml?
Disallow: /imode?
Disallow: /jsky?
Disallow: /pda?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /froogle_
Disallow: /print?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /translate?
Disallow: /ie?
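
To see how a compliant crawler might read rules like these, here is a short Python sketch that feeds an excerpt of the list to the standard library's urllib.robotparser. The excerpt, the helper name allowed and the test paths are my own choices for illustration. Python's parser applies the first matching rule in file order, and other crawlers may resolve Allow/Disallow conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the rules quoted above (not the full file).
EXCERPT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /news
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())


def allowed(path: str) -> bool:
    """Ask the parser whether a generic crawler ("*") may fetch the path."""
    return parser.can_fetch("*", "http://www.google.com" + path)


# Hypothetical paths chosen to exercise the rules in the excerpt.
for path in ("/search", "/searchhistory/", "/groups", "/intl/en/about.html"):
    print(path, "allowed" if allowed(path) else "disallowed")
```

The Allow line comes before "Disallow: /search" in the file, so under first-match evaluation /searchhistory/ stays open while /search itself is closed. A path that matches no rule is allowed by default.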
--
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle at kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------