[Web4lib] More fun with google

Karen Coyle kcoyle at kcoyle.net
Sat Jun 18 17:55:29 EDT 2005


If nothing else, Google provides hours of entertainment as we try to 
understand what it's really doing. I think of it as our own Da Vinci Code.

In any case, in reading through the Michigan agreement I came upon 
section 4.5.2 on "Security...", where the contract states:
  "Google shall implement technological measures (e.g. through use of 
the robots.txt protocol) to restrict automated access to any portion of 
the Google Digital Copy..."

First, does robots.txt actually *prevent* access? Couldn't someone 
choose to ignore it? Has anyone ever taken someone to court for ignoring 
a robots.txt command?
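
On the "prevent" question, a minimal sketch in Python (using the standard library's urllib.robotparser) may help show why robots.txt works as a request rather than a lock. The rules, the example.com URLs, and the "ExampleBot" name are all made up for illustration; this is not Google's code or anyone's actual crawler.

```python
from urllib.robotparser import RobotFileParser

# A tiny, hypothetical robots.txt, for illustration only.
RULES = """\
User-agent: *
Disallow: /search
"""


def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return what a *polite* crawler would conclude about this URL."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The check runs entirely on the crawler's side. A crawler that
    # never calls allowed(), or ignores its answer, is not stopped by
    # anything in the robots.txt file itself.
    for url in ("http://example.com/search?q=books",
                "http://example.com/about.html"):
        print(url, "->", "fetch" if allowed(url) else "skip (if polite)")
```

The point of the sketch is that the "disallow" is evaluated by the visitor, not the server, so actual enforcement would have to come from some other measure.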

Second, it is ironic that a company that has made its fortune sucking up 
the contents of other people's web sites has its own site almost 
entirely covered by "disallows." What is stated in the contract 
appears to be Google's general practice:
(Google's robots.txt)

User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml?
Disallow: /xhtml?
Disallow: /imode?
Disallow: /jsky?
Disallow: /pda?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /froogle_
Disallow: /print?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /translate?
Disallow: /ie?
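
For anyone curious how a crawler might read rules like these, here is a rough sketch in Python using the standard library's urllib.robotparser. It takes only a few lines from the listing above (with the first line written as "User-agent"); the "ExampleBot" name and the /about.html path are hypothetical. Python's parser applies the first rule that matches, so the Allow line for /searchhistory/ takes effect before the broader Disallow for /search. Other crawlers may resolve overlapping rules differently.

```python
from urllib.robotparser import RobotFileParser

# A handful of lines taken from the listing above; the rest are omitted.
SAMPLE = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /news
"""


def may_fetch(path: str, agent: str = "ExampleBot") -> bool:
    """Check a path against SAMPLE the way Python's parser reads it."""
    parser = RobotFileParser()
    parser.parse(SAMPLE.splitlines())
    return parser.can_fetch(agent, "http://www.google.com" + path)


if __name__ == "__main__":
    # /about.html is a hypothetical path that no rule in SAMPLE mentions.
    for path in ("/searchhistory/", "/search", "/groups", "/about.html"):
        print(path, may_fetch(path))
```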


-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle at kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------


