[WEB4LIB] RE: Seattletimes.com: Public to taste life without its libraries

Andrew I. Mutch amutch at waterford.lib.mi.us
Tue Aug 20 21:33:19 EDT 2002


Whether Google or other sites cache SPL's pages really doesn't seem to
matter. How many visitors to SPL's web site are going to get there
directly through Google's caching feature? I feel safe betting very few.
In all likelihood, whether through links, Google, or some other route,
they'll follow a link that will either 404 or, more likely, display a
page explaining why the site is not available. At that point, the educated
searcher may go through Google or Alexa or some other cache service and
find the page they want, since it appears many of SPL's pages are static.
But SPL's point will have been made, and the visitor will realize that SPL's
content is only available because someone cached it. I doubt SPL is
going to want Google to dump their site from the cache when the library
site will be back up shortly. 

Andrew Mutch
Library Systems Technician
Waterford Township Public Library
Waterford, MI


On Tue, 20 Aug 2002, gary wrote:

> Nancy:
> Google captures a copy of each page* it finds during its crawl and makes the page 
> available via the Google Cache. If a web site owner doesn't want pages 
> cached (The Washington Post is one example), Google needs to be contacted or 
> the proper robots exclusion needs to be placed on the server (a robots.txt 
> rule or a NOARCHIVE robots meta tag). 
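> 
> For illustration, here is a minimal Python sketch (my own, not anything 
> Google publishes; the URL is just a placeholder) that checks whether a page 
> opts out of caching with a NOARCHIVE robots meta tag:
> 
>     import re
>     import urllib.request
> 
>     def allows_caching(url):
>         """Return False if the page carries a NOARCHIVE robots meta tag."""
>         with urllib.request.urlopen(url) as resp:
>             html = resp.read().decode("latin-1", errors="replace")
>         # Any <meta ...> tag mentioning "noarchive" asks engines not to
>         # offer a cached copy of this page.
>         meta_tags = re.findall(r"<meta[^>]*>", html, re.IGNORECASE)
>         return not any("noarchive" in tag.lower() for tag in meta_tags)
> 
>     print(allows_caching("http://www.spl.org/"))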
> 
> At the moment, Google has approx. 1300 pages from the www.spl.org domain in its 
> database. Google search: inurl:www.spl.org -inurl:uk 
> 
> I browsed through a few pages of results and all had cached versions available. 
> 
> So, will SPL ask them to purge the cache? It's a good question.
>  
> Another question: if a web searcher were to access the page with links 
> to remotely accessible subscription databases via SPL 
> (http://www.spl.org/selectedsites/subscriptions.html), will these links be 
> disconnected?
> 
> 
> Finally, other search engines are caching pages. The very new Gigablast also 
> caches content. http://www.gigablast.com
> Example of Cache:
> http://www.gigablast.com/cgi/0.cgi?n=10&ns=2&sd=0&q=%22seattle+public+library%22
> 
> 
> *Google crawls and caches the first 110k of a web page. If a page is longer, 
> it's truncated at the 110k mark. According to Greg Notess, Google truncates 
> most PDF files at "about 120k": http://www.searchengineshowdown.com/new.shtml#may18  
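> 
> As a rough check (a sketch that assumes the ~110k figure above; the URL is 
> only an example), you could compare a page's full size against that limit:
> 
>     import urllib.request
> 
>     CACHE_LIMIT = 110 * 1024  # approx. 110k, per the figure cited above
> 
>     def would_truncate(url):
>         """Return (truncated?, full size in bytes) for a page."""
>         with urllib.request.urlopen(url) as resp:
>             size = len(resp.read())
>         return size > CACHE_LIMIT, size
> 
>     truncated, size = would_truncate("http://www.spl.org/")
>     print(size, "bytes; cached copy would be cut:", truncated)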
> 
> cheers,
> gary
> 
> Quoting Nancy Sosna Bohm <plum at ulink.net>:
> 
> > Without knowing off-hand anything about the programming behind Google's
> > 'cached page' feature, I am now wondering whether the Seattle public 
> > libraries' pages would be viewable via Google and, if so, whether Google 
> > would in such an instance be considered comparable to a 'scab worker' 
> > during a strike.
> > 
> > ----- Original Message -----
> > From: "Karen G. Schneider" <kgs at bluehighways.com>
> > >...it is imperative
> > > that the Web site be "shuttered."  Anything less is misleading and
> > > unfair to both your staff and your community....
> > 
> > ----- Original Message -----
> > From: <jwang_94121 at yahoo.com>
> > > ...When Seattle public libraries go dark for a week next Monday, the
> > > closure will have far more chilling implications than a late-summer
> > > "furlough" might on the surface suggest. For a variety of...
> > >
> > > Full story:
> > > http://archives.seattletimes.nwsource.com/cgi-bin/texis/web/vortex/display?slug=paul19&date=20020819
> > > ...
> > 
> -- 
> 
> Gary D. Price, MLIS
> Librarian
> Gary Price Library Research and Internet Consulting
> gary at freepint.com
> 
> The Virtual Acquisition Shelf and News Desk 
> http://resourceshelf.freepint.com
> 



