[Web4lib] google & library catalogs
Binkley, Peter
Peter.Binkley at ualberta.ca
Wed Apr 12 13:34:09 EDT 2006
I'd never noticed before, but Google lists 377,000 pages from our opac:
http://www.google.com/search?hl=en&hs=pgN&lr=&client=firefox-a&rls=org.mozilla:en-US:official&q=site:ualweb.library.ualberta.ca
The ones I sampled were all searches by call number or issn, suggesting
that Google followed external links from our new books or ejournals
pages, but not internal links from the opac records themselves. Their
spidering algorithm must be smart enough to avoid getting lost in the
thickets of a highly-interlinked site with relatively few in-links.
The spider probably did attempt to follow external links in the opac,
since it has indexed 358,000 links to our EZProxy server (though a lot
of these, maybe all, could have come from elsewhere on our site). We'll
be replacing our ejournals' 856's with OpenURLs soon; it will be
interesting to see if Google starts trying to spider our resolver.
Currently our resolver has no hits in Google, but I'd have to check
whether there's a robots.txt that is keeping Google out.
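As a rough illustration of that robots.txt check, here is a minimal Python sketch using the standard library's urllib.robotparser. The sample robots.txt text, the resolver URL, and the helper name crawler_allowed are all hypothetical placeholders, not our actual configuration; "Googlebot" is the user-agent token Google's crawler identifies itself with.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that shuts out every crawler.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical resolver address, for illustration only.
RESOLVER_URL = "http://resolver.example.edu/openurl?issn=1234-5678"

if __name__ == "__main__":
    print(crawler_allowed(SAMPLE_ROBOTS, "Googlebot", RESOLVER_URL))
```

To test a live server instead, RobotFileParser also offers set_url() and read() to fetch the file directly. A robots.txt that permits crawling would only show that Google is not being kept out, not that it has tried to spider the resolver.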
Interestingly, the "Similar pages" link never comes up with anything. I
would have thought the metadata would provide a distinctive enough
fingerprint to pull up catalogue records for the same book at other
libraries; but perhaps all the institution-specific language on the page
muddies the waters too much.
Peter
Peter Binkley
Digital Initiatives Technology Librarian
Information Technology Services
4-30 Cameron Library
University of Alberta Libraries
Edmonton, Alberta
Canada T6G 2J8
Phone: (780) 492-3743
Fax: (780) 492-9243
e-mail: peter.binkley at ualberta.ca
-----Original Message-----
From: web4lib-bounces at webjunction.org
[mailto:web4lib-bounces at webjunction.org] On Behalf Of Joleen Crockett
Sent: Wednesday, April 12, 2006 12:01 AM
To: web4lib at webjunction.org
Subject: RE: [Web4lib] google & library catalogs
Not Google, but apparently Yahoo. Entering the catalog address as a
search (minus http://) brings up specific records from III
catalogs--especially those with a Kids catalog. Titles don't seem to be
limited to Juvenile. Records often appear in search results on the
second or third page.
Joleen
Joleen Crockett
Adult Services Librarian
Tempe Public Library
Tempe, AZ
-----Original Message-----
From: web4lib-bounces at webjunction.org
[mailto:web4lib-bounces at webjunction.org]
On Behalf Of Sara Brownmiller
Sent: Monday, April 10, 2006 2:56 PM
To: web4lib at webjunction.org
Subject: [Web4lib] google & library catalogs
There is interest here in allowing google (google the search engine, not
google scholar) to spider, or crawl, our library catalog. Since many students
start their research in google, they might identify information easily
available to them. It would also help increase exposure to materials in
our digital collections and our special collections and manuscripts.
Has anyone allowed a search engine to crawl their catalog? What impact
did it have on the performance? Does your library have a policy about
search engines crawling your catalog? What factors influenced your
decision?
I would also be very interested in locating some records in google that
came from a library catalog to see how the user is linked to the catalog
or to see how the material is identified with a specific institution.
thanks, Sara
Sara Brownmiller University of Oregon Libraries
Director, Library Systems 1299 University of Oregon
Women's Studies Librarian Eugene, OR 97403-1299
541/346-2368 (voice)
snb at uoregon.edu 541/346-3485 (fax)
_______________________________________________
Web4lib mailing list
Web4lib at webjunction.org
http://lists.webjunction.org/web4lib/