[Web4lib] Google Search Appliance and OPACs
Dale Askey
daskey at ksu.edu
Tue Apr 8 17:33:55 EDT 2008
While trolling web4lib archives today, I stumbled across the thread
that the message below generated. Fascinating reading, and well worth
taking the time to peruse.
I can offer a somewhat more targeted answer. Yes, we (K-State
Libraries) own a GB-1001 Google Search Appliance and have experimented
with indexing our catalog. We're not using HTTP for this, but rather a
direct database crawl using a custom SQL query. The indexing piece
works fine, albeit with the caveat pointed out by Casey, namely, that
it's easy to exceed the licensed document maximum. Our GSA has a
500,000 document/db line maximum, and our ILS has somewhere around
2,000,000 lines. Our solution was to have it index a subset based on
date, number of circs, etc. All fine and good, but getting the catalog
records to display well in the results is a challenge we never
resolved. It was probably doable, but the other limitation pointed out
in the thread, namely, that Google's PageRank relies heavily on links,
made it seem not worth doing. Still, I learned a bit about pulling
stuff out of the ILS via SQL and was happy to see it work.
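
For illustration only, here is a minimal sketch of the kind of subset
query described above. It is not our actual query: the table name
bib_records, the columns pub_year and circ_count, and the cutoff values
are all hypothetical, and an in-memory SQLite database stands in for
the ILS.

```python
import sqlite3

DOC_LIMIT = 500_000  # licensed document maximum mentioned in the message


def select_subset(conn, limit=DOC_LIMIT, min_year=2000, min_circs=1):
    """Return (bib_id, title) rows that are recent or have circulated.

    The schema and thresholds are assumptions made for this sketch.
    Rows are ordered so the most-circulated items survive the LIMIT.
    """
    sql = """
        SELECT bib_id, title
        FROM bib_records
        WHERE pub_year >= ? OR circ_count >= ?
        ORDER BY circ_count DESC, pub_year DESC
        LIMIT ?
    """
    return conn.execute(sql, (min_year, min_circs, limit)).fetchall()


def demo():
    """Build a tiny stand-in catalog and select at most two records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bib_records ("
        "bib_id INTEGER PRIMARY KEY, title TEXT, "
        "pub_year INTEGER, circ_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO bib_records VALUES (?, ?, ?, ?)",
        [
            (1, "Old and unused", 1950, 0),
            (2, "Old but popular", 1960, 40),
            (3, "New and unused", 2007, 0),
            (4, "New and popular", 2006, 12),
        ],
    )
    return select_subset(conn, limit=2)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Ordering before applying the limit is the design choice that matters:
whatever falls outside the licensed maximum is then the least-used
material rather than an arbitrary slice of the table.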
More fruitful would be, of course, to use what Google calls OneBox
technology, available on the GSA but not the Mini. It uses a trigger
word in the query and/or a hidden field in the form to send the query
not only to its own index, but also to an external data source. Search
for your home phone number in Google, and you'll see this in action.
Results in public Google that come back at the top of the list with
little icons next to them (as with the phone number example) are using
OneBox. I'd love to be able to use OneBox to toss keywords at our
catalog and present at least a few book results along with the site
search results. In principle, it's fairly straightforward: one must
write some middleware that rewrites the query into a form the target
database can tolerate, then takes the returned results and wraps them
in a GSA-friendly XML schema. Sounds pretty easy in theory, but we
just haven't made it a priority with our limited resources as yet.
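
As a rough sketch of that middleware idea, and not something we have
built, the two steps might look like the following. The trigger word
"books", the function names, and the sample record are hypothetical.
The XML element names (OneBoxResults, provider, MODULE_RESULT, U,
Title, Field) are assumptions modeled on the OneBox results format and
should be verified against the GSA documentation before use.

```python
import re
import xml.etree.ElementTree as ET


def normalize_query(raw, trigger="books"):
    """Drop a leading trigger word and punctuation the catalog may reject."""
    words = raw.strip().split()
    if words and words[0].lower() == trigger:
        words = words[1:]
    cleaned = re.sub(r"[^\w\s]", " ", " ".join(words))
    return " ".join(cleaned.split())


def to_onebox_xml(provider, records):
    """Wrap catalog records in an assumed OneBox-style results document."""
    root = ET.Element("OneBoxResults")
    ET.SubElement(root, "provider").text = provider
    for rec in records:
        module = ET.SubElement(root, "MODULE_RESULT")
        ET.SubElement(module, "U").text = rec["url"]
        ET.SubElement(module, "Title").text = rec["title"]
        field = ET.SubElement(module, "Field", name="author")
        field.text = rec["author"]
    return ET.tostring(root, encoding="unicode")


def handle_onebox_request(raw_query, search_fn, max_results=3):
    """Rewrite the query, run the catalog search, and return XML."""
    query = normalize_query(raw_query)
    records = search_fn(query)[:max_results]
    return to_onebox_xml("Library Catalog", records)


def demo_search(query):
    """Hypothetical stand-in for a real catalog search call."""
    return [
        {
            "url": "http://catalog.example.org/record/1",
            "title": "Moby Dick",
            "author": "Melville, Herman",
        }
    ]


if __name__ == "__main__":
    print(handle_onebox_request("books moby-dick", demo_search))
```

Passing the search function in as a parameter keeps the query cleanup
and XML wrapping independent of whichever catalog interface ends up
behind it.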
Best regards,
Dale Askey
On Tue, Feb 5, 2008 at 11:25 AM, Gem Stone-Logan
<gemstonelogan at gmail.com> wrote:
> Out of curiosity, has anyone experimented with using the Google Search
> appliance for retrieving information from an ILS database? If so, what was
> your experience with it? I'm thinking of an implementation where Google
> retrieves the results but then points the user to specific OPAC records.
>
>
> Thanks,
>
> Gem Stone-Logan
> Weld Library District
> http://www.mylibrary.us/
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
>
>
--
Dale Askey
Web Development Librarian
K-State Libraries
118 Hale Library
Manhattan, KS 66506
(785) 532-7672