[Web4lib] Re: Google Search Appliance and OPACs

Jorge Serrano Cobos jorgeserrano at gmail.com
Thu Feb 7 04:57:21 EST 2008


The good and not so good things about GSA (Google Search Appliance)

The good (some of them):

- It is possible to develop a connector that lets the GSA index a MARCXML
(XML MARC) catalog, and more:
http://code.google.com/apis/searchappliance/documentation/50/index.html

- A nice administration interface (good-looking and easy to use)

- Some quite useful new features in their GSA labs, such as:
   - search log analysis (it could be much better, but it is a starting point),
   - you can define your own Keymatches (Best Bets for most of us); even
your patrons can help define them (Best Bets 2.0),
   - the new Parametric search (Faceted Search for us). It is intended for
intranets with metadata, and what has more metadata than a library
catalog?
   - More; take a look here:
http://www.google.com/enterprise/labs/index.html
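
To make the faceted search idea a bit more concrete, here is a minimal sketch
in Python of what facets over catalog metadata amount to: counting the values
of a field and narrowing the result set by one of them. It says nothing about
how GSA's Parametric search is implemented; the records, the field names and
the helper functions are all invented for illustration.

```python
from collections import Counter

# Hypothetical catalog records; field names and values are assumptions,
# not taken from any real OPAC or from the GSA documentation.
RECORDS = [
    {"title": "Don Quijote", "language": "spa", "format": "book",
     "subjects": ["Fiction", "Spain"]},
    {"title": "La Regenta", "language": "spa", "format": "dvd",
     "subjects": ["Fiction"]},
    {"title": "Hamlet", "language": "eng", "format": "book",
     "subjects": ["Drama"]},
]


def field_values(record, field):
    """Return the values of a field as a list (fields may be single or repeated)."""
    value = record.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def facet_counts(records, field):
    """Count how many records carry each value of the given metadata field."""
    counts = Counter()
    for record in records:
        # set() so a value repeated inside one record is counted once
        counts.update(set(field_values(record, field)))
    return counts


def filter_by(records, field, wanted):
    """Keep only the records whose field contains the wanted value."""
    return [r for r in records if wanted in field_values(r, field)]


if __name__ == "__main__":
    print(facet_counts(RECORDS, "language"))
    print([r["title"] for r in filter_by(RECORDS, "format", "book")])
```

The richer and more consistent the metadata, the more useful the facets,
which is why a library catalog looks like a natural fit.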

The not so good things:

- As Cassey says, the heart of Google's algorithm is PageRank, which means
links, links, links. Transforming the relations between subject headings,
tags, and so on into links (XML) would be a step forward in that respect.

- For information retrieval purists, using Keymatches or Best Bets
implicitly admits that the algorithm is not perfect.

- But that is perfectly normal. No "algo" can do its job if the content has
no content, and MARC records lack text (from which to calculate a better
keyword density). As far as I know, tf-idf is at the core of Google when
there are no links to help PageRank. So information-rich bibliographic
records are needed. The more content, the more chances GSA has to rank well.

- GSA is a black box. We can only infer how it works by trial and error,
just as SEO people do. But Google is constantly innovating, so I guess we can
expect more and more good things, applied specifically to the different
kinds of information that lie within intranets or catalogs, because
intranets usually do not have links, and that is probably the central
issue. I guess ;-)
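
To illustrate the point about tf-idf and thin records, here is a rough sketch
in Python. It is not how GSA ranks (that remains a black box); it is just a
textbook-style tf-idf with a smoothed idf, and the sample records, the query
and the function names are made up for the example. The idea is only that a
record with nothing but author, title and date cannot match a topical query,
while the same record enriched with a summary can.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tf_idf_scores(query, documents):
    """Score each document against the query with raw tf times a smoothed idf.

    idf = ln((1 + N) / (1 + df)) + 1 is one common smoothing choice; it is an
    assumption made for this sketch, not anything documented for GSA.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            score += term_freq[term] * idf  # zero when the term is absent
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Hypothetical records: a bare one, an enriched one, and an unrelated one.
    sparse = "cervantes don quijote 1605"
    rich = ("cervantes don quijote 1605 novel about a knight errant "
            "and his squire sancho panza in la mancha")
    other = "shakespeare hamlet 1603 tragedy about a prince of denmark"

    for doc, score in zip((sparse, rich, other),
                          tf_idf_scores("knight squire", [sparse, rich, other])):
        print(round(score, 3), doc[:40])
```

With the query "knight squire" the bare record gets no score at all, because
the words simply are not there to be counted.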

My two cents,

-- 
Jorge Serrano-Cobos
http://www.masmedios.com


On 2/7/08, Tim Spalding <tim at librarything.com> wrote:
>
> Has anyone tried just making a HUGE page of links and putting it
> somewhere Google will find it? Almost all OPACs allow direct links to
> records, by ISBN or something else. On a *few* (I've seen it on
> HiP), spidering this way causes serious session issues. (LibraryThing
> made this mistake once.) But it might be a way to get data into
> Google.
>
> Tim
>
> On 2/6/08, Peter Murray <peter at ohiolink.edu> wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > On Feb 5, 2008, at 11:25 AM, Gem Stone-Logan wrote:
> > > Out of curiosity, has anyone experimented with using the Google Search
> > > appliance for retrieving information from an ILS database?  If so,
> > > what was
> > > your experience with it?  I'm thinking of an implementation where
> > > Google
> > > retrieves the results but then points the user to specific OPAC
> > > records.
> >
> >
> > There was some work done in the NELLCO consortium in advance of their
> > IMLS Leadership Grant.  I think Northeastern U was the lead
> > institution, but I don't have any further details.
> >
> >
> > Peter
> > - --
> > Peter Murray                            http://www.pandc.org/peter/work/
> > Assistant Director, New Service Development  tel:+1-614-728-3600;ext=338
> > OhioLINK: the Ohio Library and Information Network        Columbus, Ohio
> > The Disruptive Library Technology Jester                http://dltj.org/
> > Attrib-Noncomm-Share   http://creativecommons.org/licenses/by-nc-sa/2.5/
> >
> >
> > -----BEGIN PGP SIGNATURE-----
> > Version: GnuPG v1.4.5 (Darwin)
> >
> > iD8DBQFHqjoH4+t4qSfPIHIRAlG8AJwNBd1XTPK615QW5TBIjG7/ZGztjwCglHbr
> > m1Mf4Q62gsgurZDrJg9X5PQ=
> > =OIrz
> > -----END PGP SIGNATURE-----
> >
> > _______________________________________________
> > Web4lib mailing list
> > Web4lib at webjunction.org
> > http://lists.webjunction.org/web4lib/
> >
>
>
> --
> Check out my library at http://www.librarything.com/profile/timspalding
>

