[Web4lib] Faceted navigation as metasearch

Peter Noerr pnoerr at MuseGlobal.com
Fri Jan 5 16:25:52 EST 2007


> 
> As far as putting one of these interfaces on top of a federated
> search, I understand the retrieval time problem remains an issue
> when more than a handful of remote data sources are involved
> (network latency alone, plus response time of each remote server).
> 
> But it sounded to me like the vendors are aware of these issues,
> and the search vendors are talking to the database vendors.
> I envision a scenario where we can provide the illusion of 
> federated search across multiple bibliographically-fielded 
> databases, while actually only having to query a single 
> vendor-hosted service, and presenting a faceted-style result.  
>
Actually, the end architecture is more likely to be the federated search
system on top (on the user interface side) of the databases. The fed
search systems already have the capability to combine results, which the
database systems do not. The interesting advance here is, first,
combining the different "faceted organizations" of the data from
different databases (Genny's point below), and then presenting those
refining/navigation/searching aids to the user in a way that is not
confusing but doesn't lose their power. Working on it...
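
To make the "combining" step concrete, here is a minimal sketch, not a
description of any actual product. It assumes each database returns its
own facet names with value counts, and that someone has hand-built a
mapping from those names to a shared scheme. The source names, facet
names, FACET_MAP and merge_facets are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from each source's own facet names to a shared scheme.
FACET_MAP = {
    "opac": {"subject_heading": "subject", "pub_year": "year"},
    "journal_db": {"descriptor": "subject", "year": "year"},
}


def merge_facets(per_source_facets):
    """Combine per-source facet counts into one set of common facets.

    per_source_facets: {source: {source_facet: {value: count}}}
    Facets with no entry in FACET_MAP are skipped rather than guessed at.
    """
    merged = {}
    for source, facets in per_source_facets.items():
        mapping = FACET_MAP.get(source, {})
        for facet_name, value_counts in facets.items():
            common = mapping.get(facet_name)
            if common is None:
                continue  # no known equivalent in the shared scheme
            merged.setdefault(common, Counter()).update(value_counts)
    return merged


# Made-up counts, purely for illustration.
example = {
    "opac": {
        "subject_heading": {"Metadata": 12, "Cataloging": 5},
        "pub_year": {"2006": 9},
    },
    "journal_db": {
        "descriptor": {"Metadata": 30},
        "year": {"2006": 14, "2005": 7},
        "peer_reviewed": {"yes": 21},
    },
}
merged = merge_facets(example)
```

Dropping the unmapped facet ("peer_reviewed" here) is only one possible
choice; showing it labelled with its source is another, and that is
where the "non-confusing" presentation question comes in.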



 
> As Peter from MuseGlobal noted,
> though, once you start bringing in all the unfielded data 
> from places like a general Google search, "faceted" means a 
> whole other thing, more like running some kind of semantic 
> content extraction against a full-text corpus.  It would 
> result in a different set of terms than those in the 
> controlled-vocabulary fields of a journal database or OPAC.  
> I don't know if you could then combine these results and 
> "facets" together in a display that would not mislead the 
> user to some degree.
> 
> I'm sure somebody's workin' on it though ...
 
The mixing (and relating) of controlled and uncontrolled vocabularies
has been a meaty dish from time immemorial (sort of). Adding the
federated aspect to the problem just expands the universe of discord (or
discourse) with no helpful side effects. And static and
dynamic categorization (the term I prefer for this activity - leaving
'faceted analysis' and 'clustering' to their more traditional meanings)
do give very different results - of use to very different populations.
If you are a newbie to the topic, then dynamic analysis shows the terms
in use, and you can produce new searches which are much more productive.
For a subject expert, the precision of the static vocabulary means
easier refinement of results to what they "meant". Of course the writers of
the documents and the search engines have muddied this clarity a bit,
but nothing's perfect.
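
As a rough illustration of the static/dynamic difference (a sketch only,
with simple word counting standing in for real content analysis): the
record layout, the "subjects" and "title" field names, the stopword list
and both function names are assumptions made up for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "and", "a", "in", "for", "to", "on"}


def static_categories(records):
    """Count values of the controlled 'subjects' field (assumed field name)."""
    counts = Counter()
    for rec in records:
        counts.update(rec.get("subjects", []))
    return counts


def dynamic_categories(records, top_n=5):
    """Count the words actually used in the free-text 'title' field."""
    counts = Counter()
    for rec in records:
        words = re.findall(r"[a-z]+", rec.get("title", "").lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up records; the third has no controlled terms at all.
records = [
    {"title": "Faceted browsing of the library catalog",
     "subjects": ["Online catalogs"]},
    {"title": "Metasearch and faceted navigation",
     "subjects": ["Federated searching"]},
    {"title": "Federated search in libraries"},
]

static = static_categories(records)
dynamic = dynamic_categories(records)
```

The static counts only ever contain terms from the controlled list,
while the dynamic counts surface whatever words the authors used, which
is why the two suit different users.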

Peter
Dr Peter Noerr
CTO, MuseGlobal, Inc.
www.museglobal.com
+1 801 208 1880 

