[WEB4LIB] RE: Decision tree for Web resources

Tito Sierra tito_sierra at ncsu.edu
Wed Mar 16 16:26:51 EST 2005


Roy, since you asked for ideas on this...

One strategy (which may or may not yield fruit in your case) is to 
create a local index of descriptions and/or keywords associated with 
your subject-specific databases, and run a search against this to 
construct your short list of subject-specific databases to include in 
your metasearch.  I'm thinking here of a tool such as SWISH-E that can 
create an inverted index of a pile of text or XML files, each file 
containing keyword fodder for a specific database.  As part of your 
search pre-processing algorithm you can run an "AND" search of the user 
query against this locally stored index and see if you generate a 
signal for any of the subject-specific databases.  If you do, grab the 
top three most relevant subject-specific database identifiers and 
include them in your metasearch pipeline.
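
To make that pre-processing step concrete, here is a minimal sketch in 
plain Python.  It is a stand-in for a real indexer such as SWISH-E, not 
its API.  The database identifiers, the keyword text, and the function 
names (build_index, advise) are all made up for illustration, and the 
ranking is a simple term-count sum rather than anything a real search 
engine would do.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(fodder):
    """Build an inverted index: term -> {database id: term count}.

    `fodder` maps a database id to its keyword text (hypothetical data).
    """
    index = defaultdict(dict)
    for db_id, text in fodder.items():
        for term in tokenize(text):
            index[term][db_id] = index[term].get(db_id, 0) + 1
    return index


def advise(index, query, limit=3):
    """Return up to `limit` database ids matching ALL query terms."""
    terms = tokenize(query)
    if not terms:
        return []
    # "AND" semantics: intersect the posting lists of every query term.
    candidates = None
    for term in terms:
        postings = set(index.get(term, {}))
        candidates = postings if candidates is None else candidates & postings
    if not candidates:
        return []  # no signal: fall back to the general databases only
    # Crude relevance score: total occurrences of the query terms.
    scores = {db: sum(index[t][db] for t in terms) for db in candidates}
    return sorted(scores, key=lambda db: (-scores[db], db))[:limit]


# Hypothetical keyword fodder, one entry per subject-specific database.
FODDER = {
    "db_agriculture": "agriculture crops soil farming plant science soil",
    "db_medicine": "medicine clinical health disease nursing",
    "db_engineering": "physics engineering computing electronics",
    "db_veterinary": "agriculture veterinary soil nutrition",
}
INDEX = build_index(FODDER)
```

With this toy data, advise(INDEX, "soil agriculture") puts 
db_agriculture ahead of db_veterinary, and a query whose terms never 
occur together in one database returns an empty list, which is the "no 
signal" case.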

Of course, this strategy presumes you have keyword-rich database 
descriptions to work from, which may not be the case.  There are ways of 
generating keyword fodder for stuff like this, perhaps by indexing the 
titles of the journals each database covers, or something similar.  You 
can update the index on an as-needed basis, or on a regular basis if 
your keyword fodder is evolving.  One benefit of this strategy is that 
it could be done with little performance impact, since the cost of 
running a keyword search on a locally stored inverted index will likely 
be unnoticeable to the user.
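
As a rough illustration of the journal-title idea, the sketch below 
turns a list of titles into one text file of keyword fodder per 
database, which an indexer could then pick up.  The stopword list, the 
function names, and the one-file-per-database layout are assumptions on 
my part, not anything prescribed by a particular tool.

```python
import re
from pathlib import Path

# Assumed stopword list: words too common in journal titles to be useful.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "journal"}


def fodder_from_titles(titles):
    """Reduce a list of journal titles to a string of keyword terms."""
    words = []
    for title in titles:
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return " ".join(words)


def write_fodder_files(titles_by_db, out_dir):
    """Write one <database id>.txt file per database; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for db_id, titles in sorted(titles_by_db.items()):
        path = out / f"{db_id}.txt"
        path.write_text(fodder_from_titles(titles) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Re-running write_fodder_files whenever the title lists change, and then 
re-indexing the output directory, would cover the update step.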

Tito Sierra
Digital Library Initiatives
North Carolina State University

On Mar 16, 2005, at 12:54 PM, Roy Tennant wrote:
>
> In other words, a search is entered, we search it in several large,
> general purpose databases, but simultaneously also use that query to
> try to determine which (probably top three) subject-focused databases
> would apply for that search, and display those to the user as well
> along with the search results from the general databases. We haven't
> yet figured out the best way of doing that database advisor function,
> which is why we are very interested in the work of Ross and others who
> are trying different methods. If anyone has any great ideas, I'm all
> ears! Thanks,
> Roy



