Organizing Web Information

marc at ckm.ucsf.edu
Wed Jul 17 11:24:58 EDT 1996


BLIPS15 at BROWNVM.brown.edu wrote:
|Alta Vista and the others may need additional work on the retrieval algorithms
|and on the relevancy rankings, but I have yet to see any research that they
|do worse than LCSH when it is applied to a million+ record database.

I like Alta Vista because I can enter six or more search terms and still
get reasonable results when searching for fun fluff.  Try that with a Boolean
OPAC search (you can't at the SFPL because their OPAC doesn't do Booleans).

But with full-text searching you don't have the set theory guarantees that
you get with regular, controlled-vocab boolean searching.
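
To make the set-theory point concrete, here is a minimal sketch in Python.
It assumes a toy index in which each controlled-vocabulary heading maps to
the set of records a cataloger assigned it to.  The headings, record IDs,
and the names subject_index and records are invented for illustration and
do not describe any real OPAC.

```python
# Hypothetical toy index: controlled-vocabulary heading -> set of record IDs.
# Headings and IDs are made up for illustration only.
subject_index = {
    "Smoking": {1, 2, 5},
    "Tobacco Industry": {2, 3, 5},
    "Advertising": {3, 5, 8},
}


def records(term):
    """Return a copy of the record IDs indexed under a heading (empty if none)."""
    return set(subject_index.get(term, set()))


# Boolean operators map directly onto set operations.
both = records("Smoking") & records("Tobacco Industry")   # AND -> intersection
either = records("Smoking") | records("Advertising")      # OR  -> union
without = records("Smoking") - records("Advertising")     # NOT -> difference

# The guarantees: an AND result is a subset of each operand, and adding
# another AND term can only keep the result the same size or shrink it.
assert both <= records("Smoking")
assert both <= records("Tobacco Industry")
assert (both & records("Advertising")) <= both

print(sorted(both), sorted(either), sorted(without))
```

A relevance-ranked full-text engine makes no such promise: adding a term
reorders the results and may pull in different documents, so you cannot
reason about one result set as a strict subset of another.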
 
|Web pages are not books.  

Some web pages are not books.  Some books can be web pages.

We've just put The_Cigarette_Papers online.  It has an ISBN and a MARC
record, and it exists in both print and HTML/PDF.  We're preparing to put
the AIDS_Knowledge_Base online as well.  This is a 1500+ page, evolving,
edited collection of chapters and sections that covers the broad scope of
AIDS practice and theory.  We hope to index each section with MeSH terms for
reliable searching, enriched with links to our MEDLINE HTTP<->Z39.50 server
gateway and other content sources.

Just because standards efforts have been driven by profit and not by
interests that want to put knowledge online doesn't mean that we can't
take what's available and apply it to "real" content.

-marc


More information about the Web4lib mailing list