[Web4lib] full text versus metadata

Weibel,Stu weibel at oclc.org
Thu Nov 10 09:23:51 EST 2005


I've had a number of interesting replies to my post on metadata versus
full-text retrieval (some on-list, others off).  The most interesting
idea among them is that of complementarity.  Certainly we all agree that
Google-like searching is powerful and useful.  Our further hope and
prejudice is that augmenting it with metadata search will improve
retrieval in some use cases with some resource classes.

What are the domains of investigation?  A quick list from the top of my
head:

Nature of metadata
   - User-created
   - Library-created versus...?
   - Richness (MARC...DC...MODS...IEEE-LOM...ONIX...)

Nature of resources
   - Age
   - Type (books, articles, web resources, collections...)

Information use cases
   - Scholarly 
   - Commercial
   - End-user medical, legal
   - Governmental
   - User-types
   - ?

Anyone remember the TREC effort? http://trec.nist.gov/

Maybe it's time for ReMIX: Resource Metadata and IndeXing Experiment

- A standard experimental corpus, balanced (whatever that means) and
freely available
- Indexes and linking information available to all
- Metadata available to all
- Open-Data repositories for the experimental results

In other words, an open-access, community-based project where the gradual
accretion of knowledge on the subject would help us understand the
benefits of each mode, and of combined modes as well, so as to improve
retrieval performance over time.

stu, who would probably get more eyeballs to his blog if he quoted the
correct URL:
http://weibel-lines.typepad.COM


		
		
		

