a few more steps together

Gary Fouty g-fout at maroon.tc.umn.edu
Thu Jun 6 04:20:49 EDT 1996


 I have been reminded both in private e-mail and on the list that there are some
really exciting and innovative new approaches being developed for
information retrieval.  I agree, and look forward to seeing even more.  I do
read Wired as well as Online.  I was even glancing through a recent issue of
Communications of the ACM the other day, and saw something that illustrates my
original point.  There was a report of a pilot project to provide indexed full
text of selected chemical journals to chemists at one university.  The
response was mostly positive, but the chemists remarked that they missed the
CAS registry numbers.  For those of you who are not chemists or technical
librarians, these numbers are part of an ongoing effort at Chemical
Abstracts Service to identify, and assign a unique number to, every known
chemical (they are up to about 15 million now).  It is often thought that in
the 'hard sciences' like chemistry the vocabulary is precise and clear.  In
fact there is immense variability in the ways chemicals are named and
referred to in the literature.  By manually assigning these numbers to
articles cited in their database, the folks at CAS are using a special form
of controlled vocabulary, with strong authority control.  Your typical
chemist doesn't know beans about authority control, but they sure knew when
this valuable access mode was missing.  Admittedly this is a special case
where high recall is particularly valuable.  But I doubt it is unique,
except in the sense that the controlled vocabulary was so unusual (e.g.,
64-17-5) that it was easily missed.  Users in other situations would be less
likely to notice.
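
For anyone who wants to see the idea in concrete terms, here is a minimal
sketch of what a registry number buys you.  The little AUTHORITY table and the
function names are made up for illustration and have nothing to do with how
CAS actually stores or assigns anything.  The check-digit rule is the
published one: weight the digits 1, 2, 3, ... from right to left, excluding the
check digit itself, and take the sum modulo 10.

```python
import re

# Hypothetical, hand-built authority table: several name variants found in the
# literature all resolve to one registry number.
AUTHORITY = {
    "ethanol": "64-17-5",
    "ethyl alcohol": "64-17-5",
    "grain alcohol": "64-17-5",
    "water": "7732-18-5",
}


def registry_number_for(name):
    """Return the registry number for a name variant, or None if unknown."""
    return AUTHORITY.get(name.strip().lower())


def check_digit_ok(rn):
    """Validate the format and check digit of a CAS-style registry number."""
    m = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if not m:
        return False
    digits = m.group(1) + m.group(2)
    # Rightmost digit gets weight 1, the next gets weight 2, and so on.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))
```

A search on the number finds every article indexed under it, whichever of the
name variants the authors happened to use.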
        Whoa, the irrelevancy alarm just went off.  To bring this back to
web4lib, I see a lot of Web-based products (for catalogs and databases)
coming out, with very mixed quality.  Many have not overcome the essential
statelessness of the Web, and even those using Z39.50 often seem not to support
any authority structure.  I suppose we could go with the flow and accept the
lowest common denominator, but I guess I am too much of an old fuddy-duddy
to do so gracefully.  Which brings me to another question: what is the
measure of success in assessing tools in the new environment?  Good old
recall and precision have some value, but as we move more into relevancy
ranking and interactive querying these are less appropriate.  So what are
our standards?  How do we know when we have met them?  I don't know.  Anyone??
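
For reference, here is a minimal sketch of the "good old" measures, assuming
someone has already judged which documents are relevant.  The function names
and document IDs are invented for illustration.  The cutoff version at the end
is one common adaptation for ranked output, offered only as a starting point
and not as an answer to the question above.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k ranked results that are relevant."""
    if k <= 0:
        return 0.0
    relevant = set(relevant)
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


# Hypothetical judged query: four documents retrieved, three truly relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```

Both functions lean entirely on the relevance judgments, which is exactly the
part that gets harder to pin down once querying becomes interactive.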


Gary Fouty                       Science/Engineering Library       
108 Walter Library              Univ. Minnesota -- Twin Cities      
117 Pleasant St. S.E. #108             (612)-624-1851
Minneapolis MN 55455                   g-fout at tc.umn.edu


