recent discussion of WWW cataloging?

Wed Oct 11 01:12:27 EDT 1995

On Tue, 10 Oct 1995 20:26:59 -0700 <tweland at mm.com> said:
>knowledge librarians bring to the cataloging effort.  For example,
>Yahoo..."
>I would agree that the majority of web indexes are of the automated Yahoo
>variety, and I would argue that all of them are poor indexing tools.  I have

  I'll disagree with this analysis.  They may be "poor" compared to LC
subject headings, but so what?  Using LC subject headings to index the web
would be practically impossible, no matter WHAT dreams OCLC has.  Yesterday
or the day before Yahoo added OVER 1500 NEW SITES.  LC doesn't do that
many new titles a day, ever.  Neither does any other library.  Only a
couple add that many VOLUMES a day.
  Also, most who work reference can confirm how poorly students understand
concepts such as subject headings....and that most of them instinctively
use natural language terms.  Sure, they have to be taught to use alternative
terms, to use wildcards, and so forth....but they learn those things easily
and quickly in most cases.  Despite the suggestion above that LC headings
may be "better", I'm not at all sure they are.

>used most, and not one would last a day in a library.  Patrons would be so
>frustrated with the number of false hits that they would stop using the index.

   Nonsense.  Many libraries have public access web browsers with Yahoo
and other services listed on pages as reference and search sites.  Students
use them quickly and easily, and are satisfied with what they find at LEAST
as often as they are with online catalog or CDROM databases.  Should they
be satisfied with what they find?  I'll leave that as an argument for the
theorists who want to count angels on heads of pins.

>On the other hand, OCLC is attempting to bring a more traditional and
>disciplined approach to indexing the Internet.  Their WebFirst product indexes
>and abstracts Internet resources.  The initial pages are gathered using an
>automated search mechanism.  Worthwhile sites are sent on to human beings who
>review the content and pass on worthy resources to an abstractor and indexer
>that assigns subject headings and an abstract.  A demo of the system can be
>viewed at OCLC's web server: http://www.oclc.org/

  I imagine everyone here is familiar with this...and it is an admirable
project...and may even turn into what OCLC wants and needs....a moneymaker.
NOTE: I have no objection to OCLC making money and surviving, but we need
to be careful to avoid ascribing lofty motives to those who want, naturally,
to see their business survive.  Yes, that includes Yahoo, Lycos, and many
others.

>Other librarians, like myself, are designing subject specific indexes of
>Internet resources.  See my Internet Directory of Literacy and Adult Education
>Resources located at the National Institute for Literacy:
>http://novel.nifl.gov/litdir/eland.htm.  It is currently being mounted and
>should be fully operational by Oct. 20, 1995.

  Sure.  Good.  I'll check it out and probably link to it.  But hundreds of
us are doing the same thing, each customized to some extent to what our local
users need and want...and that is great.  But those of us doing it aren't
any better than anyone else who isn't.

>It is my contention that either librarians get involved and index the Internet,
>or we will be stuck once again using inferior indexes designed by those who do
>not understand how people search for and use information.  (I also think we

   Note: I don't think most librarians KNOW how people use and search for
information.  Most of us are stuck with the crap we were taught in library
school 4 or 40 years ago.

   Also, many information producers have learned how to use the tools that
the information organizers/indexers use.  They started learning this with
KWIC indexes that '60s computers produced...they started using descriptive
titles that were reminiscent of 19th century titles and quit using cute
titles.  Web page producers have also learned this.  There are some commercial
sex sites that have repeated their "keywords" many times on the home page...
things like "sex, horny, hot, sex, babes, nudes, nude, sex, nude, naked,
sex, pictures, graphics, sex, hot, ....."  (many of them also use four
letter words we all know but that I'm not using here) [who ever said the
ol' cyclops had NO judgment or discretion?  o-)  ]   Those tactics are
guaranteed to make them come out near the top of a weighted list produced
by Lycos or other engines that index and count word occurrences.
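
   To make the word-counting point concrete, here is a minimal sketch in
Python of a ranker that scores a page purely by how many times the query
words occur in it.  The scoring rule is an assumption for illustration only,
not a description of how Lycos or any other real engine weights pages, and
the names tokenize, score, rank and the two sample pages are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page_text, query):
    """Naive weight (assumed rule): total occurrences of the query words."""
    counts = Counter(tokenize(page_text))
    return sum(counts[term] for term in tokenize(query))


def rank(pages, query):
    """Return page names ordered from highest to lowest naive score."""
    return sorted(pages, key=lambda name: score(pages[name], query), reverse=True)


# Hypothetical pages: one descriptive, one that just repeats its keyword.
pages = {
    "honest": "An annotated guide to adult literacy resources and literacy programs.",
    "stuffed": "literacy literacy literacy literacy literacy buy our product literacy",
}

print(rank(pages, "literacy"))
```

   Under this rule the page that merely repeats its keyword outranks the
descriptive one, which is exactly the incentive those home pages exploit.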

>should get involved and produce good indexes to print resources for the 80% of
>the world's knowledge that H.W. Wilson et al. doesn't bother to index, and
>therefore we don't see fit to collect.)

  Sure, this is a fine idea too.  But the problem that most academic
patrons have is finding TOO MUCH information for their needs.  This changes
as they get to graduate school, but Suzy Freshman doesn't NEED an ILL for
her paper on "abortion", and she doesn't need more than a half dozen
articles at best, no matter the source.

  One of the REAL problems we need to deal with is teaching people how
to deal with information overload, how to recognize an "objective" source
from a "biased" source, how to know what BS looks and smells like, and
so forth.  We also have to teach them that "just because it is on the web
or in print doesn't make it true".

  I'll stop for now...but since you got me started....

cyclops

  Dan Lester, Network Information Coordinator
  Albertsons Library, Boise State University, Boise, Idaho 83725 USA
  alileste at idbsu.idbsu.edu             http://cyclops.idbsu.edu/
  How can one fool make another wise?  Kansas, "No One Together," 1979