[WEB4LIB] Appropriate Organizational Scheme for Diverse Web Collection

Fri Jul 21 14:39:50 EDT 2000

Hire a librarian.

I'm serious!  It's why we have master's degrees.  If we could explain in one email message how to set up an overarching knowledge schema for cataloging a large, diverse body of knowledge, and have the system be operable by people who are not information specialists, then we'd have beaten Yahoo!, Excite, AltaVista, et cetera a looong time ago.  It's why Yahoo! has scores of librarians on the payroll.

From having built a couple of such systems, my quick-and-dirty advice is:
    Include both category searching and keyword searching
    Think like your users.  Avoid the trap of thinking like a specialist.
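
To make the first point concrete, here is a minimal sketch of a catalog that offers both category browsing and keyword searching over the same items. Everything in it is an assumption for illustration only: the `Catalog` class, its method names, and the sample titles and categories are hypothetical, and a real system would need stemming, synonyms, and a proper database.

```python
from collections import defaultdict


class Catalog:
    """Toy catalog: each item has human-assigned categories and a searchable title."""

    def __init__(self):
        self.items = {}                      # item id -> title
        self.by_category = defaultdict(set)  # category -> item ids
        self.by_keyword = defaultdict(set)   # lowercase title word -> item ids

    def add(self, item_id, title, categories):
        self.items[item_id] = title
        for category in categories:
            self.by_category[category].add(item_id)
        # Index every word of the title so users can search in their own terms.
        for word in title.lower().split():
            self.by_keyword[word.strip(".,;:!?\"'()")].add(item_id)

    def browse(self, category):
        """Category searching: ids of all items filed under one category."""
        return sorted(self.by_category.get(category, set()))

    def search(self, query):
        """Keyword searching: ids of items whose titles contain every query word."""
        words = query.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self.by_keyword.get(w, set()) for w in words))
        return sorted(hits)


# Hypothetical sample data, not taken from any real collection.
catalog = Catalog()
catalog.add(1, "Campaign finance reform stalls in Senate", ["Politics"])
catalog.add(2, "Art forgery ring uncovered", ["Arts", "Fraud"])
catalog.add(3, "Research fraud at university lab", ["Science", "Fraud"])

fraud_items = catalog.browse("Fraud")
fraud_hits = catalog.search("fraud")
finance_hits = catalog.search("campaign finance")
```

Note that browsing the "Fraud" category finds the art forgery article while the keyword search for "fraud" does not, because the word never appears in its title. That gap is the reason to offer both.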

Regards,
---Julia E. Schult
Access/Electronic Services Librarian
Elmira College
Jschult at elmira.edu

Gerry Mckiernan wrote:

> >      Overview--- We need to organize content on the Fairness.com
> > website via an overarching knowledge schema that:
> >
> >   a)  is easy enough for non-librarians (web system administrators and
> > hopefully authors of articles) to use to catalog, in some reasonable way,
> > new articles and essays that get added to the database;
> >
> >   b)  would allow users to fairly easily locate information in a variety of
> > formats and media (newspaper articles, interviews, pictures, sound files,
> > etc.) that all address the same subject. Most of the items to be cataloged
> > will be (i) articles from newspapers and magazines, and (ii) essays from
> > writers, professionals and non-professionals "published" via my site.
> >
> > I have no information science or library science background, but have
> > worked in the book business (bookstores and book publishing) and thus am
> > aware of Library of Congress and Dewey systems, though I don't know the
> > details of how either works at all.
> >
> > Using the Library of Congress system as an example, the following
> > factors seem key to me:
> >
> >    1)  Breadth of the schema--- the schema must be very very broad to meet
> > our needs (i.e. fairness issues range in scope from art forgeries to
> > scientific research fraud). On this factor, the Library of Congress scheme,
> > which seems capable of addressing every topic known to mankind, seems broad
> > enough!  :->
> >
> >    2)  Flexibility of the schema to different content formats and media
> > (print and electronic). For example, could the Library of Congress' book ID
> > system for books be adapted for use with periodical articles, movies, and
> > other media as well as books?
> >
> >    3)  Ease of administration--- This one is key... we have no staff
> > librarians but anticipate having a huge amount of material in our
> > database. Again taking Library of Congress as an example, I have heard that
> > using the LC system requires (i) significant training and knowledge as a
> > prerequisite, and (ii) a lot of time per item categorized. Fairness.com
> > does *not* have a staff to do such categorizing, so assignments of ID#s
> > will have to be done either (1) in an automated manner, perhaps by
> > human-assigned keywords, or (2) by volunteers with no library training.
> >
> >     4)  Links between items--- it would be helpful to permit one item in
> > the schema to link to one or more other items (e.g. links between articles
> > comprising day-by-day coverage of a topic in a newspaper).
> >
> >     5)  Whatever system I implement must be able to scale up to
> > hundreds of thousands of entries.
> >
> >
> > Hopefully this communicates at least the essentials of what we're
> > looking for. It seems likely that we should be able to build on some
> > existing library science foundations and not have to "re-invent the wheel"
> > (and probably re-invent it in a clumsy way!).
> >
> > My questions are:
> >
> >    a)  what existing schemas would work (not necessarily perfectly or
> > easily!) for our purposes?
> >    b)  how would I evaluate and choose among candidate schemas,
> > assuming there are several that might work?
> >
> > --------------
> >
> > more recent than the overview above:
> >
> >
> > ......  others have mentioned the Dublin Core
> > project to me, which looks interesting based on a quick view of their
> > website (i.e. emphasis on simplicity, at least relative simplicity)... but
> > as I understand it DC doesn't address cataloging/indexing.
> >
> > What my project is most immediately concerned with seems to fall
> > under the heading of "periodical indexing", i.e. I want to be able to
> > read the morning newspaper and add articles on welfare reform, campaign
> > finance, and any brand new topics immediately into my database of articles
> > coded to be "in with" other articles on the same or similar topics.
> >
> > ------------
> >
> > Thanks in advance;  I'll appreciate any ideas whatsoever on this topic (or
> > related topics for which I may not have the background to realize they need
> > to be thought out!).
> >
> >
> > ===============================================
> >     Dan Doernberg, dan at fairness.com
> >     Fairness.com LLC
> >     The Online Guide to Fair Deals and Fair Treatment
> >     Phone: + 1 804/975-0780
> >     Fax:     + 1 804/975-0790
> >
> > ===============================================

--