start working on your resumes...

Nick Arnett narnett at verity.com
Thu Jan 30 14:02:38 EST 1997


At 12:52 PM 1/30/97 +0000, Cliff Urr wrote:

>With regard to your product, can you speak of an actual situation(s) 
>where people are free of the boring, rote parts of categorizing and 
>can really focus on the creative parts?

It is, in fact, still early to talk about this, especially with regard to specific
products that we may someday announce.  Prototypes have demonstrated well
that subject-oriented classification of the style that an encyclopedia would
use is highly automatable.  We've also had computer-human bake-offs for news
"slugging" (tagging from a controlled vocabulary).  The computer's accuracy
is virtually equal to the humans' and the mistakes are consistent (unlike
the humans' mistakes).  Other areas work well, too. For example, if I want
to have a category of "introductory documents," the system can easily learn
the words that distinguish them, such as "introductory," "tutorial," "FAQ,"
"primer" and so forth.  (Note: our search engine looks at additional features
of a document, not just word frequency.  I'm simplifying.)
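
To make the "introductory documents" example concrete, here is a minimal sketch of how distinguishing words could be learned from a handful of labeled examples. It is only the simplified word-frequency picture described above, not Verity's method: the scoring (a smoothed ratio of document frequencies), the function names, and the sample documents are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def distinguishing_words(in_category, out_of_category, top_n=3):
    """Rank words by how much more often they occur in category documents.

    Hypothetical scoring: log of the ratio of smoothed document
    frequencies (add-one smoothing), with ties broken alphabetically.
    """
    # Count each word once per document (document frequency).
    in_df = Counter(w for doc in in_category for w in set(tokenize(doc)))
    out_df = Counter(w for doc in out_of_category for w in set(tokenize(doc)))
    n_in, n_out = len(in_category), len(out_of_category)

    scores = {}
    for word in in_df:
        p_in = (in_df[word] + 1) / (n_in + 2)
        p_out = (out_df[word] + 1) / (n_out + 2)
        scores[word] = math.log(p_in / p_out)

    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Made-up training examples, purely for illustration.
intro_docs = [
    "An introductory tutorial on indexing",
    "FAQ and primer for new users",
    "A primer: introductory notes and tutorial",
]
other_docs = [
    "Quarterly release notes for the indexing engine",
    "Detailed API reference for users",
    "Benchmark results and notes",
]

print(distinguishing_words(intro_docs, other_docs))
```

With these six toy documents the top-ranked words are "introductory", "primer" and "tutorial", because each appears in two of the three introductory documents and in none of the others. A real system would need far more examples and, as noted, more features than word counts.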

We're not claiming that software will be able to eliminate *all* of the
repetitive, uncreative categorization activities.  The examples above are rather
different from library categorization in the type of documents and
categories, including the granularity of the categories.

I'm not aware of any significant commercial products that claim to do this
kind of thing; I'm eager to hear of any.  We are aware of the research
behind the "Science" article; the magazine certainly overstated what was
learned (and the research understated what's possible with ordinary computers).

Nick

---------------------------------------
Verity Inc.
Connecting People with Information

Product Manager, Categorization and Visualization
408-542-2164; home office 408-369-1233; fax 408-541-1600
http://www.verity.com


