Classification Tools

Ken McCracken aa175 at torfree.net
Sun Mar 14 17:16:54 EST 1999


On Sat, 6 Mar 1999 web4lib at webjunction.org wrote:

> Topic No. 1
> 
> Date: Fri, 5 Mar 1999 18:07:02 -0500
> From: "Kat Hagedorn" <kat at argus-inc.com>
> Subject: Classification Tools 
> To: <web4lib at webjunction.org>
> Message-ID: <017001be675c$e18fc360$0400a8c0 at kat.argus-inc.com>
> 
> I'm interested in getting some recommendations on automated classification
> tools.
> 
> I am working on a project to build a browsing interface for Early English
> documents. We're trying to find a good approach for creating and populating a
> topic tree to provide subject-based browsing. Our plan is to explore
> semi-automated solutions for indexing the documents. I'm interested in finding
> out about software tools that will automatically classify the documents, or
> populate a manually created topic tree. We are aware of some products already,
> such as Verity's Content Classification Engine.
> 

Not really about automation...

Until a couple of months ago the free-directory project was actively
coding a follow-up to the volunteer-edited gnuhoo project, which had
been commercially appropriated as Newhoo. Netscape, on acquiring the
project, tried to make things right by freely (BSD-style) licensing the
data collected by volunteer editors through their open directory project.
Anyone may include the data in any other interface, although the hope, it
seems, is for dmoz to be the main repository.
http://dmoz.org/license.html

So far dmoz.org are not releasing the code operating behind the scenes.

The free-directory people were happy enough with this arrangement to give
up on their design and coding efforts. One of the last posts to the
free-directory list was from Netscape dmoz's Rich Skrenta.

http://www.mmedia.is/free-directory/archive/541/542/#544

There were some interesting design discussions in that same archive and
there are probably some code fragments available that others could start
out with if they wanted to carry the torch further.

Ken McCracken
Toronto Freenet Info Resources



BTW thank you for the XML book recommendations.

One other recommendation that didn't make it to the list was Neil
Bradley's The XML Companion.


> We'd appreciate any advice or recommendations. Please reply to me and not the
> list. I will summarize responses for the entire list. Thanks very much!
> 
> [Apologies for cross-posting.]
> ________________________________________
> Kat Hagedorn :: Information Architect
> Argus Associates :: http://argus-inc.com
> 734-913-0010 :: kat at argus-inc.com




