New version of Dublin Core Services

Ernesto Giralt egh_cu at yahoo.es
Sun Nov 14 12:54:46 EST 2004


Hi all, and sorry for the crossposting.

A new version, 0.2 (beta), of Dublin Core Services/DescribeThis
(http://www.describethis.com) has been published. Its main new feature is
an automatic keyword generator: DCS now incorporates a dictionary of 5300
words in 11 different languages, including Catalan, Portuguese, Russian,
Arabic and Italian, among others, which allows it to recognize and generate
keywords automatically. The system applies analytic algorithms to find the
terms that best describe a given resource. The newly generated terms are
added to the ones already included in the document, although they are
marked visually to avoid confusion with the terms proposed by the authors
themselves. For HTML documents that do not include this type of metadata,
the list of generated keywords can serve as a guide and a valid proposal
for the publishers and authors of the content.
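
To illustrate the idea of dictionary-based keyword generation described
above, here is a minimal sketch in Python. It is not the DCS algorithm,
which this message does not describe. It assumes a simple frequency count
over dictionary words; the names DICTIONARY and generate_keywords, and the
sample dictionary entries, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary. The real DCS dictionary (about 5300 words
# in 11 languages) is not reproduced here.
DICTIONARY = {"metadata", "library", "catalog", "biblioteca", "metadatos"}


def generate_keywords(text, author_keywords, dictionary=DICTIONARY, limit=5):
    """Return (term, source) pairs: author terms first, then generated ones.

    Assumption: a term is a good candidate if it is in the dictionary and
    occurs often in the text. Generated terms are tagged "generated" so
    they can be displayed differently from the author's own keywords.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if w in dictionary)
    existing = {k.lower() for k in author_keywords}

    # Most frequent dictionary terms that the author has not already listed.
    generated = [t for t, _ in counts.most_common() if t not in existing]

    result = [(k, "author") for k in author_keywords]
    result += [(t, "generated") for t in generated[:limit]]
    return result
```

Keeping the source of each term next to the term itself is what makes it
possible to mark the generated keywords visually later on.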
 
The current version also delivers some corrected and improved features.
Among these, the following stand out:
- The list of metadata types and variants recognized in HTML documents now
includes more than 70 elements in 3 different languages.
- The service has incorporated a new parser to recognize and extract the
metadata for Creative Commons licenses (see http://creativecommons.org).
- The RDF converter and generator has been improved to produce a valid and
more complete document.
- DCS now applies Web Standards to all generated documents (XML, XHTML and
RDF), including the elements that indicate the document type (DOCTYPE) and
language markers.
- The HTML document parser can now recognize metadata placed in tags other
than the traditional META tag, specifically the LINK tag, as well as in
comments embedded in the body of the text.
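
As an illustration of the last point, the following Python sketch collects
metadata from META tags, LINK tags and comments using the standard
html.parser module. It is only a sketch under stated assumptions, not the
DCS parser: the class MetadataExtractor and the function extract_metadata
are hypothetical names, and treating any comment that contains "<rdf:RDF"
as embedded metadata is an assumption made for this example.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect metadata from META tags, LINK tags and HTML comments."""

    def __init__(self):
        super().__init__()
        self.meta = []      # (name, content) pairs from META tags
        self.links = []     # (rel, href) pairs from LINK tags
        self.comments = []  # comments that appear to embed RDF

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        # Self-closing tags such as <link ... /> also end up here.
        a = dict(attrs)
        if tag == "meta" and a.get("name") and a.get("content") is not None:
            self.meta.append((a["name"], a["content"]))
        elif tag == "link" and a.get("rel") and a.get("href"):
            self.links.append((a["rel"], a["href"]))

    def handle_comment(self, data):
        # Assumption: only comments containing an RDF block are of interest.
        if "<rdf:RDF" in data:
            self.comments.append(data.strip())


def extract_metadata(html):
    """Parse an HTML string and return the populated extractor."""
    parser = MetadataExtractor()
    parser.feed(html)
    parser.close()
    return parser
```

A caller would pass the page source to extract_metadata and then read the
meta, links and comments lists from the returned object.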
 
In addition, given the successful use of the service among blogs, our
development team has paid special attention to the metadata and particular
characteristics of this "type" of online content. With these changes and
improvements, existing and future references to DescribeThis will return
more extensive and detailed metadata extraction results.
 
The next version of DescribeThis will include:
- An editor for Dublin Core records and collections.
- An improved, multilingual interface for DescribeThis; the first versions
in Spanish and Catalan are almost ready.
- Selected dictionaries and thesauri, applied to improve the automatically
generated/extracted results.
- User subscription and registration features, so that results can better
match each user's needs and personal profile.
 
We wish to thank the specialists and users in general who have sent us
valuable messages with recommendations, criticism and advice, and
especially Daniel O'Connor, Paula A Markes and Eva Méndez, to whom we owe
many of the changes and improvements.
 
Again, thanks to all.  We will continue working to improve our services and
products.  
 
---------------------------------------------- 
Dublin Core Services is a set of web services that offers tools for the
description and automated analysis of online resources. Through the
interface provided by DescribeThis (http://www.describethis.com), it allows
the management and individual processing of the metadata collections that
have been extracted or generated from the resources. The site offers an
easy-to-use interface for indicating the resource to analyze, and simple
options for downloading the results as XML, XHTML or RDF files.

Send your messages to support at describethis.com.

--

Ernesto Giralt
Dublin Core Services Development Team
 






More information about the Web4lib mailing list