New site devoted to Dublin Core and automatic metadata extraction

Ernesto Giralt egh_cu at
Wed Oct 20 20:47:47 EDT 2004

Hi all:

 Sorry for the cross-posting


This is the announcement of the publication of the first version (1.0 beta)
of DescribeThis, a service designed for the
automatic extraction of metadata from online resources. The site offers an
easy-to-use interface where you can indicate the resource to analyze and
download the results as XML, XHTML or RDF files.

In the current version (1.0 beta), the site's engine can find the
resources to process using keywords, full URLs or more complex queries with
operators, like "ISBN", used to collect the bibliographic data for published
documents. In the first case (keywords),
it works as a metasearch engine, using other search engines to locate
the best sites where the resource can be found. The results
contain all the recognized and generated Dublin Core elements for the
requested resource and can be downloaded as RDF, XML or XHTML collections.
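
To give an idea of what a Dublin Core/XML result can look like, here is a minimal Python sketch that serializes a set of Dublin Core elements. It is not DescribeThis's actual output format: the `metadata` wrapper element, the function name and the sample record are assumptions made for illustration. Only the Dublin Core element namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the 15 Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_to_xml(record):
    """Serialize {element name: [values]} as a simple Dublin Core/XML string.

    The <metadata> wrapper is a hypothetical container chosen for this
    sketch; the real service may use a different root element.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, values in record.items():
        # Dublin Core elements are repeatable, so emit one child per value.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative record only, not real output from the service.
sample_record = {
    "title": ["An example resource"],
    "creator": ["First Author", "Second Author"],
    "format": ["text/html"],
}
print(dublin_core_to_xml(sample_record))
```

Each element maps to a list because Dublin Core allows any element to be repeated, for example when a resource has several creators.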

DescribeThis's main fields of application:
- To support and extend the application and development of the Dublin Core
format as one of the most appropriate metadata standards to describe or catalog
resources, digital or not.
- To use the site as a tool to support the cataloguing of online resources,
aimed at information specialists and Internet users in general.
- To deliver services of automatic metadata management, designed for
managers of bibliographic and content databases.
- To create an efficient way for administrators and website authors to
dynamically provide metadata about their sites to page crawlers,
spiders, agents, worms and other automatic indexing and site classification
systems, with the aim of contributing to the improvement of the overall
organization of content.

On the front page you can find several samples that
illustrate the normal operation of the service.

About Dublin Core Services
DescribeThis is a gateway to the functions of analysis, automatic conversion
and filtering of digital resources and formats, included as part of a group
of web services and tools called Sand Dublin Core Services (DCS). DCS
provides support and software infrastructure to develop metadata management
applications and services.

In this version, DCS can automatically analyze and generate metadata
records for the following formats:
- HTML and XHTML Documents
- Dublin Core/RDF 
- Dublin Core/XML 
- Dublin Core/HTML (META tags) 
- GIF, JPG (EXIF) and other image formats
- RSS
- BibTeX
- Proprietary XML formats (e.g., Amazon XML Web Services)
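
As an illustration of the Dublin Core/HTML (META tags) case, the following Python sketch collects Dublin Core elements from the META tags of an HTML page. It is not the DCS implementation: the class and function names are hypothetical, and it assumes the common convention of naming the tags with a "DC." prefix (for example `DC.title`).

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").strip()
        content = attributes.get("content")
        # Assumption: Dublin Core tags use a case-insensitive "DC." prefix.
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()
            self.elements.setdefault(element, []).append(content)


def extract_dublin_core(html_text):
    """Return {element name: [values]} for the DC META tags in html_text."""
    parser = DublinCoreMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.elements


# Illustrative page only.
page = """
<html><head>
<meta name="DC.title" content="An example resource">
<meta name="DC.creator" content="First Author">
<meta name="keywords" content="not Dublin Core">
</head><body></body></html>
"""
print(extract_dublin_core(page))
```

META tags without the "DC." prefix, such as `keywords` above, are ignored by this sketch; a fuller extractor would probably map them to Dublin Core elements as well.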

Support for other well-known formats such as PDF, MARC, stream formats (MP3,
MPEG, etc.), OAI directories, FOAF networks and others will be added in the
near future. More complete information about the available features of
DescribeThis and DCS will be added on the Sand corporate website as soon as

