[WEB4LIB] Copernic Summarizer // FW: Metadata Conversion and
the Library OPAC
Tony Barry
me at Tony-Barry.emu.id.au
Tue Feb 27 07:20:02 EST 2001
At 5:48 PM -0800 26/2/01, ernest perez wrote:
>The following "abstract" (??) took about 5 seconds in Summarizer.
>This resulted from auto-analysis and summary of "Metadata Conversion
>and the Library OPAC," by Amanda Xu, at
><http://web.mit.edu/waynej/www/xu.htm>.
>
>This sample of the Copernic product output resulted from Ms. Xu's 11-page
>HTML document. It's not an easy text to summarize.
>
>You may be interested in comparing the summary to the original.
>Do YOU think it's a decent summary?
This kind of capability has been built into the Apple operating
system for about two years, since MacOS 8.5 was released with Sherlock
<http://www.asia.apple.com/sherlock/>, which indexes all files on a
system so that you can search not just by file name but by content.
The "abstract" it produces is:
>Not all resources with metadata attached will be discovered by
>search engines, because the types of metadata a search engine
>gathers depends largely on the types of metadata templates that are
>profiled. [3] For those "Internet accessible but non-HTML based
>resources," [4] metadata can be accessed via protocols such as
>Whois++, [5] LDAP (Lightweight Directory Access Protocol), [6] Z39.50,
>[7] or other proprietary search engines.... The most frequently
>mapped metadata formats are: IAFA (Internet Anonymous FTP Archive)
>templates, [12] Dublin Core metadata sets, [13] USMARC, GILS
>(Government Information Locator Services), [14] SGML TEI Header,
>EAD, [15] and Z39.50 tag set G. [16] Among them, USMARC, TEI
>Headers, EAD, GILS, and Dublin Core can represent the center of
>metadata mapping.
>
>By mapping the content, syntax, and data elements of various
>metadata models, correct metadata conversion between various
>syntaxes can be assured. [17] Sketchy records such as IAFA and
>Dublin Core records can be accurately upgraded during the migration
>so as to satisfy the needs of rich description records such as
>USMARC, TEI Header, and EAD....
>
>Specifically speaking, how can those "aggregated metadata objects"
>[18] such as USMARC bibliographic records, SGML metadata records,
>Dublin Core metadata records, GILS records, finding aids in EAD, and
>other future metadata records be organized in a consistent way so
>that they can be interchanged in a distributed networking
>environment?...
>
>Many libraries are using metadata already. [26, 27, 28, 29] They
>have created EAD records for archival description and finding aids,
>SGML TEI Headers for electronic text, Dublin Core for simplified
>description of networked resources, and GILS for Government
>Information Locator Services....
>
>Therefore, if library OPACs are used as gateways to access all the
>databases, including metadata repositories either on library Web
>sites or on local databases, a metadata conversion system built into
>the gateway is needed to ensure metadata interchange....
>
>In situation A, metadata can be extracted by automatically matching
>semantically similar elements and structures found in standard
>metadata format templates, namely the templates for Dublin Core
>Metadata Sets, EAD, TEI Header, GILS, and the USMARC format....
>
>* For locally created metadata on the repository (that is,
>specialized metadata mounted on local databases), the metadata
>conversion system will identify the content-bearing metadata
>elements, load them into a specified USMARC template, and convert
>and index them into existing databases; * When a library OPAC is
>used as a gateway to access remote metadata repositories, a metadata
>conversion system will verify if the resources contain
>meta-information, load the data elements into metadata templates,
>confirm the metadata format and encoding level, and then display
>metadata in user-specified formats....
>
>Another reason for using USMARC as a common template is that the
>Electronic Location and Access (856) field has been added to USMARC,
>making it possible to connect USMARC records to their source data
>directly via sophisticated OPACs.
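
As a rough illustration of the element matching the abstract describes,
here is a minimal Python sketch of a Dublin Core to USMARC crosswalk.
The mapping table is a partial, illustrative assumption (loosely modelled
on commonly published crosswalks, not taken from Ms. Xu's paper), and the
function names dc_to_marc and format_field are hypothetical.

```python
# Hypothetical, partial crosswalk: Dublin Core element -> (MARC tag, subfield).
# These pairings are assumptions for illustration; check a published
# crosswalk before relying on any of them.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "identifier": ("856", "u"),  # Electronic Location and Access
}


def dc_to_marc(dc_record):
    """Convert a dict of Dublin Core elements (each a list of values)
    into (tag, subfield, value) tuples, plus a list of unmapped elements."""
    fields = []
    unmapped = []
    for element, values in dc_record.items():
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            # Nothing in the table matches; report it rather than guess.
            unmapped.append(element)
            continue
        tag, subfield = target
        for value in values:
            fields.append((tag, subfield, value))
    fields.sort(key=lambda field: field[0])  # order by MARC tag
    return fields, unmapped


def format_field(field):
    """Render one field in a simple human-readable form."""
    tag, subfield, value = field
    return f"{tag} ${subfield} {value}"


record = {
    "title": ["Metadata Conversion and the Library OPAC"],
    "creator": ["Xu, Amanda"],
    "identifier": ["http://web.mit.edu/waynej/www/xu.htm"],
    "format": ["text/html"],  # deliberately left out of the table above
}

fields, unmapped = dc_to_marc(record)
for field in fields:
    print(format_field(field))
print("unmapped:", unmapped)
```

A real conversion system would also need indicators, repeatability
rules, and the "upgrade" of sketchy records that the abstract mentions;
none of that is attempted here.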
Serac Software have extended its capability to the web with
iRemember http://www.seracsoftware.com/iremember.html
iRemember watches the web traffic to your machine and indexes every
page you view. I now have a database of 9500 pages which I find
extremely useful - often more so than the big search engines.
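
For anyone curious how searching "by content" can work in principle,
here is a minimal Python sketch of an inverted index. It is only an
illustration of the general technique; how Sherlock or iRemember
actually build their indexes is not described here, and the function
names and sample texts are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up sample "pages"; a real tool would read files or cached web pages.
pages = {
    "xu.htm": "Metadata conversion and the library OPAC",
    "notes.txt": "Dublin Core metadata for networked resources",
    "todo.txt": "Renew library books",
}

index = build_index(pages)
print(sorted(search(index, "library metadata")))
```

The index is built once and each query then needs only a few set
lookups, which is why this approach stays usable as the number of
indexed pages grows.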
Tony
--
phone +61 2 6241 7659
mailto:me at Tony-Barry.emu.id.au
http://purl.oclc.org/NET/Tony.Barry