SGML for Web Pages
Thomas W. Eland
tweland at mm.com
Mon Dec 18 19:36:43 EST 1995
Robb Scholten writes:
"I will wager that most of what is published today electronically will be
'lost' a hundred years from now. This very fact should strike some awe and
trepidation in the hearts of librarians who are committed to the task of
archiving information."
"The digital revolution requires some superhuman intervention on someone's
part to capture information for posterity. Open standards that are
easily upgraded will make our lives a hell of a lot easier in the decades
to come. Is HTML the best standard for display of text and graphics? I
don't know, but I wouldn't rush to transform my entire library into web
pages."
I think he is absolutely right. HTML is not the standard that we want to use to
mark up important textual information. It can be used for many things, like
indexes, home pages, etc. In fact, the index I created for Literacy and Adult
Education materials was created in two versions: one in HTML for today's web
browsers, and one based on our own DTD, which we made using an SGML
author/editor. The SGML version is much more sophisticated than the HTML one
and takes advantage of SoftQuad's new Panorama SGML browser. (For anyone
interested, you can find both versions of the
Internet Directory of Literacy and Adult Education Resources at
http://novel.nifl.gov/litdir/index.html. You can link to SoftQuad, download
a shareware copy of Panorama, and see the difference.)
Anyway, I think it is up to the library community to come up with or support a
standard, like we did with MARC, that we will use to create and archive
electronic information. I would encourage every librarian interested in this
issue to get their hands on the "Guidelines for Electronic Text Encoding and
Interchange" (TEI), which was produced by the Association for Computers and
the Humanities, the Association for Computational Linguistics, and the
Association for Literary and Linguistic Computing. TEI is an SGML DTD, just
like HTML.
However, the guidelines come in two volumes (about twice as long as AACR2) and
provide for complex mark-up. The real beauty of TEI for librarians is that it
allows the user to embed all the descriptive and subject elements that one would
use to create a MARC record in the text itself (actually in the text header).
There is no need to create a separate MARC record that links to an electronic
text. This provides for efficiency as well as portability. And since TEI is
defined in SGML, it conforms to the SGML standard. Personally, I think TEI
should be taught in library school cataloging courses. TEI may not be the
complete answer, but it goes a long way. It even contains a section discussing
header elements and their relationship to the MARC record for those who wish to
load TEI independent headers into MARC-based retrieval systems.
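
To make the header idea a little more concrete, here is a rough sketch in
Python of pulling descriptive and subject elements out of a TEI-style header
and pairing them with MARC tags. Treat all of it as an assumption on my part:
the sample header is invented and heavily simplified, it is written as
well-formed XML only so that Python's standard library can parse it (real TEI
SGML would need an SGML parser), and the HEADER_TO_MARC table is a hypothetical
illustration, not the mapping given in the Guidelines.

```python
import xml.etree.ElementTree as ET

# Invented, simplified sample header. Element names follow the TEI header
# (teiHeader, fileDesc, titleStmt, profileDesc, textClass, keywords), but the
# content is made up for illustration.
SAMPLE_HEADER = """
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>An Example Electronic Text</title>
      <author>Doe, Jane</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Example Library</publisher>
    </publicationStmt>
    <sourceDesc>
      <p>Created in electronic form.</p>
    </sourceDesc>
  </fileDesc>
  <profileDesc>
    <textClass>
      <keywords scheme="LCSH">
        <term>Literacy</term>
        <term>Adult education</term>
      </keywords>
    </textClass>
  </profileDesc>
</teiHeader>
"""

# Hypothetical mapping from header element paths to MARC tags
# (100 = author, 245 = title, 260 = publisher, 650 = subject heading).
HEADER_TO_MARC = [
    ("fileDesc/titleStmt/author", "100"),
    ("fileDesc/titleStmt/title", "245"),
    ("fileDesc/publicationStmt/publisher", "260"),
    ("profileDesc/textClass/keywords/term", "650"),
]


def header_to_marc_fields(header_xml):
    """Return a list of (marc_tag, text) pairs found in the header."""
    root = ET.fromstring(header_xml.strip())
    fields = []
    for path, marc_tag in HEADER_TO_MARC:
        # findall returns every matching element, so repeated subject
        # terms each become their own field.
        for element in root.findall(path):
            text = (element.text or "").strip()
            if text:
                fields.append((marc_tag, text))
    return fields


if __name__ == "__main__":
    for marc_tag, text in header_to_marc_fields(SAMPLE_HEADER):
        print(marc_tag, text)
```

The point is only that the cataloging data travels inside the text itself, so
a retrieval system can extract it from the header instead of relying on a
separately maintained record.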
Unfortunately, there is no address given in the books for where to write for
info. on TEI, and of course I don't have the order form because I sent it in to
get my copy. I'll look around on the net to see if I can't get an address. Or
if someone else knows it, maybe you could post it.
Tom Eland, Librarian
Minneapolis Community and Technical College
1501 Hennepin Avenue
Minneapolis, MN 55403
tweland at mm.com