SGML for Web Pages

Nick Arnett narnett at Verity.COM
Mon Dec 18 15:36:59 EST 1995


At 11:50 AM 12/18/95, Keith Engwall wrote:
>Well, actually, format is the least complicated aspect of electronic
>storage.  It's relatively trivial to create an interpreter that will
>translate from one format to whatever's current.

It's not trivial to do so in a way that doesn't lose information.  In fact,
it's often impossible.  For searching, we license text extraction tools and
viewer technology, and we want them to translate a wide variety of document
formats.  It's a continuing challenge.  For example, as far as I know,
there is no commercial tool for extracting text from PostScript, much less
one that will translate it to another format.

Nick




More information about the Web4lib mailing list