SGML for Web Pages
Nick Arnett
narnett at Verity.COM
Mon Dec 18 15:36:59 EST 1995
At 11:50 AM 12/18/95, Keith Engwall wrote:
>Well, actually, format is the least complicated aspect of electronic
>storage. It's relatively trivial to create an interpreter that will
>translate from one format to whatever's current.
It's not trivial to do so in a way that doesn't lose information. In fact,
it's often impossible. For searching, we license text extraction tools and
viewer technology that we want to handle a wide variety of document
formats. It's a continuing challenge. For example, as far as I know,
there is no commercial tool for extracting text from PostScript, much less
one that will translate it to another format.
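
To make the information-loss point concrete, here is a minimal sketch in
Python. It is only an illustration, not how any real extraction tool works:
the to_plain_text helper and the sample strings are hypothetical, and the
tag stripping is deliberately naive.

```python
import re


def to_plain_text(marked_up: str) -> str:
    """Hypothetical, naive extractor: drop SGML/HTML-style tags, keep the text."""
    return re.sub(r"<[^>]+>", "", marked_up)


# Two different source documents: one marks a heading, the other emphasis.
heading = "<h1>Budget</h1>"
emphasis = "<em>Budget</em>"

# Both collapse to the same extracted text, so nothing working from the
# output alone can tell which structure the original had.
same_output = to_plain_text(heading) == to_plain_text(emphasis)
```

Once two distinct inputs map to the same output, the translation cannot be
reversed, which is the sense in which lossless conversion can be impossible
rather than merely hard.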
Nick