Looking for text to HTML Conversion Software

Jian Liu jiliu at script.lib.indiana.edu
Fri Sep 12 12:15:27 EDT 1997


Do those documents have some structural features that can be
identified? If so, you can ask someone to write some simple
Perl scripts to insert HTML tags into them.

One example I have here: our Lilly Library manuscript collection
descriptions consist of more than 2,000 small ASCII files, like
the following:

                         ADAMS MSS.

   The Adams mss., 1912-1981, are the letters, photographs,
and writings by and about Booth Tarkington, 1869-1946, author
of Indianapolis and Kennebunkport, Maine, collected by Reily
Gibson Adams, 1911-    , corporation official of Indianapolis
whose wife was a relative of Tarkington.
   Biographical materials include clippings and printed
matter about Tarkington, beginning with a copy of his
marriage certificate in 1912 and concluding with a notice of
the observance of his 112th birthday.  Among the undated
items are some of his bookplates.
   Collection size:  97 items
   For more information about this collection and any related
materials contact the Manuscripts Department, Lilly Library,
Indiana University, Bloomington, IN  47405 -- Telephone:
(812) 855-2452.

A small Perl script can insert the basic HTML tags, grab the title,
and insert <p> at the beginning of each paragraph, and the text is
ready for the web. I then use Isearch to index it and use htmltoc,
a small Perl program that I got from the net a long time ago, to
create the browsing section. See:
http://www.indiana.edu/~liblilly/Iforms/mss.html
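For anyone who wants to try the same idea, here is a minimal sketch of that kind of conversion. It is not the script used for the Lilly files: it is written in Python rather than Perl, and the function name text_to_html is made up for illustration. It assumes that the first non-blank line of a file is the title and that each paragraph begins with an indented line, as in the sample above. It also closes each paragraph with </p>, which the approach described above does not require.

```python
import html


def text_to_html(text: str) -> str:
    """Wrap one plain-text description in minimal HTML (illustrative sketch)."""
    lines = text.splitlines()

    # Assumption: the first non-blank line is the title, e.g. "ADAMS MSS."
    title = ""
    body_start = len(lines)
    for i, line in enumerate(lines):
        if line.strip():
            title = line.strip()
            body_start = i + 1
            break

    # Assumption: a line with leading whitespace starts a new paragraph;
    # unindented lines continue the current one. Blank lines are skipped.
    paragraphs = []
    for line in lines[body_start:]:
        if not line.strip():
            continue
        if line[:1].isspace() or not paragraphs:
            paragraphs.append([line.strip()])
        else:
            paragraphs[-1].append(line.strip())

    # Escape &, < and > so stray characters in the text cannot break the markup.
    safe_title = html.escape(title, quote=False)
    out = [
        "<html>",
        "<head><title>%s</title></head>" % safe_title,
        "<body>",
        "<h1>%s</h1>" % safe_title,
    ]
    for para in paragraphs:
        out.append("<p>%s</p>" % html.escape(" ".join(para), quote=False))
    out.extend(["</body>", "</html>"])
    return "\n".join(out) + "\n"
```

To convert a whole collection, one would loop over the files in a directory, pass each file's contents to this function, and write the result out under the same name with an .html extension. Files that do not follow the title-then-indented-paragraphs layout would need their own rules.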

Jian
Indiana University Libraries

> 
> Hello
> I am looking for a super product that
> would be able to take a text document and
> convert it into HTML. We are about to convert
> 7000+ documents from the mainframe to the Web
> and realize Front Page and Word 97 would be too slow.
> Can anyone recommend a product that would do this efficiently
> and with accuracy? Thanks for any recommendations!
> 
> Libby Whitcomb
> Corporate Services-Information Technology Team
> libby at ti.com
> 972-997-5277
> 
> 
> 



More information about the Web4lib mailing list