[WEB4LIB:14750] Converting text to HTML
Paul F. Schaffner
pfs at umich.edu
Fri Aug 7 16:10:55 EDT 1998
On Fri, 7 Aug 1998, Walter W. Giesbrecht wrote:
> What we need to know is: how can we convert this ASCII file into
> HTML and make all the embedded URLs active links?
I'm not sure how much of the record you want to retain in HTML
form, or how you want to format it (anything's possible), but
if all you want to do is extract the URLs from the records and
convert them into active links for testing, any regexp-supporting
text editor would do it in seconds, or a simple Perl script like
this even faster (I'm only capable of the simplest ones
myself), regardless of platform:
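#!\apps\Perl\bin\Perl.exe
# Print each URL that begins a line as an HTML link.
while (<>) {
    # Capture from http: up to the first whitespace character.
    if (/^(http:\S+)/) {
        print "<p><a href=\"$1\">$1</a></p>\n";
    }
}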
You'd have to modify this depending on whether and where newlines
have been inserted in the record, or how much more of the record
you want to retain. This example assumes that you wish to retain
only the URLs themselves and that all URLs begin on new lines,
have no line breaks or spaces in them, and begin with http:.
If the records are in multiple files, just concatenate them as
you go, running the script like this:
perl extract-URLs.pl *.rec >> URL-list.html
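If you would rather keep the whole record and link the URLs wherever they
occur in a line, something along these lines might serve as a starting
point. It is only a sketch under my own assumptions: that a URL begins
with http: and runs up to the next whitespace, quote, or angle bracket,
and that wrapping the output in <pre> is an acceptable way to preserve
the record layout. The names escape_html and linkify are made up for
illustration, and trailing punctuation (a full stop right after a URL,
say) would end up inside the link.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Return one line of a record as HTML, with every http: URL made a link.
# Assumption: a URL runs up to the next whitespace, quote, or angle bracket.
sub linkify {
    my ($line) = @_;
    my $html = '';
    # split with a capturing group returns plain text at even positions
    # and the captured URLs at odd positions.
    my @pieces = split /(http:[^\s<>"]+)/, $line;
    for my $i (0 .. $#pieces) {
        my $safe = escape_html($pieces[$i]);
        $html .= $i % 2 ? qq{<a href="$safe">$safe</a>} : $safe;
    }
    return $html;
}

# Wrap the output in <pre> so the line layout of each record survives.
sub main {
    print "<pre>\n";
    while (my $line = <>) {
        chomp $line;
        print linkify($line), "\n";
    }
    print "</pre>\n";
}

main() unless caller;
```

Saved under a name of your choosing (linkify-records.pl here is just a
placeholder), it would be run the same way:

perl linkify-records.pl *.rec > records.html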
But perhaps I've misunderstood the problem.
--------------------------------------------------------------------
Paul Schaffner | pfs at umich.edu | http://www-personal.umich.edu/~pfs/
SGML Production Coordinator, Middle English Compendium ('the e-MED')
301 Hatcher Library North, Univ. of Mich., Ann Arbor MI 48109-1205