[WEB4LIB:14750] Converting text to HTML
Garbe Grace
garbe_grace at dsmc.dsm.mil
Fri Aug 7 15:06:00 EDT 1998
Walter:
I don't know which report you are using, but I use the Entrylist
report and choose the catalog entry IDs of 245 and 856 (fields) of
each bib record. We have about 800 of these. When the report prints,
only those two fields appear, instead of the lengthier data in your
report.
After the report is done, I download it to my PC and open it in
Word (I have Word97, not that it matters). Delete the log pages.
Then use the Edit/Replace feature to find all of the |uhttp strings
and replace those with <A HREF="http. Your 856 field seems to have an
extra subfield. I replace the |2http with ">URL</A>. Then I save the
file as text only. You want a .txt extension, not .doc.
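For anyone who would rather script these two replacements than do them in Word, here is a minimal sketch in Python. It is only an illustration of the same find-and-replace idea, not part of the Sirsi report itself. The function name mark_up_urls is made up for this example, and it assumes the |u and |2http subfields sit on the same line in the downloaded report.

```python
def mark_up_urls(report_text: str) -> str:
    """Apply the two find-and-replace steps described above."""
    # Step 1: open an anchor tag wherever an 856 |u subfield starts a URL.
    text = report_text.replace('|uhttp', '<A HREF="http')
    # Step 2: close the tag where the |2http subfield follows the URL.
    text = text.replace('|2http', '">URL</A>')
    return text


# Sample 856 data taken from the record quoted in the original question.
sample = "|uhttp://www.idealibrary.com/cgi-bin/links/toc/ab|2http"
print(mark_up_urls(sample))
# prints: <A HREF="http://www.idealibrary.com/cgi-bin/links/toc/ab">URL</A>
```

If your report wraps the 856 field across lines, the wrapped lines would need to be joined before running the replacements.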
Do not allow Word to convert this to HTML. It does not convert
successfully in my experience. Next I use WordPad (Win95 Write) and
add the necessary HTML tags to create an HTML document:
<HTML>
<HEAD><TITLE>Catalog Records with 856 Field</TITLE></HEAD>
<BODY>
your file of urls
</BODY>
</HTML>
Save as text only and type in .htm as the extension. Do not allow Word to add
its formatting to this. You have to be very careful about selecting "save as
text only." I haven't tried this with WordPerfect, so I don't know what steps to
take there. It is possible to do this with Word 6 also, since that is really
what WordPad in Win95 is. I much prefer doing it in Word 6.0, but it was
removed from my computer when we got Office 97.
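The wrapping and saving step can also be scripted, which sidesteps the word processor's formatting altogether. The sketch below is an assumption-laden example rather than a required procedure: the names wrap_as_html and save_page are invented for illustration, and the page title is placed in a TITLE element inside HEAD, which is where HTML expects it.

```python
from pathlib import Path


def wrap_as_html(body: str, title: str = "Catalog Records with 856 Field") -> str:
    """Put the marked-up URL list inside a bare HTML skeleton."""
    return (
        "<HTML>\n"
        f"<HEAD><TITLE>{title}</TITLE></HEAD>\n"
        "<BODY>\n"
        f"{body}\n"
        "</BODY>\n"
        "</HTML>\n"
    )


def save_page(body: str, out_path: str) -> None:
    """Write the page as plain text; out_path should end in .htm."""
    # Plain ASCII output, the equivalent of "save as text only".
    Path(out_path).write_text(wrap_as_html(body), encoding="ascii", errors="replace")


page = wrap_as_html('<A HREF="http://www.idealibrary.com/cgi-bin/links/toc/ab">URL</A>')
```

Calling save_page(body, "urls.htm") with your own marked-up text would then give you a file ready for the link checker; the file name is just a placeholder.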
Now you're all set to run this through a link checker. I use LinkBot which
checks my 800 links in about 4 minutes. I then check out the bad ones myself.
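If you do not have a link checker handy, a rough stand-in can be written with the Python standard library. This is not how LinkBot works; it is only a sketch of the general idea (collect every HREF, request each one, list the failures). The names extract_links, is_reachable, and find_bad_links are hypothetical, and some servers refuse HEAD requests, so a "bad" result still deserves a manual look.

```python
from html.parser import HTMLParser
from http.client import HTTPException
from urllib.request import Request, urlopen


class HrefCollector(HTMLParser):
    """Collect the HREF value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html_text):
    collector = HrefCollector()
    collector.feed(html_text)
    return collector.hrefs


def is_reachable(url, timeout=10):
    """Return True if a HEAD request succeeds; any error counts as bad."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, HTTPException):
        return False


def find_bad_links(html_text, check=is_reachable):
    """Return the links for which the check function reports failure."""
    return [url for url in extract_links(html_text) if not check(url)]
```

You would read the saved .htm file into a string and pass it to find_bad_links; the check argument is there so a different test (or a slower, gentler one) can be swapped in.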
If you need further clarification or help setting up the report, please email
me. One of your fellow Canadians, Charley Pennell, has developed some great
written documentation for Sirsi cataloging procedures. You can find the
Cataloguer's Toolbox with the URL info at
<http://www.mun.ca/library/cat/URLprocedures.htm>.
Grace Garbe
Systems Librarian
ZAI-AMELEX
Defense Systems Management College
Ft. Belvoir, VA
garbe_grace at dsmc.dsm.mil
______________________________ Reply Separator _________________________________
Subject: [WEB4LIB:14750] Converting text to HTML
Author: walterg at yorku.ca (Walter W. Giesbrecht) at INTERNET
Date: 8/7/98 9:52 AM
Our library has been adding URLs to the catalogue records of
electronic journals for some time now. The staff in Bibliographic
Services need a relatively simple way to check the links in these
records on a periodic basis. The catalogue software (Sirsi's
UNICORN) allows them to generate a report (in ASCII) of the
catalogue records that include URLs (a single record in this
report looks something like this):
via web browser by entering the following URL:
http://www.idealibrary.com/cgi-bin/links/toc/ab
ISSN: 0 : |a0003-2697
Subject: 0 : |aBiochemistry|xPeriodicals.|?UNAUTHORIZED
Electronic access: 7 :
|uhttp://www.idealibrary.com/cgi-bin/links/toc/ab|2
http|zhttp://www.idealibrary.com/cgi-bin/links/toc/ab
YORK--
Location: YORKSTEACIE --
Textual holdings: v.10, 1965 -
What we need to know is: how can we convert this ASCII file into
HTML and make all the embedded URLs active links? Once we get
this, we can use link-checking software to test them all. Word 97
will make a URL into a link by putting a hard return after the
URL, but doing this is impractical when the most recent report
was just over 800K in size! Any ideas?
--
Walter W. Giesbrecht walterg at yorku.ca
York University Libraries (416)736-2100 ext. 77551
Toronto, Ontario, CANADA 113 SSB