HTML->unformatted ascii text converters?
Ian Tilsed
I.J.Tilsed at exeter.ac.uk
Thu Dec 5 11:02:18 EST 1996
On Wed, 4 Dec 1996 15:44:53 -0800 Anthony Toyofuku wrote:
> Does anyone know of any software out there (including the source
> code - in C preferably) that takes HTML documents and outputs
> straight unformatted ASCII text (without any of the HTML tags). If
> it includes any of the rudimentary formatting from the document
> (like centering), that would be great.
In addition to what has been mentioned already, I sometimes use a
macro in MS Word that strips the HTML coding from text. I read about
it somewhere, although the exact reference escapes me. The macro text
is:
Sub MAIN
    EditFind .Find = "\<*\>", .Direction = 0, .MatchCase = 0, .WholeWord = 0, .PatternMatch = 1, .SoundsLike = 0, .Format = 0, .Wrap = 2
    StartOfDocument
    While EditFindFound()
        RepeatFind
        EditClear
    Wend
End Sub
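For anyone without Word to hand, the same idea (delete everything from a "<" up to the next ">") can be sketched in a few lines of Python. This is only a rough illustration and not part of the macro above: the name strip_tags is made up for the example, and the pattern makes no attempt to handle comments, scripts, character entities such as &amp;, or a ">" that appears inside an attribute value.

```python
import re

# Roughly the same idea as the Word wildcard pattern "\<*\>":
# a "<", then any run of characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(html_text):
    """Return html_text with every <...> span removed (illustrative helper)."""
    return TAG_PATTERN.sub("", html_text)


if __name__ == "__main__":
    sample = "<h1>Title</h1>\n<p>Some <b>bold</b> text.</p>"
    print(strip_tags(sample))
```

Like the macro, this simply deletes the tags and keeps whatever text lies between them, so none of the formatting (such as centering) is preserved.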
I repeat that I take no credit for the macro - I am just forwarding it
on. I hope that it is of some use.
Regards,
Ian Tilsed
--
---------------------------------------------------------------------
Ian Tilsed                                Tel: (01392) 263876
Computing Development Officer (Library)   Fax: (01392) 263871
University of Exeter UK                   E-mail (MIME OK): i.j.tilsed at exeter.ac.uk
http://www.ex.ac.uk/~ijtilsed/
---------------------------------------------------------------------