[WEB4LIB] Re: non-SGML characters
Thomas Dowling
tdowling at ohiolink.edu
Thu Jan 31 15:57:26 EST 2002
>I think you can solve the academic validation problem with
>"charset=Windows-1252". However, that won't solve the fact that there are
>browsers that will either not display any character there, or display some
>other character. I don't know of any browsers that actually change their
>character handling based on the charset in the content type. There may be
>*indexers* - but they'd want an actual HTTP header, not a meta tag (if I'm
>wrong, someone stop me before I make a fool of myself).
Well, that ship has sailed. Obviously, browsers respond to charsets in
order to display pages in non-Roman scripts. Also, while changing the
charset might make the actual character #149 valid, the numeric character
entity "&#149;" still represents Unicode and is still invalid (you see,
there's SGML's and XML's "document character set" which isn't necessarily
your *document's* character set...suddenly my brain hurts).
So stick with UTF-8 and valid entities &#8226; or &bull;.
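
For anyone who wants to check the distinction, here is a minimal sketch in Python (my choice of language; the variable names are only illustrative). It assumes Python's "cp1252" codec as a stand-in for Windows-1252, and shows that byte 149 in that encoding is the bullet, while Unicode code point 149 is a control character:

```python
import html
import unicodedata

# Byte 149 (0x95) decoded as Windows-1252 is the bullet, U+2022.
bullet = bytes([149]).decode("cp1252")

# A numeric character reference names a Unicode code point, not a byte
# in the page's encoding. Code point 149 is a C1 control character.
code_point_149 = chr(149)
category_149 = unicodedata.category(code_point_149)  # "Cc" means control

# The two valid references both resolve to U+2022.
from_decimal = html.unescape("&#8226;")
from_named = html.unescape("&bull;")

print(hex(ord(bullet)), category_149, from_decimal == from_named == bullet)
```

The sketch goes through chr() for code point 149 rather than unescaping the invalid reference, since lenient parsers may quietly repair that reference instead of rejecting it.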
Thomas Dowling
OhioLINK - Ohio Library and Information Network
tdowling at ohiolink.edu