[Web4lib] RSS and diacritics
Bob Duncan
duncanr at lafayette.edu
Tue Nov 27 17:58:47 EST 2007
At 03:56 PM 11/27/2007, Jonathan Gorman wrote:
>Apologies. In rereading I realized I misinterpreted what you were
>saying. I thought you had two distinct problems: using HTML
>character entities, and issues with diacritics.
Phew! I thought I was going to have to attempt a reply to your first
response. ;o)
>The answer as far as the entities? RSS can be a mess ;). RSS feeds
>are XML. Sadly, it has become a widespread practice to use
>"escaped html" in fields of the RSS feeds. There's no way to ensure
>that these escaping nightmares will be parsed correctly.
>
>HTML defines some character entities, but RSS doesn't have all of
>them. You can attempt to add these characters to the RSS feed via
>including them in a Doctype declaration at the beginning of the
>feed. This wikipedia page looks like it has some examples of that:
>http://en.wikipedia.org/wiki/XML.
>
>The best solution? Not really sure. I'd lean towards not using
>"escaped html" in my RSS feed. Instead use just rss and the
>character references, which should display cleanly assuming that the
>rss feeder isn't junk.
>
>(And by character reference, I mean use &#x..; where .. is the
>appropriate code point).
Thanks. I think that will do it. I was using name-based references
(Egrave, etc.) and escaping the ampersand, which worked in most feed
readers but not in everything capable of displaying a feed. The
numeric character references work fine in all apps tested so far.
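For anyone making the same switch, here is a minimal sketch of one way to convert name-based references to numeric ones before the text goes into a feed. It assumes Python and its standard html.entities table; the function name named_to_numeric and the sample string are made up for illustration, and this is not necessarily how any particular feed generator does it.

```python
import re
from html.entities import name2codepoint

# The five entities XML itself predefines; these are safe to leave alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_numeric(text):
    """Replace HTML named entities (e.g. &Egrave;) with hex numeric references.

    Hypothetical helper for illustration; names not in Python's HTML 4
    entity table are left untouched rather than guessed at.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

# Made-up sample text containing two named entities and one XML-native one.
print(named_to_numeric("Caf&eacute; &amp; cr&Egrave;me"))
# prints: Caf&#xE9; &amp; cr&#xC8;me
```

The XML-native &amp; is deliberately left as-is, since any XML parser already understands it.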
One other question: which numeric reference is preferable? For
example, both &#xC9; and &#201; (xC9 and 201) produce a Latin capital
E acute. Are there good reasons to use one over the other? (And is
either more likely than the other to be correctly rendered by
browsers in non-RSS situations?)
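As a quick sanity check that the hexadecimal and decimal forms name the same code point, here is a small sketch using Python's standard XML parser. The <title> element and its content are made up as a stand-in for a feed field; this only shows how one conforming parser treats the two forms, not how every feed reader or browser behaves.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for an RSS <title> element; the content is invented.
hex_form = ET.fromstring("<title>&#xC9;cole</title>").text
dec_form = ET.fromstring("<title>&#201;cole</title>").text

# Both references resolve to U+00C9 (Latin capital letter E with acute).
print(hex_form == dec_form)    # prints: True
print(hex(ord(hex_form[0])))   # prints: 0xc9
```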
Thanks,
Bob Duncan
~!~!~!~!~!~!~!~!~!~!~!~!~
Robert E. Duncan
Systems Librarian
Editor of IT Communications
Lafayette College
Easton, PA 18042
duncanr at lafayette.edu
http://www.library.lafayette.edu/