[Web4lib] RSS and diacritics

Bob Duncan duncanr at lafayette.edu
Tue Nov 27 17:58:47 EST 2007


At 03:56 PM 11/27/2007, Jonathan Gorman wrote:
>Apologies.  In rereading, I realized I misinterpreted what you were 
>saying.  I thought you had two distinct problems: using HTML 
>character entities, and issues with diacritics.

Phew!  I thought I was going to have to attempt a reply to your first 
response. ;o)

>The answer as far as the entities go?  RSS can be a mess ;).  RSS 
>feeds are XML.  Sadly, it has become widespread practice to put 
>"escaped html" in fields of RSS feeds.  There's no way to ensure 
>that these escaping nightmares will be parsed correctly.
>
>HTML defines many named character entities, but RSS, being XML, 
>predefines only five of them (amp, lt, gt, quot, apos).  You can 
>attempt to make the others available by declaring them in a Doctype 
>declaration at the beginning of the feed.  This wikipedia page looks 
>like it has some examples of that: http://en.wikipedia.org/wiki/XML.
>
>The best solution?  Not really sure.  I'd lean towards not using 
>"escaped html" in my RSS feed.  Instead, use plain RSS with numeric 
>character references, which should display cleanly assuming that the 
>feed reader isn't junk.
>
>(And by character reference, I mean use &#x..; where .. is the 
>appropriate code point).

Thanks.  I think that will do it.  I was using name-based references 
(Egrave, etc.) and escaping the ampersand, which worked in most feed 
readers but not in everything capable of displaying a feed.  The 
numeric character references work fine in all apps tested so far.
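
A minimal sketch of that conversion in Python, assuming the feed text 
is available as a string.  It rewrites HTML named entities as 
hexadecimal numeric references and leaves the five entities that XML 
predefines alone.  The function name named_to_numeric is hypothetical; 
html.entities.name2codepoint comes from the standard library.

```python
import re
from html.entities import name2codepoint

# The five entities XML itself predefines; these are safe in any RSS feed.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_numeric(text):
    """Rewrite HTML named entities (e.g. &Egrave;) as hex numeric references."""
    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)

print(named_to_numeric("&Egrave;cole &amp; caf&eacute;"))
```

Unknown names are passed through unchanged rather than dropped, so 
nothing is silently lost if the input contains a typo.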

One other question:  which numeric reference is preferable?  For 
example, both &#xC9; and &#201; (hex C9 and decimal 201) produce a 
Latin capital E acute.  Are there good reasons to use one over the 
other?  (And is 
either more likely than the other to be correctly rendered by 
browsers in non-RSS situations?)
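
As a quick check, a small Python sketch using the standard library's 
xml.etree.ElementTree suggests that an XML parser treats the two forms 
identically, since both name the same code point (U+00C9).  The helper 
decode_title and the item/title wrapper are made up for illustration 
and are not taken from any real feed.

```python
import xml.etree.ElementTree as ET

def decode_title(reference):
    """Parse a tiny XML fragment and return the decoded title text."""
    xml = "<item><title>{}</title></item>".format(reference)
    return ET.fromstring(xml).find("title").text

hex_form = decode_title("&#xC9;")   # hexadecimal reference
dec_form = decode_title("&#201;")   # decimal reference

# Both references resolve to the same single character.
print(hex_form == dec_form, hex(ord(hex_form)))
```

This only shows what a conforming XML parser does; individual feed 
readers may still differ in practice.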

Thanks,

Bob Duncan


~!~!~!~!~!~!~!~!~!~!~!~!~
Robert E. Duncan
Systems Librarian
Editor of IT Communications
Lafayette College
Easton, PA  18042
duncanr at lafayette.edu
http://www.library.lafayette.edu/ 



