February 11, 2005
I'm Not the Only One

On delicious-discuss, the problems del.icio.us's RSS feed has with ampersands and '<' characters have been tracked down to the Perl XML::RSS module. One person looked into the history of that module's encoding issues, which go back to 2002.

"XML::RSS tends to automatically unescape HTML entities as it
reads them" (May 17, 2002)
-- http://www.webreference.com/scripts/sidebar/5.html

"Create a basic RSS feed with a dc:subject of "This & that", plain
ASCII.  Output the RSS feed using $rss->as_string.  The resulting
RSS feed will fail to validate as proper XML, as the '&' in the
subject has not been encoded to the ASCII string '&amp;' in the
resultant RSS output.

This is contrary to the documentation, which states that
as_string will encode special characters.

This bug is responsible for virtually all of the reported
problems I've seen with XML::RSS, as well as virtually all of the
resentment.  May I ask why it's been stalled?"  (Mon Dec 8
09:59:59 2003)
-- http://rt.cpan.org/NoAuth/Bug.html?id=2285

And last but not least, it looks like this may still not be
resolved in version 1.05:
"- auto encode text?"
-- http://search.cpan.org/src/KELLAN/XML-RSS-1.05/TODO

I left this out of my previous post on this topic, but since del.icio.us came up... I reported a bug early in del.icio.us's life about its not escaping HTML entities in URL descriptions.
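
To make the failure concrete, here is a minimal sketch in Python rather than Perl. It does not use XML::RSS at all; the helper names build_item and is_well_formed are made up for illustration, and the element names are just stand-ins for a real feed. It only shows the general point: text containing '&' has to be encoded before it is written into XML, or the result won't parse.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape


def build_item(subject, encode=True):
    """Return a tiny XML fragment with `subject` as text content.

    Hypothetical helper: with encode=False it mimics a generator
    that writes the text out without escaping it.
    """
    text = escape(subject) if encode else subject
    return "<item><subject>%s</subject></item>" % text


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False


raw = build_item("This & that", encode=False)
fixed = build_item("This & that", encode=True)

print(raw, is_well_formed(raw))      # bare '&' is rejected by the parser
print(fixed, is_well_formed(fixed))  # '&amp;' parses fine
```

The standard library's escape() handles '&', '<', and '>' by default, which covers both of the characters that were tripping up the feed.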

Posted by jjwiseman at February 11, 2005 10:07 AM
