August 04, 2006
Don't Forget Giant Scorpions and Jan-Michael Vincent
Dave Walker: “when the world ends, the only things left will be cockroaches, rats, Keith Richards, and mangled text that has been escaped one-too-many or one-too-few times.”
(It's getting close to two years now since I posted about CLiki's HTML-handling bug.)
Posted by jjwiseman at August 04, 2006 12:26 PM
Dave Walker posted that. Not Dave Winer.
That's the sort of quote which is really funny when you've been there, and incomprehensible if you haven't.
I have encountered these sorts of bugs more often than I can count. Typically, the fundamental problem is that people forget the "type" of a piece of content when encapsulating it in a wrapper of some sort (think HTML embedded within an XML wrapper). With all the XML-based content floating around, you have to ask whether a given piece of content is allowed to be markup or not. I had this trouble on my company's corporate web site the other day. We have press releases that are stored in a DB. The releases are allowed to use HTML tags to enhance the presentation. We have an interface that allows you to enter/edit a press release within a web form. Of course, there was a bug where an ampersand couldn't make a round trip through the edit cycle because something wasn't being escaped correctly. In the case of *displaying* a press release, the correct thing is to take the content and pour it into the HTML template unescaped, because you want the content's tags to merge with the template. In the case of the *editing* interface, you need to escape the release text because it isn't actual content at that point; it's just text data.
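To make the display/edit distinction concrete, here is a minimal sketch in Python. It is not the code from the site described above; the function names, the template strings, and the `browser_submits` helper (a rough stand-in for how a browser decodes a textarea's contents before posting them back) are all hypothetical and only illustrate the idea.

```python
from html import escape, unescape

def render_display(release_html: str) -> str:
    # Display: the stored release is markup, so it goes into the
    # template unescaped and its tags merge with the page.
    return f'<div class="release">{release_html}</div>'

def render_edit_form(release_html: str) -> str:
    # Editing: the stored release is plain text data here, so it is
    # escaped exactly once before going into the textarea.
    return f'<textarea name="body">{escape(release_html)}</textarea>'

def render_edit_form_buggy(release_html: str) -> str:
    # The bug: the release is poured into the form as if it were markup.
    return f'<textarea name="body">{release_html}</textarea>'

def browser_submits(form_html: str) -> str:
    # Hypothetical stand-in for the browser: it decodes character
    # references in the textarea body and submits the result.
    start = form_html.index(">") + 1
    end = form_html.rindex("</textarea>")
    return unescape(form_html[start:end])

stored = "<b>AT&amp;T</b> announces a new product"

# Escaped once for the form, the release survives the edit cycle.
round_trip_ok = browser_submits(render_edit_form(stored))

# Unescaped, the "&amp;" is decoded to a bare "&" on the way back.
round_trip_bad = browser_submits(render_edit_form_buggy(stored))
```

With the escaped form, `round_trip_ok` equals `stored`. With the buggy form, `round_trip_bad` contains a bare `&` where the stored release had `&amp;`, which is the one-too-few-times failure.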