September 21, 2002
You know what is particularly dumb? That many (and some days it feels like possibly most) url strings are not valid html. What the hell was the W3C thinking?
At least 75% of the html errors that slip by me when I write a post but get caught by the validator are due to pasted urls containing '&', and it's the W3C's fault.
Posted by jjwiseman at September 21, 2002 08:52 AM
Being forced to substitute &amp; for & is the *most* retarded thing about XML (technically this isn't valid HTML either).
The rationale for this is that & is for entities. Talk about lazy parsing. First of all, entities are stupid in and of themselves. There are better solutions. Second, there is a required syntax for entities. They have to be ampersand + [some number of characters] + semicolon. So a plain ampersand should not meet this rule. An ampersand next to some characters not followed by a semicolon should not meet this rule. An ampersand which is next to some characters and is followed by a semicolon but which refers to an undefined entity should not meet this rule.
Having to escape > and < is similarly stupid, though somewhat harder to work around.
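For illustration, here is a rough Python sketch of the lenient rule described above, written as an escaper rather than a parser: it leaves anything that looks like a real entity reference alone and rewrites every other ampersand as &amp;. This is not what the HTML or XML specs require. The function name escape_stray_ampersands is made up for this example, and treating Python's html.entities.name2codepoint table as the list of "defined" entities is an assumption.

```python
import re
from html.entities import name2codepoint

# "&" optionally followed by something shaped like an entity reference:
# a decimal or hex character reference, or a name, each ending in ";".
_AMP = re.compile(r"&(#[0-9]+;|#[xX][0-9a-fA-F]+;|[A-Za-z][A-Za-z0-9]*;)?")


def escape_stray_ampersands(text):
    """Escape only the ampersands that do not start a known entity."""

    def fix(match):
        body = match.group(1)
        if body is None:
            # Plain "&", or "&" followed by characters with no semicolon.
            return "&amp;"
        if body.startswith("#"):
            # Numeric character reference: keep as written.
            return match.group(0)
        if body[:-1] in name2codepoint:
            # Named entity that is actually defined: keep as written.
            return match.group(0)
        # Entity-shaped, but the name is undefined: escape the ampersand.
        return "&amp;" + body

    return _AMP.sub(fix, text)


print(escape_stray_ampersands("http://example.com/?a=1&b=2"))
print(escape_stray_ampersands("&lt;tag&gt; &bogus; AT&T"))
```

The three branches in fix correspond to the three cases in the comment: a bare ampersand, an ampersand with no closing semicolon, and an ampersand that names an undefined entity.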
A troll but still a troll.
The problem in this case is not with the standard but with the implementation. Why is your implementation (the software you are using) not able to escape these characters when you copy and paste?
When you write a Perl program, you have reserved characters too, like # for example.
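As a small sketch of what "the software escapes it for you" could look like, here is how a posting tool written in Python might escape a pasted URL before putting it into a link. The helper name make_link and the example URL are hypothetical; html.escape is the standard-library function.

```python
from html import escape


def make_link(url, text):
    """Build an <a> element, escaping the URL and link text for HTML."""
    # quote=True also escapes double and single quotes, which matters
    # because the URL ends up inside a quoted attribute value.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))


print(make_link("http://example.com/search?q=lisp&lang=en", "search"))
# prints: <a href="http://example.com/search?q=lisp&amp;lang=en">search</a>
```

The browser turns &amp; back into & when it reads the attribute, so the link still points at the original URL.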