[WEB4LIB] Re: Hard hyphen and HTML validation

Wed Oct 30 10:58:01 EST 2002

At 09:54 AM 10/30/2002, Dan Lester wrote:

>I'm sure that this will be heresy to some purists, but why not just
>ignore those validation errors?  It isn't a perfect world, and those
>are things beyond your control, so why worry?

This approach relies on the error-recovery routines built into browsers; 
you're essentially assuming that all browsers will handle undefined data 
the same way.  Likewise, you can include undefined characters in the 
127-159 range and hope that all browsers will act like you're using 
Windows.  You'll get away with it so much of the time that you could forget 
the few users who are left out in the cold because you didn't tighten up 
your code just that last little bit.
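
To make the 127-159 point concrete, here is a minimal Python sketch. The 
three sample bytes and the helper name to_numeric_refs are my own 
assumptions for illustration, not anything from a particular page. It shows 
the same bytes read the Windows way and the ISO 8859-1 way, and then written 
as numeric character references so no guessing is needed.

```python
# Bytes that Windows software commonly emits for curly quotes and a long dash.
raw = bytes([0x93, 0x94, 0x97])

# The guess you are hoping every browser makes: Windows code page 1252.
as_windows = raw.decode("cp1252")

# What ISO 8859-1 says those bytes are: C1 control codes, not punctuation.
as_latin1 = raw.decode("latin-1")

def to_numeric_refs(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return "".join(c if ord(c) < 128 else "&#%d;" % ord(c) for c in text)

# Unambiguous markup that does not depend on the browser's guess.
safe = to_numeric_refs(as_windows)
print(safe)
```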

It also assumes you'll never encounter a query string where a variable name 
matches a character entity name.  It isn't hard to imagine links with query 
strings that include "&deg", "&copy", "&eta", or "&image"--all valid entity 
names in HTML 4.  How does a browser decide whether to treat those as 
character entities or spell them out?
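
As a rough illustration, the sketch below runs a made-up URL (search.cgi and 
its parameters are hypothetical) through the entity decoder in Python's 
standard library. That decoder is only a stand-in; I am not claiming any 
particular browser behaves the same way, which is rather the point.

```python
import html

# Hypothetical link target with parameter names that collide with entities.
url = "search.cgi?title=x&copy=2&deg=90"

# Raw ampersands: this decoder reads "&copy" and "&deg" as character entities.
mangled = html.unescape(url)

# Ampersands written as "&amp;" in the markup: the URL survives decoding.
escaped = html.escape(url, quote=True)
restored = html.unescape(escaped)

print(mangled)
print(escaped)
print(restored == url)
```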

Final thought: for hand-written URLs (garbage output from proprietary 
software is another issue), why is entifying ampersands an issue when we 
never think twice about hex-encoding other special characters in URLs, or 
converting spaces to plus signs in query strings?
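
For comparison, here are the three encodings side by side in a short Python 
sketch. The path and parameter names are invented for the example; the 
functions are from the standard urllib.parse and html modules.

```python
import html
from urllib.parse import quote, urlencode

# Hex-encode special characters in the path (the space becomes %20).
path = quote("/reports/annual summary.html")

# Build the query string; spaces in values become plus signs.
query = urlencode({"q": "hard hyphen", "copy": "2"})

# Last step, the one that gets skipped: escape the ampersand for the HTML attribute.
href = html.escape(path + "?" + query, quote=True)

print('<a href="%s">report</a>' % href)
```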

Thomas Dowling
OhioLINK - Ohio Library and Information Network
tdowling at ohiolink.edu