[WEB4LIB] DOCTYPE declaration for invalid page

Thomas Dowling tdowling at ohiolink.edu
Wed Nov 1 15:06:38 EST 2000


> Hi all,
>
> I'm trying to get my head around the HTML standards and have been running
> my pages through the w3c validator http://validator.w3.org It's been very
> helpful for my buggy code.
>
> What is the advice from any standards gurus out there for pages that cannot
> be validated? Should they include a DOCTYPE declaration or not?

Without a declared doctype, the Web Design Group's validator
(<URL:http://www.htmlhelp.com/tools/validator/>) and the W3C validator
(<URL:http://validator.w3.org/>) both assume HTML 4.01 Transitional, but
both complain to you about it.

In addition to this, both Mozilla and (I believe) IE5/Mac respond to some
doctype declarations by switching into their stricter rendering modes.  As
I recall, Mozilla responds this way to any doctype of HTML 4.0 Strict,
HTML 4.01 Strict, or XHTML 1.0 Strict.

>
> In particular a number of my pages cannot be validated due to the HTML
> standard for *not* using an ampersand in a URI, and instead using "&amp;".
> http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2

This is currently a topic of much discussion on the
comp.infosystems.www.authoring.html newsgroup.  The validators are
entirely correct, though I admit I don't worry a lot about this error.  It
can be a problem, however, if the ampersand in the URL is followed by a
recognized entity name.  Some browsers will (and actually should) convert
those to the corresponding character, so source code like this:

  <a href="script.pl?chapter=1&sect=2&copy=3">

...gets rendered by the browser, and sent back to the server, as though
the URL were:

  script.pl?chapter=1§=2©=3  (apologies if this doesn't display correctly)
  script.pl?chapter=1[section symbol]=2[copyright symbol]=3
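
As a rough illustration of the above, here is a minimal Python sketch. It
uses the standard library's html.unescape() as a stand-in for a browser's
entity handling; that is an assumption made for demonstration, and real
browsers may differ in how they treat entity names inside attribute values.
The variable names are hypothetical.

```python
import html

# The href value exactly as written in the example markup (bare ampersands).
raw_href = "script.pl?chapter=1&sect=2&copy=3"

# html.unescape() decodes "&sect" and "&copy" even without a trailing
# semicolon, which approximates the browser behavior described above.
decoded = html.unescape(raw_href)

# Writing each ampersand as "&amp;" in the HTML source avoids the problem:
# decoding the escaped form gives back the URL that was intended.
safe_href = html.escape(raw_href)
round_trip = html.unescape(safe_href)

print(decoded)
print(safe_href)
print(round_trip == raw_href)
```

The escaped form is what belongs inside the href attribute; the URL that
reaches the server is still the one with bare ampersands.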

>
> A curious library problem with this part of the standard is in the 856
> field of a catalog record. If one still has telnet access to the catalog it
> seems that the URI must not follow the HTML standard or else an incorrect
> address will be displayed in the telnet version.
>

Note that the field separator for the QUERY_STRING variable *is* the
ampersand all by itself.  It is only when the URL is included in HTML that
the ampersand must/should be converted to "&amp;".  A sufficiently
intelligent catalog would convert as needed when the 856 field is rendered
in HTML.
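
A sketch of what "convert as needed" might look like, in Python. The
function name render_856_url, the as_html flag, and the example.org address
are all hypothetical; this is not how any particular catalog system works,
only an illustration of storing the URL once with bare ampersands and
escaping it at HTML rendering time.

```python
import html
from urllib.parse import parse_qs


def render_856_url(url: str, as_html: bool) -> str:
    """Return the stored URL as-is for plain text, or escaped for HTML."""
    return html.escape(url, quote=True) if as_html else url


# Hypothetical 856 field value, stored with bare ampersands.
stored = "http://catalog.example.org/script.pl?chapter=1&sect=2"

# Telnet (plain-text) display shows the URL unchanged.
telnet_view = render_856_url(stored, as_html=False)

# HTML display escapes the ampersand in both the attribute and the link text.
web_view = '<a href="{0}">{0}</a>'.format(render_856_url(stored, as_html=True))

# On the server side, the query string is still split on the bare ampersand.
params = parse_qs(stored.split("?", 1)[1])

print(telnet_view)
print(web_view)
print(params)
```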


Thomas Dowling
OhioLINK - Ohio Library and Information Network
tdowling at ohiolink.edu


