Choosing a doctype
Making sure that your webpages validate is important because valid pages render more consistently across different browsers, which means you can spend less time debugging cross-browser incompatibilities and more time with your loved ones/down the pub/plotting world domination.
But what are we going to validate against? The most popular options at the moment are:
- HTML 4.01 Transitional
- HTML 4.01 Strict
- XHTML 1.0 Transitional
- XHTML 1.0 Strict
HTML or XHTML?
The only difference between HTML 4.01 and XHTML 1.0 is syntax. XHTML documents are XML documents and HTML documents aren’t. This means that XHTML has some extra rules to make it compatible with XML (the most obvious being lowercase tags and terminating empty elements).
There are no differences between the elements and attributes supported in XHTML 1.0 and HTML 4.01: each version (Strict, Transitional and Frameset) supports the same elements in both standards. XHTML 1.1 goes further and drops some elements and attributes that were allowable in XHTML 1.0, mostly presentational ones, because presentation should be controlled with CSS.
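To make the syntax difference concrete, here’s a rough sketch of the same fragment written both ways. The file name logo.png is a placeholder made up for illustration.

```html
<!-- HTML 4.01: tag names are case-insensitive, empty elements are not
     terminated, and some end tags (such as </P>) may be left out -->
<P>First line<BR>
<IMG SRC="logo.png" ALT="Logo">

<!-- XHTML 1.0: lowercase names, every element explicitly closed,
     and empty elements terminated with " />" -->
<p>First line<br />
<img src="logo.png" alt="Logo" /></p>
```

Both fragments use exactly the same elements and attributes; only the way they are written down changes.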
The main advantages of using regular HTML are:
- It is supported by all browsers. Older browsers and alternative browsers like mobile browsers may not understand XHTML. Even IE6 and IE7 support is a bit dubious.
- The syntax is simpler. Empty tags don’t need to be terminated. Elements like tbody can be automatically inferred.
- Even if there are mistakes, the page will still attempt to render and in most cases will still be usable by the person viewing the page. In some browsers when XHTML is served as “application/xhtml+xml” errors in markup will cause an error message to be displayed rather than the page’s content.
The main advantages of using XHTML are:
- It’s XML, so you can use XML technologies like XSLT and XPath on the document.
- Markup mistakes are easy to find because some browsers will display an error message if they’re told to treat the page as an XML document. In theory this makes maintenance easier, but it may make XHTML a bad choice for documents with dynamic or user-contributed content.
- In theory it’s extensible so it’s possible to plug other XML standards (imagine scalable vector graphics) into the same document. This doesn’t have widespread browser support yet.
There’s no good answer for ASP.NET developers
For ASP.NET developers, choosing between XHTML and HTML means being caught between a rock and a hard place.
ASP.NET 2.0 only really supports XHTML out of the box. Choosing HTML instead means fighting the framework: there is no built-in setting that makes it render HTML syntax (that I could find; please leave a comment if I’m wrong). You can force ASP.NET 2.0 to render HTML 4.01-compatible syntax with custom code, but that can lead to unpredictable problems because the framework expects to be emitting XHTML syntax.
On the other hand, the browser support for XHTML isn’t really there yet. IE6 and IE7 don’t support XHTML in the same way that other newer browsers like Firefox and Opera do. IE can display XHTML webpages if they are sent from the web server as HTML documents (that is, with a MIME type of text/html), but it is really only displaying them as if they were HTML pages. This is technically allowed for XHTML 1.0 documents only, and it can be problematic. Other browsers also treat any XHTML page sent as text/html as regular HTML; it doesn’t trigger XML mode.
IE can’t display pages that are sent from the server as XML at all.
[Screenshot: what IE users see if they try to view a page sent as XML]
To find out more about the differences between HTML and XHTML, there’s a great explanation on Sitepoint.
As mentioned, ASP.NET 2.0 generates XHTML by default and has no mode for generating HTML 4.01-specific syntax (let me know in the comments if I’m wrong about this; I searched but couldn’t find a setting that would work).
If you decide that HTML 4.01 is a better fit for your site, you can use the ASP.NET 2.0 adapter model to create a custom HtmlTextWriter object that generates HTML 4.01-compliant markup. This gives you very fine-grained control over the markup created by the elements on your page, but it uses fairly advanced ASP.NET functionality and could lead to unpredictable problems with code that assumes XHTML-compliant syntax. This is not the path to an easy life, but it’s possible if you really want or need to use HTML 4.01.
The pragmatist in me says that fighting with the framework is ultimately a pretty futile thing to do. I think we’re probably stuck with XHTML syntax for this version of ASP.NET. Hopefully Microsoft will make it a choice in future versions, especially now that HTML development is continuing with HTML 5.
Strict vs transitional
The main difference between the strict and transitional versions is the set of tags that they support. Transitional versions contain older tags that are retired in the strict version. For example, it is perfectly legitimate to use an iframe tag in an HTML 4.01 Transitional document but not in an HTML 4.01 Strict one.
Elements left out of the strict version tend to be tags that historically haven’t worked consistently across different browsers (like the iframe) or tags that have been replaced by a newer technology (like the font tag, which has been replaced by CSS styling).
Whether you’re using HTML or XHTML, you should use the strict doctype where possible. The legacy tags in HTML 4.01 Transitional will one day be retired and have already been replaced by better ways of doing things. Only use the transitional version if you need some of the elements that it supports.
Fortunately setting strict or transitional is pretty simple.
HTML generated by ASP.NET (such as the markup produced by server controls) targets the transitional doctype by default. You can configure it to target strict instead by setting the mode attribute of the xhtmlConformance element to Strict in the system.web section of your site’s web.config.
ASP.NET will not use the information in the webpage (the doctype) to decide whether to generate strict or transitional HTML. It’s up to the developer to make sure that the xhtml conformance property has the right setting for the webpages in the site.
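As a sketch, the xhtml conformance setting in web.config would look something like this. It assumes a site that otherwise uses the default configuration; a real web.config will contain other settings inside system.web as well.

```xml
<configuration>
  <system.web>
    <!-- Tell ASP.NET to render markup that targets the strict doctype -->
    <xhtmlConformance mode="Strict" />
  </system.web>
</configuration>
```

The mode attribute also accepts Transitional (the default) and Legacy.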
Once you’ve chosen what type of HTML you want to use, you need to let the browser know by adding a doctype to your page. This is a statement you add to the first line of the webpage. It has the type of HTML/XHTML to use and a link to the DTD file that defines it.
Here’s the HTML 4.01 Strict doctype:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
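To show where the doctype sits in context, here is a minimal sketch of a complete page using HTML 4.01 Strict. The title and body text are placeholders made up for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example page</title>
</head>
<body>
  <h1>Hello</h1>
  <p>This page declares the HTML 4.01 Strict doctype on its first line.</p>
</body>
</html>
```

Note that nothing comes before the doctype: it is the very first line of the document.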
Browsers are very particular about the doctypes they will accept. The doctype must be exactly right, or the browser will fall back to a partial standards mode or act as though the page has no doctype at all. There’s a complete list of valid doctypes on the W3C website.
Pages without a doctype are rendered in a backwards-compatible mode called quirks mode, which behaves differently from browser to browser.
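For reference, these are the doctype declarations for the four options listed at the start of this post. Treat this as a convenience listing rather than the authoritative source; it is safest to copy the declaration from the W3C list, because a single wrong character changes how the browser treats the page.

```html
<!-- HTML 4.01 Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- XHTML 1.0 Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- XHTML 1.0 Strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

A real page contains only one of these declarations, on its first line.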