application/xhtml+xml?
July 14th, 2008
There’s a mini-war over what MIME type to serve XHTML as: either application/xhtml+xml or text/html.
First, a definition: MIME stands for Multipurpose Internet Mail Extensions. Even though HTTP is not email, “HTTP requires that data be transmitted in the context of e-mail-like messages, even though the data may not actually be e-mail” (see the W3C protocol). On Apache servers, MIME types are commonly set in the server configuration or in a .htaccess file. In a (fragile) nutshell, serving a file as either application/xhtml+xml or text/html tells the browser what to do with it: treat the file as XML (a standard with strict coding guidelines) or HTML (a standard with looser guidelines).
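As a rough illustration of what choosing a MIME type can look like on the server side, here is a minimal Python sketch that picks a Content-Type from the browser's Accept header. The function name choose_content_type is hypothetical, the matching is deliberately naive (it ignores q-values and wildcards), and it is not how any particular server or framework does it.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Return the XHTML MIME type only if the client explicitly lists it."""
    # Split "text/html,application/xhtml+xml;q=0.9" into bare media types,
    # dropping parameters such as ";q=0.9".
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    # Fall back to text/html for anything that does not name XHTML outright.
    return XHTML if XHTML in offered else HTML
```

A browser that only advertises text/html (or sends no Accept header at all) gets text/html back, which is the forgiving default.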
What this means is that an XHTML document that is not well-formed, i.e. not strictly coded according to XML guidelines, gets chewed up and spit out by the browser as, well, broken when it is served as application/xhtml+xml. As James Edwards puts it: “the strictness of a validating XML parser may be too extreme for real-world use. For an invalid XML document, browsers did not even make an attempt to parse the document as best they could, as they would with HTML—instead, they just displayed a validation error and stopped.”
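The difference between the two parsing modes can be sketched outside the browser with Python's standard library. This is only an analogy: xml.etree.ElementTree stands in for a strict XML parser and html.parser.HTMLParser for a tolerant HTML one, and neither is what a browser actually runs. The BROKEN sample markup and the helper names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical markup: the <p> element is never closed, so it is not well-formed XML.
BROKEN = "<html><body><p>Hello<br/></body></html>"


class TagCollector(HTMLParser):
    """Lenient parsing: record every start tag and never raise on bad nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_xml(markup: str) -> str:
    """Strict parsing: the first well-formedness error aborts the whole parse."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError as err:
        return f"error: {err}"


def parse_as_html(markup: str) -> list:
    """Tolerant parsing: return the start tags seen, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

Calling parse_as_xml(BROKEN) returns an error string, while parse_as_html(BROKEN) still walks the whole document and reports the tags it found.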
A significant group of developers and organizations argues that XHTML should only be served as application/xhtml+xml. The list: Ian Hickson, Anne van Kesteren, Mark Pilgrim and the WebKit team. The W3C makes it clear as well: XHTML documents should be served as application/xhtml+xml.
I’m working on a large web application that, since its start in 2007, has used the XHTML 1.0 Strict DTD and been served as text/html. Given the above links and recommendations, this is wrong. But, as Hamlet said, “there is nothing either good or bad but thinking makes it so.”
Other developers have taken the stance that serving XHTML as text/html is fine. Brad Fults, for example, finds Hickson’s argument “wholly unconvincing” and details why. For example, “the biggest advantages to XHTML are its readability, uniformity, well-formedness as it pertains to authoring, and the consistency of the rendered DOM (which is also a result of any well-formed HTML document).” XHTML validation, then, is the author’s responsibility, not a burden on the browser, which (when the page is served as text/html) will render the page as HTML. Fults again:
Is there a harmful issue here? No. Whether or not documents validate is a completely separate issue from the MIME type as which they are sent. It is up to authors alone to worry about validating their documents with a validator.
James Edwards, for his part, takes the stance that it matters, sort of. Serving as text/html ensures that users get a readable webpage even when the code is malformed or not supported by the browser. To lean on his phrasing, an XHTML page served this way can fail cleanly.
I’ve written all this up to determine what a project I’m working on should do in terms of DTD and MIME type. Right now, I’m working on Puma, a network device visualization web application that acts as a detailed catalog of nearly 15,000 devices (routers, switches, APs, etc.). Serving the application as application/xhtml+xml, in no uncertain terms, breaks it. Somewhere in the thousands of lines of code, there’s malformed XHTML. Maybe. As a device page starts to build, a small portion of it succeeds, but at a certain point early in the process, it borks. I’m curious if it’s a JavaScript problem. Most of the page, maybe 95% or more, is built by jQuery and JavaScript that rely on a JSON document for content. It seems as if when the JSON document comes into play, the borking begins. But I could be wrong.
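One way to test the JSON hunch offline is to scan the JSON document for string values that contain markup and check whether each one is well-formed XML. The Python sketch below does that. It assumes the JSON carries markup fragments at all, which I haven't confirmed. The function name and the path notation are made up for the example, and Python's XML parser is only a stand-in for the browser's.

```python
import json
import xml.etree.ElementTree as ET


def find_malformed_fragments(json_text: str) -> list:
    """Return (path, error message) pairs for markup strings that are not well-formed XML."""
    problems = []

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}")
        elif isinstance(value, list):
            for index, child in enumerate(value):
                walk(child, f"{path}[{index}]")
        elif isinstance(value, str) and "<" in value:
            try:
                # Wrap in a dummy root so fragments with several siblings still parse.
                ET.fromstring(f"<root>{value}</root>")
            except ET.ParseError as err:
                problems.append((path, str(err)))

    walk(json.loads(json_text), "$")
    return problems
```

Each entry in the result names where in the JSON the suspect string lives, which narrows the search from thousands of lines to a handful of values. Strings without a "<" are skipped, so this only catches tag-level problems, not things like stray ampersands.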
To sum up, I’m going to continue serving Puma as text/html until the borking can be pinpointed. This way Puma can fail cleanly and still take advantage of the well-formed document structure XHTML demands.