This is application/xml

Volume 7, Issue 8; 17 Jan 2004; last modified 08 Oct 2010

This essay is served as application/xml. If you have difficulty, you may prefer the text/html version: http://norman.walsh.name/2004/01/18/text-html

Nothing will ever be attempted, if all possible objections must be first overcome.

Dr. Johnson

This essay is served with the MIME type “application/xml”. If you’re reading this, then your browser did the right thing (or you worked fairly hard to get here, in which case, I thank you).

[Update: When I initially published this essay, several people pointed out that XHTML Media Types (referred to informatively from XHTML 1.0) recommends the MIME type “application/xhtml+xml” for XHTML documents. Without giving it much thought, I switched the MIME type I was using for this essay. Of course, this changed some of the behavior I was trying to measure, so that was a bad idea. Now I’ve created a separate application/xhtml+xml version of this essay. Sorry for the confusion.]

In the course of recent public debate on the appropriateness of recovering from well-formedness errors in XHTML, Mark Pilgrim caught me serving content that, at least by some metrics, purported to be XHTML and was not well-formed. (In fact, the server said it was “text/html” content so it was, by that metric, not in error, though it was clearly not “correct” either.)

I’m sticking to my guns and remaining a “draconian.” So David Carlisle challenged me to start serving my essays as “application/xml”. That would pretty much guarantee that the browser would report any lapses of concentration that caused me to accidentally generate markup that wasn’t well-formed.

In general, I expect that there are still far too many people using browsers that don’t know what to do with application/xml content to make it practical to serve it exclusively. But I’m not sure that’s true, and in particular it may not be true of you, my personal audience. (Have I thanked you recently for taking the time to read my ramblings? Thanks!)

So this essay is an experiment. If you can’t read this version of this essay but you can read the text/html version, let me know: tell me what browser you’re using and what platform, please. You can report successes too, of course.

I’ll keep a little summary of the statistics on the text/html page.

Lessons Learned

I’ve changed a few things in response to feedback. I think these will be improvements: [Last update: 19 Jan 2004.]

  • The recommended MIME type for XHTML is “application/xhtml+xml”.

  • But some browsers (notably IE) handle “application/xml” better than “application/xhtml+xml”.

  • A little content-negotiation is probably in order. Beyond that, the MathML folks have clearly blazed a trail for us to follow. (Thanks, guys!)

  • Don’t put a document type declaration on the XHTML pages, because IE goes off and reads it (Good for you, IE! An actual feature not a bug!)

  • Remove &nbsp; entities. This is actually imperative now that I’ve removed the document type declaration (because an entity reference with no possibility of a declaration makes the document not well-formed).
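
The last point, that an entity reference with no possible declaration breaks well-formedness, can be checked with a small sketch. It assumes Python's standard-library ElementTree parser as a stand-in for a browser's XML parser; the helper name is_well_formed and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(markup: str) -> bool:
    """Return True if a non-validating XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# No document type declaration, so the named entity cannot be declared.
with_entity = f'<p xmlns="{XHTML_NS}">a&nbsp;b</p>'

# A numeric character reference for the same character needs no declaration.
with_char_ref = f'<p xmlns="{XHTML_NS}">a&#160;b</p>'

print(is_well_formed(with_entity))    # False: undefined entity
print(is_well_formed(with_char_ref))  # True
```

Replacing the named entity with the numeric reference &#160; (or the literal character) keeps the non-breaking space without needing any declaration.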

Comments

Well, Mozilla 1.6 renders both pages perfectly.

I have for quite some time served all my pages as application/xhtml+xml for browsers that claim to support this (currently only Mozilla, AFAICS), and this works very well.

It's easy to do this, using the content negotiation feature in Apache, as documented on this page: "http://www.christian-web-masters.com/articles/web_XHTML.html" (though I believe I found the technique first on this German page: "http://schneegans.de/tips/apache-xhtml/").

—Posted by Karl Ove Hufthammer on 19 Jan 2004 @ 07:32 UTC #
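
The negotiation described in this comment can be sketched in a few lines. This is not the Apache configuration from the linked pages; it is a hypothetical Python helper, choose_media_type, that only illustrates the idea of serving an XML type to browsers that ask for it by name. The preference order and the decision to ignore wildcards such as */* are assumptions of the sketch.

```python
def choose_media_type(accept_header: str) -> str:
    """Pick a Content-Type for an XHTML page from an HTTP Accept header.

    Deliberate simplification: wildcard ranges such as */* are ignored,
    so a browser gets an XML type only if it names that type explicitly.
    """
    prefs = {}
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q value as "not acceptable"
        prefs[media_type] = q

    # Assumed preference order: the recommended XHTML type first,
    # then generic XML, then the text/html fallback.
    for candidate in ("application/xhtml+xml", "application/xml"):
        if prefs.get(candidate, 0.0) > 0.0:
            return candidate
    return "text/html"


print(choose_media_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
print(choose_media_type("*/*"))
```

The first call returns application/xhtml+xml and the second falls back to text/html, since a bare */* says nothing about whether the browser can actually render XML.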

Minor entity problem in Safari 1.1.1: The   entities in the calendar on the right appear as-is (ie not resolved to a non-breaking space).

(BTW I support your draconian stand)

—Posted by Alastair Rankine on 19 Jan 2004 @ 09:06 UTC #

Interesting. In that previous comment I typed ampersand-n-b-s-p-semicolon, and when I clicked "preview comment", it showed up like that in the preview. But it also resolved it into an actual non-breaking space in the comment, which I didn't notice. And now it's still there.

(Hmm, there may be a way to break well-formedness of your document here...)

—Posted by Alastair Rankine on 19 Jan 2004 @ 09:18 UTC #

Hey Norm,

This one worked fine under Windows Mozilla 1.5 and Firebird 0.7. Opera 6.01 displayed all the text and links for the application/xml one but ignored the CSS stylesheet; Opera did use the stylesheet with the text/html one. Opera is up to 7.23 now, so it would be interesting to see if that version has the same problem with your app/xml version.

—Posted by Bob DuCharme on 19 Jan 2004 @ 01:12 UTC #

This page http://www.w3.org/TR/xhtml-media-types/ (I'd be happy to use an HTML anchor here :) says:

"'application/xhtml+xml' SHOULD be used for XHTML Family documents, and the use of 'text/html' SHOULD be limited to HTML-compatible XHTML 1.0 documents."

Also see http://www.w3.org/TR/xhtml-media-types/#summary and http://www.w3.org/TR/xhtml1/#guidelines

—Posted by Tobi on 19 Jan 2004 @ 04:40 UTC #

To report, the article works transparently in Konqueror 3.1.4. With the exception of subtle font issues (grrr, someday I *will* figure out fonts under Linux), the page looks almost identical under Opera 7.23. Not only that, but Opera provides that really nice navigation system if you provide the <link rel="next" ... /> and the like.

The blurred line between interpreting and presenting HTML and correctly processing XHTML is rather interesting:

"Don’t put a document type declaration on the XHTML pages, because IE goes off and reads it (Good for you, IE! An actual feature not a bug!)" - Norman Walsh, article body

Doesn't this beg for catalog support in those browsers that are actually going to resolve identifiers? I would think that as we ask, more and more, for our web browsers to be XML readers, they should be able to take advantage of resources like XML catalogs. As a user, I am quite curious to know what my various browsers do behind the scenes, but the documentation on this front can be sparse.

—Posted by John Clark on 19 Jan 2004 @ 05:12 UTC #

Mozilla 1.6 (my regular browser) displays this fine. IE6 fails, putting up a dialog that says "Internet Explorer cannot download application application-xml from norman.walsh.name. Internet Explorer was not able to open this internet site. The requested site is either unavailable or could not be found. Please try again later."

Jim

—Posted by Jim Ancona on 19 Jan 2004 @ 10:10 UTC #

OK, IE6 SP1 is showing this, but it thinks the encoding is "Western European" (rather than UTF-8), so the curly apostrophe shows up as "a-with-caret euro-sign trademark-sign". And then when I clicked the talkback button, the page for submitting the talkback *is* recognized as UTF-8.

—Posted by Seth Gordon on 20 Jan 2004 @ 01:20 UTC #
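
The three characters reported in this comment are what the UTF-8 bytes of a curly apostrophe look like when decoded with a single-byte Western encoding. A minimal sketch, assuming that IE's "Western European" setting corresponds to Windows-1252 (Python's cp1252 codec):

```python
import unicodedata

apostrophe = "\u2019"  # RIGHT SINGLE QUOTATION MARK, the curly apostrophe

# The page sends these three bytes for the apostrophe.
utf8_bytes = apostrophe.encode("utf-8")

# Assumption: the browser decodes them as Windows-1252 instead of UTF-8.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex())  # e28099
for ch in misread:
    print(unicodedata.name(ch))
# LATIN SMALL LETTER A WITH CIRCUMFLEX
# EURO SIGN
# TRADE MARK SIGN
```

Each of the three bytes is shown as its own character, which matches the "a-with-caret euro-sign trademark-sign" sequence described above.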

Hi Norm,

this works just great in Mozilla Firebird 0.7 on Solaris 9. FWIW, I strongly support your "draconian" approach as well.

Cheers,

--Scott

—Posted by Scott Hudson on 20 Jan 2004 @ 03:41 UTC #