<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" version="5.0" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
<info>
    
    
    
    
    
    
    
    
    
<title>“Default” XML Processing</title><biblioid class="uri">http://norman.walsh.name/2010/04/09/xmldefault</biblioid>
<volumenum>13</volumenum>
<issuenum>13</issuenum>
<pubdate>2010-04-09T11:27:38-04:00</pubdate>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2010</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>A look at the intersection of the XML model PI, the XML
stylesheet PI, and XProc.
</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#XML"/>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#XProc"/>
</info>

<para xml:id="p1">What is the “default XML processing model”? That question
has been open for a long time, since the very beginning of XML really.
There are a lot of different opinions, some of them captured in the
<link xlink:href="http://www.w3.org/">W3C</link>’s
<link xlink:href="http://www.w3.org/2001/tag/">Technical Architecture
Group</link> discussion of the issue
“<link xlink:href="http://www.w3.org/2001/tag/group/track/issues/34">xmlFunctions-34</link>”. (Disclaimer: I contributed to that issue while I was a member of
the TAG.)</para>

<para xml:id="p2">It's in the charter of the
<citetitle xlink:href="http://www.w3.org/XML/Processing/">XML Processing Model
Working Group</citetitle>, which I chair, to provide an answer to this
question. I don't think it has <emphasis>an</emphasis> answer. I don't
subscribe to the notion that XML documents have one and only one
intrinsic meaning. I think the best we can do is describe one (or a
few) possible models and give them labels. That will allow the authors
of other specifications, and applications, to say “we do
<replaceable>TYPEX</replaceable> processing on XML documents”, where
“<replaceable>TYPEX</replaceable>” is one of the
labels. That'll give us a shorthand for talking about some common processing
models.</para>

<para xml:id="p3">That may not seem very satisfying. Maybe it isn't. The point of
this essay isn't to make or defend that position. When the WG produces
its first public working draft of a document that attempts to answer
the “default XML processing model” question, I'll let you know. The right
answer isn't what I think it is; it's what community consensus drives us to.</para>

<para xml:id="p4">No, the point of this essay is something else, something a little more
complicated than I think we would reasonably expect to put in that document
(though I could be wrong).</para>

<para xml:id="p5">The XML community has had an
<citetitle xlink:href="http://www.w3.org/TR/xml-stylesheet/">Associating
Style Sheets with XML documents</citetitle> specification for a long time.
It will soon have an
<citetitle xlink:href="http://www.w3.org/XML/2010/01/xml-model/">Associating
Schemas with XML documents</citetitle> specification. (That link is to an
early editor's draft; there's nothing official yet, but it's coming soon.)
</para>
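
<para>As a concrete illustration, a document that carries both
associations might begin like the sketch below. The file names
<filename>book.rng</filename> and <filename>book.xsl</filename> are
hypothetical, and because the schema association draft is not final, the
<code>xml-model</code> pseudo-attributes shown here
(<code>href</code> and <code>schematypens</code>) are an assumption and
may change.</para>

<programlisting language="xml"><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed form of the schema association; book.rng is a hypothetical RELAX NG schema -->
<?xml-model href="book.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<!-- Stylesheet association; book.xsl is a hypothetical XSLT stylesheet -->
<?xml-stylesheet href="book.xsl" type="text/xsl"?>
<book>
  <title>Example</title>
</book>]]></programlisting>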

<para xml:id="p6">What are the two most common things that many (not all!) users want
to do with XML documents? Validate them and transform them.</para>

<para xml:id="p7">Well, if the document tells you how to validate it and how to style it,
then isn't one possible answer to the default processing question simple:
validate like I say and style like I say?
If it is, shouldn't we be able to express that processing using an
<wikipedia page="XML_pipeline">XProc</wikipedia> pipeline?</para>

<para xml:id="p8">Of course we should.
And we can: <link xlink:href="examples/default.xpl">default.xpl</link>.
</para>

<para xml:id="p9">I'm happy and relieved to find that we <emphasis>can</emphasis>
express that processing in XProc. I'm a little, but only a little,
surprised to see how complex that pipeline is. Weighing in at 320+
lines, it works for only a few narrow cases. I still need to integrate
support for RELAX NG compact syntax schemas and NVDL processing, at
least. I may also want to support a few more stylesheet options; I'm
not sure.</para>
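
<para>A much-reduced sketch of the validate-then-style idea might look
like the following. This is not the linked
<filename>default.xpl</filename>: it assumes the schema and stylesheet
have already been located and are supplied on hypothetical
<code>schema</code> and <code>stylesheet</code> ports, rather than being
discovered from the processing instructions in the document. It also
assumes a RELAX NG schema in XML syntax and an XSLT stylesheet that
needs no parameters.</para>

<programlisting language="xml"><![CDATA[<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="1.0" name="main">
  <!-- The document to process; declared primary because there are several inputs -->
  <p:input port="source" primary="true"/>
  <!-- Hypothetical ports: the caller supplies the schema and the stylesheet -->
  <p:input port="schema"/>
  <p:input port="stylesheet"/>
  <p:output port="result"/>

  <!-- "Validate like I say": by default the step fails if the document is invalid -->
  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:pipe step="main" port="schema"/>
    </p:input>
  </p:validate-with-relax-ng>

  <!-- "Style like I say": transform the validated document -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:pipe step="main" port="stylesheet"/>
    </p:input>
    <!-- No stylesheet parameters in this sketch -->
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>
</p:declare-step>]]></programlisting>

<para>Taking the schema and stylesheet on ports keeps the sketch short.
What it leaves out is the harder part: reading the processing
instructions, choosing among the schema languages and stylesheet types
they name, and dispatching to the matching step.</para>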

</essay>

