<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" version="lillet" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#">
<info>
    
    
    
    
    
    
    
    
    
<title>Canonical XML and xml:id</title><biblioid class="uri">http://norman.walsh.name/2005/09/14/xmlid</biblioid>
<volumenum>8</volumenum>
<issuenum>119</issuenum>
<pubdate>2005-09-14T16:56:35-04:00</pubdate>
<date>$Date: 2005-09-14 19:24:32 -0400 (Wed, 14 Sep 2005) $</date>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2005</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>It's not really news anymore, but xml:id is now a Recommendation.
Alas, it's just a little too early to declare victory. The job won't
be completely finished until we address XML Canonicalization.
</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#W3C"/>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#XML"/>
</info>

<epigraph>
<para xml:id="p2">A program is only complete when its last user is dead.</para>
</epigraph>

<para xml:id="p1">It's not really
<link xlink:href="http://www.w3.org/News/2005#item124">news</link>
anymore, I suppose, but
<link xlink:href="http://www.w3.org/TR/xml-id/">xml:id Version 1.0</link>
is now a
<link xlink:href="http://www.w3.org/2004/02/Process-20040205/tr.html#RecsW3C">Recommendation</link>.
Alas, it's just a little too early to declare victory. The
job won't be completely finished until we address the problems
revealed in
<link xlink:href="http://www.w3.org/TR/xml-c14n">Canonical XML</link>
(C14N).</para>

<para xml:id="p3">In case you aren't an XML hack, or you happened not to have noticed
<link xlink:href="http://lists.w3.org/Archives/Public/public-xml-id/2005Jan/0037.html">the discussion</link>,
the problem in brief is this:
C14N specifies that when you canonicalize a document subset,
“attributes in the <literal>xml</literal> namespace,
such as <literal>xml:lang</literal> and <literal>xml:space</literal>”
are inherited. That means if you start with a document like this one:</para>

<programlisting>&lt;doc&gt;
  &lt;wrapper xml:space="preserve"&gt;
    Some content.
    &lt;nested&gt;More content&lt;/nested&gt;
  &lt;/wrapper&gt;
&lt;/doc&gt;</programlisting>

<para xml:id="p4">and apply C14N to the element named “nested”, its canonical form
is:</para>

<programlisting>&lt;nested xml:space="preserve"&gt;More content&lt;/nested&gt;</programlisting>

<para xml:id="p5">That works fine for <tag class="attribute">xml:lang</tag> and
<tag class="attribute">xml:space</tag>, which have inheritable semantics.
It doesn't work at all for
<tag class="attribute">xml:id</tag>: an ID is associated with one particular
element; it does not inherit.</para>
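
<para>To make the difference concrete, here is a minimal Python sketch of
that inheritance rule. It is not a canonicalizer: the hypothetical function
<function>inherited_xml_attrs</function> only imitates the way C14N gathers
attributes in the <literal>xml</literal> namespace from the ancestors of the
element being canonicalized, and the <tag class="attribute">xml:id</tag> on
the wrapper is an assumption added for illustration.</para>

<programlisting language="python"><![CDATA[import xml.etree.ElementTree as ET

# ElementTree exposes xml:space, xml:id, etc. under this namespace prefix.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def inherited_xml_attrs(root, target):
    """Collect xml:* attributes for `target` from itself and its ancestors.

    Hypothetical helper imitating the C14N rule: the nearest occurrence of
    each attribute wins, and no attribute in the namespace is exempt.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = {}
    node = target
    while node is not None:
        for name, value in node.attrib.items():
            if name.startswith(XML_NS):
                # setdefault keeps the value found nearest to the target.
                result.setdefault(name, value)
        node = parent_of.get(node)
    return result


# The essay's sample document, with an xml:id added to the wrapper
# (an assumption made for this illustration).
doc = ET.fromstring(
    '<doc><wrapper xml:id="w1" xml:space="preserve">'
    'Some content.<nested>More content</nested></wrapper></doc>'
)
nested = doc.find(".//nested")
attrs = inherited_xml_attrs(doc, nested)

for name, value in attrs.items():
    print(name.replace(XML_NS, "xml:"), "=", value)
]]></programlisting>

<para>With this input, <tag class="element">nested</tag> ends up carrying
both the wrapper's <tag class="attribute">xml:space</tag>, which is harmless,
and the wrapper's <tag class="attribute">xml:id</tag>, an ID that belongs to
a different element.</para>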

<para xml:id="p6">Anyway, whether or not you think <tag class="attribute">xml:id</tag>
was a bad idea is water under the bridge at this point.
What's critical, for me anyway, is the fact that C14N was
<emphasis>already broken</emphasis> before <tag class="attribute">xml:id</tag>
came along. The <tag class="attribute">xml:base</tag> attribute doesn't
inherit in the same way as <tag class="attribute">xml:lang</tag> and
<tag class="attribute">xml:space</tag> either, and
<link xlink:href="http://www.w3.org/TR/xmlbase/">XML Base</link> was already a
Recommendation when we started working on
<tag class="attribute">xml:id</tag>.</para>

<para xml:id="p7">The open question at this point is, how should C14N be fixed? For 
<tag class="attribute">xml:id</tag> and any other attributes that may
be added to the <code>xml</code> namespace in the future, the answer is
clear: C14N must not treat them as inheritable. What's less clear to me
is what C14N should do with <tag class="attribute">xml:lang</tag>,
<tag class="attribute">xml:space</tag>, and
<tag class="attribute">xml:base</tag>. I think it breaks down into two
cases, each with two possibilities:</para>

<orderedlist>
<listitem>
<para xml:id="p8">How should <tag class="attribute">xml:lang</tag> and
<tag class="attribute">xml:space</tag> be handled?</para>
<orderedlist>
<listitem>
<para xml:id="p9">They should not be inherited.</para>
</listitem>
<listitem>
<para xml:id="p10">They should be inherited just as they are now.</para>
</listitem>
</orderedlist>
</listitem>
<listitem>
<para xml:id="p11">How should <tag class="attribute">xml:base</tag> be handled?</para>
<orderedlist>
<listitem>
<para xml:id="p12">It should not be inherited.</para>
</listitem>
<listitem>
<para xml:id="p13">It should be subjected to “fixup” and then inherited. By fixup, I mean
that the correct absolute value should be used, rather than the literal
value of the attribute, which is what the current specification calls for.</para>
</listitem>
</orderedlist>
</listitem>
</orderedlist>
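
<para>One way to read the “fixup” in the second case is sketched below in
Python. It assumes that fixup means resolving each
<tag class="attribute">xml:base</tag> in the ancestor chain against the base
in effect above it, in the manner XML Base describes. The function name
<function>effective_base</function>, the <literal>example.com</literal> URIs,
and the sample document are all hypothetical.</para>

<programlisting language="python"><![CDATA[from urllib.parse import urljoin
import xml.etree.ElementTree as ET

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def effective_base(root, target, document_uri):
    """Return the absolute base URI in effect at `target`.

    Hypothetical helper: walks from the document element down to `target`,
    resolving every xml:base against the base accumulated so far.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # Collect target and its ancestors, innermost first.
    chain = []
    node = target
    while node is not None:
        chain.append(node)
        node = parent_of.get(node)

    # Resolve outermost first, starting from the document's own URI.
    base = document_uri
    for element in reversed(chain):
        value = element.get(XML_BASE)
        if value is not None:
            base = urljoin(base, value)
    return base


doc = ET.fromstring(
    '<doc xml:base="http://example.com/a/">'
    '<wrapper xml:base="b/"><nested>More content</nested></wrapper>'
    '</doc>'
)
nested = doc.find(".//nested")
base = effective_base(doc, nested, "http://example.com/doc.xml")
print(base)
]]></programlisting>

<para>Copying the nearest literal value would put
<literal>xml:base="b/"</literal> on <tag class="element">nested</tag>, which
resolves differently once the ancestors are gone; the fixed-up value keeps
the meaning the attribute had in the original document.</para>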

<para xml:id="p14">We talked about this a little bit on the
<link xlink:href="http://www.w3.org/XML/Core/">XML Core WG</link> telcon
today and I think
<personname>
      <firstname>Daniel</firstname>
<surname role="suppress">Veillard</surname>
    </personname> convinced
me that the right answers are “1b” and “2b”; his argument being the
principle of least surprise.</para>

<para xml:id="p15">What do you think the right answers are?</para>

</essay>

