<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet href="/style/browser.xsl" type="text/xsl"?>
<essay xmlns="http://docbook.org/ns/docbook"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
       xmlns:dc='http://purl.org/dc/elements/1.1/'
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:gal='http://norman.walsh.name/rdf/gallery#'
       version='pto'>
<info>
<title>Explaining identifiers in XML</title>
<volumenum>8</volumenum>
<issuenum>18</issuenum>
<pubdate>2005-02-11T09:46:01-05:00</pubdate>
<date>$Date: 2005-09-11 10:27:02 -0400 (Sun, 11 Sep 2005) $</date>
<author><personname>
<firstname>Norman</firstname><surname>Walsh</surname>
</personname></author>
<copyright><year>2005</year><holder>Norman Walsh</holder></copyright>
<abstract>
<para>Scenes from a possible future in which Norm tries to buy
groceries and explain XML at the same time.</para>
</abstract>
</info>

<para xml:id='p0'>[Update: <personname><firstname>David</firstname>
<surname>Megginson</surname></personname> pointed out that I may
not have given enough context for this essay. The
<link xlink:href="http://www.w3.org/TR/xml-id/">xml:id
Candidate Recommendation</link> draft has just been published, but
a storm of controversy has erupted. It started in a couple of 
threads on the public xml:id comments list
<link xlink:href="http://lists.w3.org/Archives/Public/public-xml-id/2005Feb/thread.html">in February</link>
and eventually
<link xlink:href="http://lists.w3.org/Archives/Public/www-tag/2005Feb/0017.html">spilled over</link>
onto the
<link xlink:href="http://www.w3.org/2001/tag/">TAG</link>’s discussion list
where a
<link xlink:href="http://lists.w3.org/Archives/Public/www-tag/2005Feb/0015.html">formal request</link>
has been made for the TAG to take up the issue. This morning,
<personname><firstname>Elliotte Rusty</firstname>
<surname>Harold</surname></personname>
<link xlink:href="http://www.cafeconleche.org/#news2005February9">noted</link>
the CR draft's publication and voiced his support for
“<tag class="attribute">xmlid</tag>” as a way to resolve the
controversy (he's not alone, though I don't yet detect majority support
for that resolution). In any event, below, I ponder how that
solution might play out…]</para>

<para xml:id='p1'>So I'm standing in the checkout line at the local
supermarket and this guy walks up to me. “Hey,” he says, “you're Norm,
aren't you? We met once before and you were telling me about this
markup stuff. I've been giving it a try and it seems pretty great,
but I have a few questions.
Can you help me?”</para>

<para xml:id='p2'>“I can try. What would you like to know?”</para>

<para xml:id='p2b'>“Well,” he says, “I've been writing this paper and it's
almost entirely in English, but I managed to work the phrase ‘C'est
la vie’ into it. Should I put markup around that to indicate that
it's in French?”</para>

<para xml:id='p3'>“Yeah, that's probably a good idea.”</para>

<para xml:id='p4'>“Ok, how do I do that?”</para>

<para xml:id='p5'>“It's easy, you put an element around it, maybe
‘phrase’ or ‘span’ or whatever seems appropriate in your vocabulary,
and then you set the XML colon lang attribute to ‘fr’ on that
element.”</para>
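
<para>[A minimal sketch of what that might look like, assuming a
hypothetical ‘phrase’ element inside a hypothetical ‘para’. The element
names are only illustrative; the attribute is the one defined by the XML
specification:]</para>

<programlisting><![CDATA[<para>As they say, <phrase xml:lang='fr'>C'est la vie</phrase>.</para>]]></programlisting>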

<para xml:id='p6'>“Cool. I've also got some poetry. Most of the
paragraphs in my document get reformatted by the rendering
application, but I want to make sure that the line breaks don't move
in my poetry. Can I do that?”</para>

<para xml:id='p7'>“Well, that really should be part of the semantics of your
‘poem’ element or whatever you're using to mark up your poetry, but
if you want to make sure every processor knows that white space is
significant in your poems, you can set the XML colon space attribute
to ‘preserve’ to do that.”</para>
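
<para>[A sketch of that second approach, assuming a hypothetical ‘poem’
element. The attribute only signals intent; whether a given renderer
actually keeps the line breaks is up to that renderer:]</para>

<programlisting><![CDATA[<poem xml:space='preserve'>Roses are red,
  Violets are blue.</poem>]]></programlisting>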

<para xml:id='p8'>“Use XML colon space. Ok, I think
I can remember that. It looks like the guy in front of you is having trouble
finding his checkbook. Can I ask you a couple more questions?”</para>

<para xml:id='p9'>“Sure.”</para>

<para xml:id='p10'>“Thanks. I've been learning to use XInclude. Why did
it take five years to finish that spec?”</para>

<para xml:id='p11'>“I don't want to talk about it.”</para>

<para xml:id='p12'>“Oh. Sorry. That wasn't really my question anyway,
I was just curious. What I really wanted to say was that I've noticed
that when XInclude merges files together, it puts an XML colon base
attribute on the parts it pulls in. That seems to indicate what the
path to the original document was, is that right?”</para>

<para xml:id='p13'>“Basically, yes.”</para>

<para xml:id='p14'>“Neat. So can I use that myself? One part of my paper
has a whole bunch of pictures and it's really tedious to type the great
big long URI for each one of them. Can I put my own XML colon base
attribute in there and then just
use relative URIs for all the graphics?”</para>

<para xml:id='p15'>“Yep, that ought to work as long as the XML colon base
is on some element that's an ancestor of all the image elements. And as
long as it makes sense for that to be the base URI for <emphasis>all</emphasis>
the relative URIs under it.”</para>
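
<para>[A sketch, assuming hypothetical ‘section’ and ‘image’ elements, a
hypothetical ‘href’ attribute that the application treats as a URI
reference, and a made-up example.org address. With the base set on the
ancestor, each relative reference would resolve against it:]</para>

<programlisting><![CDATA[<section xml:base='http://example.org/papers/2005/figures/'>
  <!-- would resolve to http://example.org/papers/2005/figures/graph1.png -->
  <image href='graph1.png'/>
  <!-- would resolve to http://example.org/papers/2005/figures/graph2.png -->
  <image href='graph2.png'/>
</section>]]></programlisting>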

<para xml:id='p16'>“Sweet. Ok, one last question. I want to put my documents
on the web and I'd like folks to be able to point into them with anchors
like they can with HTML. Can I do that?”</para>

<para xml:id='p17'>“Well, there's a sort of technically correct answer to 
that question and a practically correct answer. The technically correct answer
is ‘no’ because we're still waiting for a couple of specs to get finished.
Right now there's no official fragment identifier syntax for XML documents;
that part after the hash is the fragment identifier, but don't fret too much
about that right now. It seems pretty clear that this is all going to work
itself out and you'll be able to use ‘hash ID’ to point to the element with
that ID in your document. So, yeah, you can. Just put IDs on the things you
want to be able to point to.”</para>

<para xml:id='p18'>“I'm not using a DTD or anything, so I don't really have
any IDs. How do I do that?”</para>

<para xml:id='p19'>“That's what the XML ID attribute is for. Put an XML ID
on each element you want to be able to point to and make sure that it has
a unique value in your document. Oh, and that the value is a ‘name’
in the XML sense, without a colon.”</para>

<para xml:id='p20'>“You said ‘XML ID’ but you meant ‘XML colon ID’,
right?”</para>

<para xml:id='p21'>“No, that one doesn't have a colon.”</para>

<para xml:id='p22'>“What? Why not?”</para>

<para xml:id='p23'>“That's a long story. Basically, there was a bug in another
spec and everyone decided that it was too expensive to fix the bug because
there was so much deployed software that used it. Plus a lot of it was
security-related software and changing that stuff is really hard.”</para>

<para xml:id='p24'>“But that seems really confusing. Wouldn't it have made
more sense to fix the bug and keep everything consistent? Besides, isn't
that like the story <personname><firstname>Michael</firstname>
<surname>Sperberg-McQueen</surname></personname> was telling me the other
day about the ‘creat’ system call in Unix? They figured out just about as
soon as they started using it that they ought to have spelled it correctly,
‘create’, but they decided there was too much legacy software that relied
on it. There were a grand total of six machines in the world running Unix
at the time. There's never going to be less legacy than there is today.
And isn't this a horrible precedent to set?”</para>

<para xml:id='p25'>“What can I say,” I said with a shrug, “that's what the
XML community decided they wanted to do.”</para>

<para xml:id='p26'>“But that's so confusing. I thought I could use any
unqualified names I wanted. I know they suck, but I thought we were
supposed to use namespaces to identify things.”</para>

<para xml:id='p27'>“Well, yeah, but it turns out that the W3C grabbed
all the names and all the prefixes that begin with ‘x’, ‘m’, ‘l’ in any
case, so the name ‘<literal>xmlid</literal>’ was never actually available
for you to use.”</para>

<para xml:id='p28'>“Ok, but this is so confusing. Does that mean I can
use XML lang, XML space, and XML base, without the colons, if I want?”</para>

<para xml:id='p29'>“No, those are reserved too, but they don't mean anything.
So you have to use the colon for those.”</para>

<para xml:id='p30'>“This sucks. All because there was a bug in
another spec? Was it really a bug, or was it maybe not a bug?”</para>

<para xml:id='p31'>“Everyone pretty much seems to agree that it was a bug.
Anyway, wanna hear the best part? Now, if we agree some other
attribute should apply to all of XML, we can put a colon in it
if it ‘inherits’ but not if it doesn't. So you'll just have to remember which
is which. Or maybe we'll never be able to use the XML namespace again, who
knows?”</para>

<para xml:id='p32'>“But wait, XML colon base doesn't really inherit. Why is
that ok?”</para>

<para xml:id='p33'>“Oh, it's not. So the other spec is still potentially broken
in some subtle ways.”</para>

<para xml:id='p34'>“But in that case, you didn't really gain anything by
using XML ID without the colon, did you? I mean, it's all inconsistent and
confusing now, but the software that has the bug <emphasis>still</emphasis> has
a bug and has to be fixed anyway, right?”</para>

<para xml:id='p35'>“As far as I can see, yeah. Maybe I'm wrong though.”</para>

<para xml:id='p36'>“Don't you guys care about users at all?”</para>

<para xml:id='p37'>“I do my best, pal, that's all I can say.”</para>

<para xml:id='p38'>The cashier interrupted us at that point. “Will that
be paper or plastic?” he asked, and we parted company. Last I saw, he
was walking back towards the aisle where they keep the aspirin. Me,
I'm going next door to the liquor store.</para>

</essay>
