<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" version="5.0" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
<info>
    
    
    
    
    
    
    
    
    
    
<title>On identifiers</title><biblioid class="uri">http://norman.walsh.name/2006/09/05/identifiers</biblioid>
<volumenum>9</volumenum>
<issuenum>84</issuenum>
<pubdate>2006-09-05T11:39:32-04:00</pubdate>
<date>$Date: 2006-09-05 15:03:07 -0400 (Tue, 05 Sep 2006) $</date>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2006</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>More thoughts on identifiers. Names, that is, and addresses, of course.</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#TheWeb"/>
</info>

<para xml:id="p1">[This essay is effectively part of a conversation. You're more
likely to find it understandable if you've read
my earlier piece,
<link xlink:href="/2006/07/25/namesAndAddresses">Names and addresses</link>, and
<personname>
      <firstname>Stuart</firstname>
      <surname>Weibel</surname>
    </personname>’s
follow-up essays, starting with
<link xlink:href="http://weibel-lines.typepad.com/weibelines/2006/08/on_identifiers_.html">On Identifiers, Scholarship, and Spitoons</link>. —ed]</para>

<para xml:id="p2"><personname>
      <firstname>Stuart</firstname>
      <surname>Weibel</surname>
    </personname>
<link xlink:href="http://weibel-lines.typepad.com/weibelines/2006/08/on_identifiers_.html">responded
at length</link> to my latest posting about names and addresses. Happily, we seem to
be largely in agreement.
</para>

<para xml:id="p3">The most significant point of disagreement, I think, is over the value of
what Stuart calls
“<link xlink:href="http://weibel-lines.typepad.com/weibelines/2006/08/uncoupling_iden.html">pure identifiers</link>”. That is, identifiers that are
intentionally and explicitly decoupled from any resolution mechanism.</para>

<para xml:id="p4">Part of this disagreement clearly stems from different
understandings about what it means to use an http URI. Using an http
URI does not require deployment of a web server or obligate the user
to provide representations for the identifiers created. It's definitely
desirable and useful to do so, but it isn't required. That's simply a fact.
</para>
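<para>A minimal sketch of that point, assuming a hypothetical
<uri>http://example.org/</uri> identifier and an in-memory list of
statements (neither comes from Stuart's essays or mine): the URI is
compared as a plain string, and nothing is ever fetched.</para>

<programlisting language="python">
# Hypothetical identifier: no server needs to exist at this address.
CONCEPT = "http://example.org/id/concept-42"

# A tiny in-memory store of (subject, property, value) statements.
statements = [
    (CONCEPT, "label", "an example concept"),
    (CONCEPT, "broader", "http://example.org/id/concept-7"),
]

def describe(subject, store):
    """Collect every (property, value) pair recorded for a subject."""
    return [(prop, value) for subj, prop, value in store if subj == subject]

# The lookup is a string comparison; no HTTP request is made.
description = describe(CONCEPT, statements)
</programlisting>

<para>The identifier does its job as a name here whether or not anything
ever answers at that address.</para>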
<para xml:id="p5">Arguments
against http URIs based on the cost or inconvenience of maintaining
web infrastructure to support access to those URIs don't hold water. I
accept that there are some issues of user expectation here, but I don't
find those issues sufficient to warrant the invention or use of “pure
identifiers”.</para>

<para xml:id="p6">In particular, I observe that not deploying a web server for
your http URIs today doesn't preclude you from doing so tomorrow. So
an organization might decide that some of its identifiers had gained
such broad use that there was value in supporting access to
them.</para>

<para xml:id="p7">That general observation aside, Stuart makes
<link xlink:href="http://weibel-lines.typepad.com/weibelines/2006/08/planet_web_is_m.html">some
specific arguments</link>
in favor of pure identifiers:</para>

<para xml:id="p8">His first argument is that they are valuable for things that are
conceptual, in particular, to identify things that are language
independent and have different meanings in different cultural
contexts. He gives, as an example, two books with the same 
<wikipedia page="Dewey_decimal">Dewey Decimal</wikipedia> number, 959.7043:
</para>

<informalfigure>
<literallayout>Vietnamese War, 1961-1975 DDC/22/eng//959.7043
(English language version of DDC 22)

American War, 1961-1975 DDC/22/vie//959.7043
(Vietnamese language version of DDC 22)</literallayout>
</informalfigure>

<para xml:id="p9">and observes that “the distance between the abstractions
rendered in two languages is greater than mere translation.”</para>

<para xml:id="p10">Unfortunately, I think his further argument that “what this
identifier should resolve to is complicated and context dependent”
arises because he's changed the question in midstream. “Which book does
the user want?” isn't the right question. The right question is, what does
“959.7043” identify?</para>

<para xml:id="p11">As near as I can tell, the answer
to that question is entirely independent of language and cultural
context: the Dewey Decimal number 959.7043 identifies the concept of
the <wikipedia page="Vietnam_war">Vietnam War</wikipedia>. For that
concept, I assert that
<uri>http://xmlns.com/wordnet/1.6/Vietnam_war</uri> is an equally good
identifier. Superior, in fact, because if I do a GET on that resource I
find that it's about “a prolonged war (1954-1975) between the
communist armies of North Vietnam who were supported by the Chinese
and the non-communist armies of South Vietnam who were supported by
the United States” whereas I had to rely on Google and heuristics to
determine that that's what 959.7043 is most likely about.</para>

<para xml:id="p12">Stuart next argues for pure identifiers for legacy assets.
He gives ISBN numbers as an example, but I think the <literal>isbn:</literal> scheme is
an historical
accident as much as anything else. Given a legacy identifier, “12345”,
there are a few ways to imagine making a URI out of it. One is
“<uri>legacy-identifier-scheme:12345</uri>”. Another is
“<uri>http://legacy-identifier.org/12345</uri>”. I don't see any advantage
of the former over the latter.</para>
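<para>The two options can be sketched side by side. This is only an
illustration: the helper names <literal>mint</literal> and
<literal>recover</literal> are hypothetical, and the scheme name and
domain are the placeholders used above, not real registrations.</para>

<programlisting language="python">
# Placeholder prefixes taken from the two example URIs in the text.
SCHEME_PREFIX = "legacy-identifier-scheme:"
HTTP_PREFIX = "http://legacy-identifier.org/"

def mint(prefix, legacy_id):
    """Build a URI by prepending a fixed prefix to the legacy identifier."""
    return prefix + legacy_id

def recover(prefix, uri):
    """Return the legacy identifier, or None if the URI lacks the prefix."""
    if uri.startswith(prefix):
        return uri[len(prefix):]
    return None

scheme_uri = mint(SCHEME_PREFIX, "12345")
http_uri = mint(HTTP_PREFIX, "12345")

# Either form round-trips back to the same legacy identifier.
same_id = recover(SCHEME_PREFIX, scheme_uri) == recover(HTTP_PREFIX, http_uri)
</programlisting>

<para>Both forms are minted and unpacked by the same string operations,
so as names they are equivalent; only the http form leaves open the
option of answering a GET later.</para>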

<para xml:id="p13">(With respect to ISBN numbers in particular, I observe that they
identify a non-intuitive resource. That resource is the set of all
books that have ever had that ISBN number. At least, that's the only
interpretation of an ISBN number that makes sense to me.)</para>

<para xml:id="p14">Finally, Stuart argues that there are business cases for
late binding of identifier resolution. Perhaps.
Unfortunately, I don't think I really understand the examples given.</para>

<para xml:id="p15">One thread seems to be about access control, the argument
apparently being that <uri>newscheme:something</uri> is better because
the resolution mechanism for that URI can manage whether or not the
user has authority to access that resource. But authority is an entirely
orthogonal issue. There are several ways to limit
access to <uri>http://example.com/something</uri>.
</para>

<para xml:id="p16">There may be circumstances under which there are compelling
reasons not to use http URIs, but no such circumstances have yet been
convincingly articulated to me.</para>

</essay>

