<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#">
<info>
<title>Infoset Equality</title>
<biblioid class="uri">http://norman.walsh.name/2004/05/19/infoset-equal</biblioid>
<volumenum>7</volumenum>
<issuenum>86</issuenum>
<pubdate>2004-05-19T06:55:00-04:00</pubdate>
<date>$Date: 2005-09-11 10:27:02 -0400 (Sun, 11 Sep 2005) $</date>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2004</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>From the Technical Plenary, a URI that got lost:
a quick “off-the-cuff” definition for XML chunk equality based on the Infoset.
</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#XML"/>
</info>

<para xml:id="p1">At the W3C
<link xlink:href="http://www.w3.org/2003/08/allgroupoverview.html">Technical Plenary</link>
in March, 2004, the
<link xlink:href="http://www.w3.org/XML/Core/">XML Core Working Group</link> 
and the
<link xlink:href="http://www.w3.org/2001/tag/">TAG</link> met to discuss the
“<link xlink:href="http://www.w3.org/2001/tag/issues.html?type=1#xmlChunk-44">XML
chunk</link>” issue.</para>

<para xml:id="p2">Part of that discussion was about what it means for two chunks of XML to
be equal. I banged up a quick “off-the-cuff” definition for equality based on
the <link xlink:href="http://www.w3.org/TR/xml-infoset/">Infoset</link>.</para>

<para xml:id="p3">I wanted to make the definition available during the meeting so
I dropped it into my “scratch space” on this site. That URI must have
made it into some record of the meeting because it turns up in my logs
occasionally. In the spirit of keeping URIs persistent, here is the
definition that I proposed:</para>

<programlisting>1. Document Information Items

Two document information items are equal if their [children]
properties are equal, ignoring processing instructions and comments.

2. Element Information Items

Two element information items are equal if the following properties
are equal:

  - [namespace name]
  - [local name]
  - [children]
  - [attributes]

Children are compared in order, attributes without respect to order.

3. Attribute Information Items

Two attribute information items are equal if the following properties
are equal:

  - [namespace name]
  - [local name]
  - [normalized value]

4. Character Information Items

Two character information items are equal if the following properties
are equal:

  - [character code]

5. Unparsed Entity Information Items

Two unparsed entity information items are equal if the following
properties are equal:

  - [name]
  - [system identifier]
  - [notation name]</programlisting>

<para xml:id="p4">It’s not a complete definition (there are a few more information items
that would have to be considered); it was just written as an
attempt to show that <emphasis>a definition</emphasis> based on the
Infoset <emphasis>could</emphasis> be written. If that seems like a self-evident
statement, well, all I can say is that it is sometimes useful at working group
meetings to say explicitly things that are self-evident.</para>
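
<para>As an illustration only, the element, attribute, and character rules
listed above can be sketched in a few lines of Python using the standard
<literal>xml.etree.ElementTree</literal> module. This is a minimal sketch under
stated assumptions, not part of the original proposal: the function names
<literal>elements_equal</literal> and <literal>documents_equal</literal> are
hypothetical, unparsed entities are not handled, and because the default
ElementTree parser discards comments and processing instructions everywhere,
the sketch ignores them inside elements as well as at the document level.</para>

<programlisting language="python">import xml.etree.ElementTree as ET


def elements_equal(a, b):
    """Compare two elements following the element rule in the definition."""
    # ElementTree stores names as "{namespace name}local name", so a single
    # comparison covers both properties; prefixes play no part.
    if a.tag != b.tag:
        return False
    # attrib is a dict keyed by expanded name, so attribute order is ignored
    # and the values compared are the normalized values from the parser.
    if a.attrib != b.attrib:
        return False
    # Character children that precede the first child element.
    if (a.text or "") != (b.text or ""):
        return False
    # Children are compared in order.
    if len(a) != len(b):
        return False
    for child_a, child_b in zip(a, b):
        if not elements_equal(child_a, child_b):
            return False
        # Character children that follow this child element.
        if (child_a.tail or "") != (child_b.tail or ""):
            return False
    return True


def documents_equal(xml_a, xml_b):
    """Compare two serialized documents by comparing their root elements."""
    return elements_equal(ET.fromstring(xml_a), ET.fromstring(xml_b))</programlisting>

<para>Under this sketch, two documents that differ only in namespace prefixes,
attribute order, or comments compare equal, while a difference in the position
of text relative to a child element does not.</para>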

</essay>

