<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<?xml-stylesheet type='text/xsl' href='/style/atom-comments.xsl'?>
<feed xmlns='http://www.w3.org/2005/Atom'>
<title>norman.walsh.name: Comments on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11'/>
<id>http://norman.walsh.name/2004/09/30/xml11/comments.atom</id>
<updated>2005-01-17T08:30:23Z</updated>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 1 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0001'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0001</id>
<published>2004-10-04T00:45:36Z</published>
<updated>2004-10-04T00:45:36Z</updated>
<author>
  <name>Elliotte Rusty Harold</name>
  <foaf:mbox_sha1sum>fcc9a2c3412d8d046a24619e3aa59dadeec7dc91</foaf:mbox_sha1sum>
  <uri>http://www.cafeconleche.org/</uri>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>
Norm, I'm really shocked at your statement that "More Unicode characters are allowed in text" in XML 1.1. I really thought you knew better than this. I expected the footnote would clarify your point, but it just made it worse. Sadly, there has been a lot of FUD spewed on this issue by people trying to justify XML 1.1. The benefits of XML 1.1 are so minuscule and the costs so high that the only way to justify it is by pretending it solves a problem that doesn't actually exist. 
</p>

<p>
Let's be clear about one thing: XML 1.1 enables <strong>NO</strong> new genuine characters in XML text. The only new characters added are a few C0 controls. Every single human-legible character in Unicode 3.0, 4.0, and any future version is legal in text (element content, attribute values, processing instruction data, and comments) in XML 1.0. There is nothing in XML 1.0 that makes it inadequate for writing Amharic, Burmese, Thaana, Yi, Tengwar, or any other language that can be written with Unicode 3.0, 4.0, or later. 
</p>

<p>
The advances in XML 1.1 are solely about XML names. They have nothing to do with XML text, except for one tiny intersection of DTD validated ID-type attributes, but you don't think DTDs are very relevant so that's not a huge win. No languages are discriminated against by XML text. XML 1.1 might be useful to someone who wants to write their markup (not their text but their markup) in Amharic, Burmese, Mongolian, Cambodian or any of a few other more obscure languages. Anybody who doesn't need to do this has nothing to gain from XML 1.1. And anybody who wants to write a Mongolian web page in XHTML or a Burmese technical manual in DocBook can do it just fine with XML 1.0.
</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 2 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0002'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0002</id>
<published>2004-10-04T10:21:26Z</published>
<updated>2004-10-04T10:21:26Z</updated>
<author>
  <name>David Carlisle</name>
  <foaf:mbox_sha1sum>ac4bbb0ce3a3e02cc386fe410164dc831b49c1ce</foaf:mbox_sha1sum>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>"The rest of the XML 1.1 changes were either unnecessary or feature creep, IMHO, but they&#8217;re harmless."</p>

<p>The trouble is, they are not harmless. Changing the white space rules in what was supposed to be a minor point increase was always going to inflict real pain. In particular, adding characters outside the ASCII range to the white space set means that even if you know your doc is in UTF-8, you can't really use standard 8-bit text processing tools very easily on the file (which, after all, is one of the main points of UTF-8).</p>

<p>If 1.1 had stuck to changing the name char rules to be workable with all future Unicode versions, not just 2.0, it would (or might) have had a better chance of success.</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 3 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0003'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0003</id>
<published>2004-10-04T11:07:28Z</published>
<updated>2004-10-04T11:07:28Z</updated>
<author>
  <name>Norman Walsh</name>
  <foaf:mbox_sha1sum>9f5c771a25733700b2f96af4f8e6f35c9b0ad327</foaf:mbox_sha1sum>
  <uri>http://norman.walsh.name/</uri>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>You may consider me duly chastised, Elliotte. And a little red in the face. I shouldn't have gotten that wrong.

</p><p>You're absolutely right, of course.</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 4 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0004'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0004</id>
<published>2004-10-05T05:52:03Z</published>
<updated>2004-10-05T05:52:03Z</updated>
<author>
  <name>MURATA Makoto</name>
  <foaf:mbox_sha1sum>33e044f6c4939b62abba7f691a979d476988ebf0</foaf:mbox_sha1sum>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>It is not hard to "allow a different suite of additional characters in names" for XML 1.1.  But this behaviour 
cannot be introduced to RELAX NG 1.0 without possibly breaking 
conformant implementations.</p>

<p>It is certainly possible to create RELAX NG 1.1 to 
address XML 1.1.  This is good for I18N and may be 
bad for the promotion of RELAX NG.  How do you feel? </p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 5 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0005'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0005</id>
<published>2004-10-07T08:18:36Z</published>
<updated>2004-10-07T08:18:36Z</updated>
<author>
  <name>Henri Sivonen</name>
  <foaf:mbox_sha1sum>a6bdebebeb696a316f7a4ded51f603d4d687b790</foaf:mbox_sha1sum>
  <uri>http://iki.fi/hsivonen/</uri>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>Allowing non-ASCII in element and attribute names seems politically correct, but are there known cases of people actually using non-ASCII element and attribute names with XML 1.0? All widely-used well-known vocabularies have all-ASCII element/attribute names.</p> 

<p>With many European languages there's the issue of NFC and NFD representations being different XML names. A bug induced by NFC vs. NFD difference in element/attribute names is not a bug I'd like to spend time tracking down.</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 6 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0006'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0006</id>
<published>2004-10-10T16:06:42Z</published>
<updated>2004-10-10T16:06:42Z</updated>
<author>
  <name>MURATA Makoto</name>
  <foaf:mbox_sha1sum>33e044f6c4939b62abba7f691a979d476988ebf0</foaf:mbox_sha1sum>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>By the way, I am very sure that a lot of Japanese XML users 
heavily use Japanese characters for element and attribute 
names.  In fact, an XML project of a Japanese ministry 
(which I am involved in) will very heavily use Japanese 
names and will use almost no ASCII names.</p>

<p>So, I am not saying XML 1.1 is useless.  But, 
for RELAX NG to use XML 1.1, we need a new version of RELAX NG.</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 7 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0007'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0007</id>
<published>2005-01-12T02:09:36Z</published>
<updated>2005-01-12T02:09:36Z</updated>
<author>
  <name>Aaron Winters</name>
  <foaf:mbox_sha1sum>628e51c21c986665556629459863a594876bfe63</foaf:mbox_sha1sum>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>Now, I may be jumping into the discussion a little late, but I'd like someone to clarify something. What the hell is RELAX NG and why should the advancement of XML to version 1.1 have such a marked impact? I understand that if one used the newly supported Unicode characters in the XML names where they are restricted, such as "element type names, attribute names, enumerated attribute values, processing instruction targets, and so on"<i>*</i>, the documents would not be backwards compatible with an XML 1.0 processor. The only thing the RELAX NG web site really told me was that it was a schema language for XML. I can understand why the new features of XML 1.1 would not allow it to validate with the current versions of RELAX NG or XML Schema 1.0; however, as the language evolves I see no reason why the schema languages built around it shouldn't evolve as well.</p>
<p>I can really relate to what MURATA Makoto said. I have worked with developers from Japan on a number of occasions and they are very adamant about using their native characters in XML names and anywhere else they can, for that matter. Localization is becoming a real issue as more and more people move toward electronic documents. I don't see why anyone should be forced to work outside their native language and character set because the people who designed XML 1.0, XML Schema 1.0, and RELAX NG didn't have the foresight to see beyond Unicode 2.0.</p></div></content>
</entry>

<entry xmlns:foaf='http://xmlns.com/foaf/0.1/'>
<title>Comment 8 on /2004/09/30/xml11</title>
<link rel='alternate' type='text/html' href='http://norman.walsh.name/2004/09/30/xml11#comment0008'/>
<id>http://norman.walsh.name/2004/09/30/xml11#comment0008</id>
<published>2005-01-17T08:30:21Z</published>
<updated>2005-01-17T08:30:21Z</updated>
<author>
  <name>Rick Jelliffe</name>
  <foaf:mbox_sha1sum>7d39c25b7742f5377780dc65df58dc2e431a245e</foaf:mbox_sha1sum>
</author>
<content type='xhtml'><div xmlns="http://www.w3.org/1999/xhtml"><p>On Aaron Winters's comments: when XML 1 was designed, it allowed all characters in data (except for special-purpose characters used for driving teletype printers, such as the backspace control character, or characters with obscure Unicode semantics). However, it restricts the characters allowed in names to a sensible set, based on the version of Unicode available at the time.
</p><p>As Unicode is updated, there are several approaches possible for updating XML:</p>
<p>1) leave it alone</p>
<p>2) update the detailed lists of allowed characters </p>
<p>3) move to excluding characters that are unwanted and allowing all the others</p>
<p>4) move to use the Unicode properties for each character to decide which are good</p>
<p>The trouble with 1) is that XML needs to have best practise internationalization: the more central it is to computing, the
more that any limitations increase the burdens on people who are
left out. On the other hand, there is a time for everything: I
would prefer if W3C went onto a timed release strategy, and 
told everyone to expect an updated spec every 5 years and to
expect to change to it. </p>
<p>The trouble with 2) is that then the XML WG has to put out updates
to XML to track Unicode, and they have better things to do than maintaining XML, apparently. 
</p><p>The trouble with 3) is that you lose some fine-grain ability to detect transmission problems. And who is to say that Unicode may not
add an inappropriate character that you have to cater for independently anyway? On the other hand, it lets you work in ranges which may be more
efficient for computation.  3) is pretty much the route taken with XML 1.1.</p>
<p>4) is probably the way I would have liked. It takes decisions out of the hands of the XML Working Group. All that needs to happen is a slight change in understanding that a well-formed name means "well-formed according to the Unicode library of the particular platform we are using", which is certainly good enough now that Unicode 4 is so widely deployed. </p>


<p>On top of this comes the question of whether such a change should be enough to force a version-up of the XML number: this version-up alone is enough to break almost all XML processing software independent of anything else.</p>
<p>The first thing the XML WG (or indeed, the W3C) should do IMHO is to have a workable version number policy for implementers, such as "<i>A version m.n processor must reject any document with an m greater than its own. A version m.n processor may reject any document with m less than its own. For documents with the same m as the processor, a version m.n processor must accept all documents with an n less than or equal to its own, and will accept any document with an n greater than its own unless some other error is found.</i>"  In other words, allow an XML 1.0 processor to read an XML 1.1 document and only barf when something is actually wrong. At the moment, the version number is a bar to well-managed updates of standards.</p></div></content>
</entry>

</feed>
