<feed xmlns="http://www.w3.org/2005/Atom" xmlns:foaf="http://xmlns.com/foaf/0.1/"><title>norman.walsh.name: Comments on /2003/11/13/charent</title><link rel="alternate" type="text/html" href="http://norman.walsh.name/2003/11/13/charent"/><id>http://norman.walsh.name/2003/11/13/charent/comments.atom</id><updated>2012-02-13T05:16:42.592131Z</updated><entry><title>Comment 1 on /2003/11/13/charent</title><link rel="alternate" type="text/html" href="http://norman.walsh.name/2003/11/13/charent#comment0001"/><id>http://norman.walsh.name/2010/09/25/oauth#comment0001</id><published>2003-11-17T17:43:08Z</published><updated>2003-11-17T17:43:08Z</updated><author><name>David Carlisle</name><foaf:mbox_sha1sum>da39a3ee5e6b4b0d3255bfef95601890afd80709</foaf:mbox_sha1sum></author><content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">
    <p>I'm not sure that input support in emacs (or any other editor) really addresses this. I've been able to type the above using a UK keyboard and
its standard iso-accents support since I started using emacs 18 sometime around 1987. That was/is entering latin1 bytes rather than a unicode encoding, but the principle of simple ascii-based keyboarding producing non-ascii characters in the file is hardly new.</p>
<p>I think the main reason that people want "character entities" is the same one that makes XML or TeX "self describing". If I look at your document and see &lt;foo&gt;&amp;pi;&lt;/foo&gt; then I know how to produce that on whatever system I have, but if you have used some funky input mechanism so you type pi and a pi character gets inserted, then even if my system shows that as a pi I may not know how to enter it, and you can't tell me as you don't know my system.</p>
<p>Using elements/attributes addresses this problem, but at some considerable cost in the size of the underlying
dom/infoset. (Early mathml drafts proposed &lt;mchar name="rightarrow"/&gt;
but there was negative comment on what happens to the dom in your browser if every character gets represented by an element node, an attribute node, and a few namespace nodes.)</p>
<p>The problem with dtds is not only that they may not be allowed (soap) but also that they may not be read at all (mozilla) unless you put everything in the internal subset.</p>
<p>&lt;x&gt;&amp;rightarrow;&lt;/x&gt;</p>
<p>is not well formed, but</p>
<p>&lt;!DOCTYPE x SYSTEM "x-rubbish:not-here"&gt;
&lt;x&gt;&amp;rightarrow;&lt;/x&gt;</p>
<p>is well formed (but presumably not valid). If undefined entities were well formed, you would have a chance of passing fragments using them through an xml pipeline, and so long as the end application knew what they were supposed to mean, things would work out. As it is,
you tend to die with a fatal parse error at the start of your pipeline, which is no fun.</p>
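<p>The difference between the two cases can be sketched in a few lines of Python. This is only an illustration, assuming the expat-based parser from the Python standard library, which is non-validating and does not fetch the external subset; the helper name parse_report is made up for the example.</p>
<pre><![CDATA[
```python
from xml.parsers import expat


def parse_report(document):
    """Return (well_formed, skipped_refs) for an XML string.

    Hypothetical helper: it feeds the string to a non-validating expat
    parser and records entity references that expat leaves unexpanded.
    """
    skipped = []

    def on_default(data):
        # expat passes an unexpanded reference to the default handler
        # as the literal text "&name;".
        if data.startswith("&") and data.endswith(";"):
            skipped.append(data[1:-1])

    parser = expat.ParserCreate()
    parser.DefaultHandler = on_default
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        # A well-formedness error is fatal: parsing stops here.
        return False, skipped
    return True, skipped


# No DTD at all: the undeclared reference is a well-formedness error.
no_doctype = "<x>&rightarrow;</x>"

# External subset declared but never read: the reference is skipped.
with_doctype = '<!DOCTYPE x SYSTEM "x-rubbish:not-here">\n<x>&rightarrow;</x>'

print(parse_report(no_doctype))
print(parse_report(with_doctype))
```
]]></pre>
<p>With a stock CPython build I would expect the first call to report a failure and the second to succeed with rightarrow listed as skipped, so a later stage that knows what the name means still has something to work with. Other parsers may behave differently.</p>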
<p>It isn't clear to me if it is too late to change this. I think it could perhaps have gone in xml 1.1 but it didn't and
now it may be better to live with the problem rather than try to fix it,
but that isn't quite the same thing as saying that there isn't a problem :-)</p>
<p>David</p>
  </div></content></entry></feed>

