More Emacs, XML, & Unicode

Volume 6, Issue 92; 03 Oct 2003

More UI hacking for entering Unicode into XML documents in Emacs.

Anyone can do any amount of work provided it isn't the work he is supposed to be doing at the moment.

Robert Benchley

Inspired by some additional discussion of character entities on the emacs-nxml-mode list, I hacked at my xmlchars.el work a bit more and produced XML Unicode.

Aside: Does the Yahoo Groups archive work for anyone? Half the time I get host not found errors, half of the remaining time I get completely empty documents, and at least half of the time that still remains, I get advertisements that I can’t click through. I’m not sure I’ve ever successfully read a message in their archives.

XML Unicode improves on my previous efforts:

  • Added a function to insert characters by Unicode name. Don’t remember the ISO entity name for “triple prime”? No worries, hit C-t u (or whatever binding you added for unicode-character-insert), type “trip<tab>pr<tab><enter>”, and in it goes.

  • Added a similar function for ISO entity names.

  • Added a glyph list. Inserting literal Unicode characters is great, if they display properly. If not, I’d rather see the numeric character reference.

  • If the character occurs in an XML name, then I need the real character even if I can’t see it. For those cases, each of the functions takes a prefix arg. In other words, C-u C-t u.

  • Adapted sgml-input so that it’s sensitive to the glyph list. My new xml-input watches what you type and automatically replaces ISO entity names with appropriate characters.

    In other words, typing &eacute; automatically inserts an “é”, while typing &tprime; inserts &#x2034; because I don’t have a glyph for it in my Emacs setup.

  • The ISO entity names are all table-driven; you can use any mnemonics you like.

  • I added code to construct a real Emacs pull-down menu (in addition to or instead of the pop-up menu) for any special characters that you’d like to access that way.
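
The glyph-list behaviour described in the list above can be sketched roughly as follows. This is a Python illustration of the logic only, not the Emacs Lisp in xmlunicode.el; the names ENTITIES, GLYPHS, render, insert_by_name, and replace_entities are all hypothetical, and the two-entry tables stand in for whatever tables and glyph list you actually configure.

```python
import re
import unicodedata

# Hypothetical entity table (assumption): mnemonic -> character.
# Being table-driven, any mnemonics could be added here.
ENTITIES = {"eacute": "\u00e9", "tprime": "\u2034"}

# Hypothetical glyph list (assumption): characters this setup can display.
GLYPHS = {"\u00e9"}


def render(char, force_literal=False):
    """Return the literal character if it is in the glyph list (or if
    forced, like a prefix arg); otherwise a numeric character reference."""
    if force_literal or char in GLYPHS:
        return char
    return "&#x{:04X};".format(ord(char))


def insert_by_name(name, force_literal=False):
    """Look a character up by its Unicode name, then render it."""
    return render(unicodedata.lookup(name), force_literal)


def replace_entities(text):
    """Replace known entity references; leave unknown ones untouched."""
    def substitute(match):
        char = ENTITIES.get(match.group(1))
        return match.group(0) if char is None else render(char)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", substitute, text)


print(insert_by_name("TRIPLE PRIME"))
print(replace_entities("caf&eacute; &tprime;"))
```

With these tables, the character named TRIPLE PRIME comes out as a numeric character reference because it is not in the glyph list, while passing force_literal=True (the stand-in for a prefix arg) yields the real character, which is what you would want inside an XML name.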

Share and enjoy.



Correct me if I'm wrong here, but this is dependent on quail, which is included in the leim package. This typically (always?) requires a variety of Emacs with MULE support. This in turn apparently precludes its use with XEmacs on Windows.

If this is correct: Bummer.

—Posted by Alastair Rankine on 10 May 2004 @ 01:48 UTC #


I'm wondering whether it'd be possible to have the actual glyphs appear in the Unichar menu, next to the names.

—Posted by Michael™ Smith on 21 Jun 2004 @ 02:21 UTC #

Michael: I experimented briefly with getting the glyphs rendered in the menu and discovered that the font used in the menu seems not to support Unicode. Ah, the irony.

—Posted by Norman Walsh on 27 Jun 2004 @ 08:58 UTC #

I don't see it mentioned anywhere, but loading xmlunicode.el may give the error "Symbol's function definition is void: caddr". This can be resolved by evaluating "(require 'cl)" before loading the file. Thanks to the helpful folks on #emacs for pointing this out to me.

—Posted by Tadgh on 21 Nov 2004 @ 12:31 UTC #

The script yahoo2mbox interfaces with Yahoo Groups and leaves you with a lovely, lovely mbox-format archive file. You can then use your favorite tools on it. Yahoo does limit the amount you can download at once, but you can run the program repeatedly over several days and build up the archive incrementally. It's a huge win.

—Posted by John Cowan on 06 Jan 2006 @ 07:18 UTC #

Comments on this posting are closed. Thank you, spammers.