More Emacs, XML, & Unicode
More UI hacking for entering Unicode into XML documents in Emacs.
Anyone can do any amount of work provided it isn't the work he is supposed to be doing at the moment.
Inspired by some additional discussion of character entities on the emacs-nxml-mode list, I hacked at my xmlchars.el work a bit more and produced XML Unicode.
(Aside: Does the Yahoo Groups archive work for anyone? Half the time I get host not found errors, half of the remaining time I get completely empty documents, and at least half of the time that still remains, I get advertisements that I can’t click through. I’m not sure I’ve ever successfully read a message in their archives.)
XML Unicode improves on my previous efforts:
- Added a function to insert characters by Unicode name. Don’t remember the ISO entity name for “triple prime”? No worries, hit C-t u (or whatever binding you added for unicode-character-insert), type “trip<tab>pr<tab><enter>” and in it goes.
- Added a similar function for ISO entity names.
- Added a glyph list. Inserting literal Unicode characters is great, if they display properly. If not, I’d rather see the numeric character reference.
- If the character occurs in an XML name, then I need the real character even if I can’t see it. For those cases, each of the functions takes a prefix arg. In other words, C-u C-t u.
- Adapted sgml-input so that it’s sensitive to the glyph list. My new xml-input watches what you type and automatically replaces ISO entity names with appropriate characters. In other words, typing &eacute; automatically inserts an “é” while typing &tprime; inserts &#x2034; because I don’t have a glyph for it in my emacs setup.
- The ISO entity names are all table driven; you can use any mnemonics you like.
- I added code to construct a real Emacs pull-down menu (in addition to or instead of the pop-up menu) for any special characters that you’d like to access that way.
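To make the interplay between the glyph list, the entity table, and the prefix argument concrete, here is a minimal sketch in Emacs Lisp. It is not the code from xmlunicode.el: every name prefixed with my-xml- is hypothetical, and the two-entry table and one-entry glyph list are placeholders chosen only to mirror the “é” and triple prime cases described above.

```elisp
;; Hypothetical sketch; the my-xml- names are illustrative and do not
;; come from xmlunicode.el.

(defvar my-xml-glyph-list '(#xE9)
  "Characters assumed to display properly in this Emacs setup.")

(defvar my-xml-entity-table
  '(("eacute" . #xE9)     ; LATIN SMALL LETTER E WITH ACUTE
    ("tprime" . #x2034))  ; TRIPLE PRIME
  "Alist mapping entity mnemonics to code points; edit to taste.")

(defun my-xml-char-text (char &optional force-literal)
  "Return the text to insert for CHAR.
Return CHAR itself when FORCE-LITERAL is non-nil or CHAR is in
`my-xml-glyph-list'; otherwise return a numeric character reference."
  (if (or force-literal (memq char my-xml-glyph-list))
      (string char)
    (format "&#x%X;" char)))

(defun my-xml-entity-text (name &optional force-literal)
  "Look up NAME in `my-xml-entity-table' and return the text to insert."
  (let ((entry (assoc name my-xml-entity-table)))
    (unless entry
      (error "Unknown entity name: %s" name))
    (my-xml-char-text (cdr entry) force-literal)))

(defun my-xml-insert-by-unicode-name (char &optional force-literal)
  "Read a character by Unicode name, with completion, and insert it.
With a prefix argument, insert the literal character even if it is
not in the glyph list (for example, inside an XML name)."
  (interactive (list (read-char-by-name "Unicode name: ")
                     current-prefix-arg))
  (insert (my-xml-char-text char force-literal)))

(defun my-xml-insert-by-entity-name (name &optional force-literal)
  "Read an entity NAME with completion and insert its character.
With a prefix argument, always insert the literal character."
  (interactive
   (list (completing-read "Entity name: " my-xml-entity-table nil t)
         current-prefix-arg))
  (insert (my-xml-entity-text name force-literal)))
```

With these definitions, M-x my-xml-insert-by-entity-name followed by “tprime” would insert &#x2034; because U+2034 is not in the sample glyph list, while C-u M-x my-xml-insert-by-entity-name would insert the literal character. Keeping the glyph list as a plain variable follows the table-driven approach described above; the built-in char-displayable-p could stand in for a hand-maintained list, at the cost of depending on the current display.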
Share and enjoy.
Comments
Norm,
Correct me if I'm wrong here, but this is dependent on quail, which is included in the leim package. This typically (always?) requires a variety of Emacs with MULE support, which in turn precludes its use with XEmacs on Windows (according to http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=87661ovgfq.fsf@tleepslib.sk.tsukuba.ac.jp).
If this is correct: Bummer.
Norm,
I'm wondering whether it'd be possible to have the actual glyphs appear in the Unichar menu, next to the names.
Michael: I experimented briefly with getting the glyphs rendered in the menu and discovered that the font used in the menu seems not to support Unicode. Ah, the irony.
I don't see it mentioned anywhere, but if you load xmlunicode.el it may give the error "Symbol's function definition is void: caddr". This can be resolved by evaluating "(require 'cl)" before loading it. Thanks to the helpful folks on #emacs for pointing this out to me.
The script yahoo2mbox interfaces with Yahoo Groups and leaves you with a lovely, lovely mbox-format archive file. You can then use your favorite tools on it. Yahoo does limit the amount you can download at once, but you can run the program repeatedly on several days and get incremental increases. It's a huge win.