<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="/style/atom.xsl"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      xml:lang="EN-us">
   <title>Norman.Walsh.name</title>
   <subtitle>
Norm's musings. Make of them what you will.
</subtitle>
   <link rel="alternate" type="text/html" href="http://norman.walsh.name/"/>
   <link rel="self" href="http://norman.walsh.name/atom/whatsnew-fulltext.xml"/>
   <id>http://norman.walsh.name/atom/whatsnew.xml</id>
   <updated>2010-01-25T23:12:04Z</updated>
   <author>
      <name>Norman Walsh</name>
   </author>
   <entry>
      <title>XML FTW!</title>
      <link rel="alternate" type="text/html" href="http://norman.walsh.name/2010/01/25/xml"/>
      <id>http://norman.walsh.name/2010/01/25/xml</id>
      <published>2010-01-25T22:21:37Z</published>
      <updated>2010-01-25T23:12:04Z</updated>
      <dc:subject>Software</dc:subject>
      <category term="xml" scheme="http://technorati.com/tag/"/>
      <dc:subject>XML</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>On the serendipitous joy of finding XML.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2010/01/25/xml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>On the serendipitous joy of finding XML.</p>
            </div>
            <p id="p1">As I've <a href="http://norman.walsh.name/2009/11/01/evernote#p7" shape="rect">said</a> 
               <a href="http://norman.walsh.name/2008/12/08/whichEndIsUp#p6" shape="rect">before</a>, I'm <em>very reluctant</em> to use your application if it's a roach motel for <em>my</em> data. It would not be fair to say that I'll <em>refuse</em> to use your application, it's just a lot less likely.</p>
            <p id="p2">For example, when it came to <a href="http://norman.walsh.name/2010/01/25/gsd" title="GSD!" shape="rect">GSD</a>, I decided that open access wasn't as important as picking an application that I'd actually use. If I let myself get distracted by exploring APIs, there'd be other things not getting done! (Priorities!)</p>
            <p id="p3">Having made my bed, I figured I should see what I was lying in. Today I took a peek at how <a href="http://en.wikipedia.org/wiki/OmniFocus" title="Wikipedia: OmniFocus"
                  shape="rect">OmniFocus</a> stores data. Now, the title of this essay no doubt gives away the punch line, so consider for a moment how this would have been done in the time before XML.</p>
            <p id="p4">
               <em>…go on, have a think, I'll wait…</em>
            </p>
            <p id="p5">In my experience it would probably have been in some proprietary format, almost certainly binary, and utterly opaque. How many tools document(ed) their proprietary data formats? On some platforms, there might have been system services for storing data, some sort of platform-supported database perhaps. Those systems are (often) only marginally better. They produce, instead of an opaque stream of bits, an opaque stream of atomic values. (Don't get me wrong: I've done the reverse-engineering thing on binary formats, and I'd prefer the stream of atomic values, believe you me.)</p>
            <p id="p6">What did I find when I went looking at the OmniFocus data? A directory full of ZIP files. And what's in each ZIP file? Why, <tt class="filename">contents.xml</tt>, of course!</p>
            <p id="p7">Now, it would not be fair to assert that this is perfectly transparent. XML isn't magic. There are clearly some cross-reference relationships in there that will take a little mental gymnastics to decode. But still, I'll trade this:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
...
&lt;task id="pJhk6REkEHC" op="update"&gt;
  &lt;task idref="ggQv63WgCbw"/&gt;
  &lt;added&gt;2010-01-21T16:23:08.983Z&lt;/added&gt;
  &lt;modified&gt;2010-01-24T21:01:41.632Z&lt;/modified&gt;
  &lt;name&gt;Add server-side support for multipart MIME to tests.xproc.org&lt;/name&gt;
  &lt;rank&gt;2113929216&lt;/rank&gt;
  &lt;context idref="jYnYAAVroBT"/&gt;
  &lt;due&gt;2010-01-27T22:00:00.000Z&lt;/due&gt;
  &lt;completed&gt;2010-01-24T21:01:41.622Z&lt;/completed&gt;
  &lt;order&gt;parallel&lt;/order&gt;
&lt;/task&gt;
...
</pre>
            </div>
            <p id="p8">for <em>anything</em> I would <em>ever have gotten</em> at <em>any other point</em> in the history of file formats!</p>
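            <p>Because the data is just XML inside a ZIP file, getting at it takes very little code. The following is a rough Python sketch under stated assumptions, not a description of how OmniFocus itself works: it builds a stand-in archive that mimics the excerpt above and then reads it back. The root element name <tt>omnifocus</tt>, the absence of a namespace, the function names, and the dictionary keys are all hypothetical choices made for illustration; a real <tt class="filename">contents.xml</tt> may well differ, and the <tt>idref</tt> cross-references are simply copied through without being decoded.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_archive():
    """Build an in-memory ZIP holding a contents.xml like the excerpt above.

    The root element name "omnifocus" is a placeholder chosen for this
    sketch; the excerpt does not show the real root or any namespace.
    """
    root = ET.Element("omnifocus")
    task = ET.SubElement(root, "task", {"id": "pJhk6REkEHC", "op": "update"})
    ET.SubElement(task, "task", {"idref": "ggQv63WgCbw"})
    ET.SubElement(task, "name").text = (
        "Add server-side support for multipart MIME to tests.xproc.org"
    )
    ET.SubElement(task, "context", {"idref": "jYnYAAVroBT"})
    ET.SubElement(task, "due").text = "2010-01-27T22:00:00.000Z"
    ET.SubElement(task, "completed").text = "2010-01-24T21:01:41.622Z"

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("contents.xml", ET.tostring(root, encoding="utf-8"))
    buffer.seek(0)
    return buffer


def list_tasks(zip_source):
    """Return one dict per top-level task element found in contents.xml.

    zip_source may be a file path or a file-like object. The keys
    "task_ref" and "context_ref" just carry the raw idref values; what
    they point at is left undecoded.
    """
    with zipfile.ZipFile(zip_source) as archive:
        root = ET.fromstring(archive.read("contents.xml"))

    tasks = []
    for task in root.findall("task"):
        task_ref = task.find("task")
        context_ref = task.find("context")
        tasks.append({
            "id": task.get("id"),
            "name": task.findtext("name"),
            "due": task.findtext("due"),
            "completed": task.findtext("completed"),
            "task_ref": task_ref.get("idref") if task_ref is not None else None,
            "context_ref": (
                context_ref.get("idref") if context_ref is not None else None
            ),
        })
    return tasks


if __name__ == "__main__":
    for task in list_tasks(build_sample_archive()):
        print(task["id"], task["name"])
```
</pre>
            </div>
            <p>With a real file, passing its path to <tt>list_tasks</tt> in place of the in-memory archive should work the same way, provided those assumptions about the root element and namespace hold.</p>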
            <p id="p9">XML has its detractors. It would not be fair to say they are all wrong. But I'll take XML over fair any day!</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>GSD!</title>
      <link rel="alternate" type="text/html" href="http://norman.walsh.name/2010/01/25/gsd"/>
      <id>http://norman.walsh.name/2010/01/25/gsd</id>
      <published>2010-01-25T15:13:26Z</published>
      <updated>2010-01-25T19:37:50Z</updated>
      <category term="osx" scheme="http://technorati.com/tag/"/>
      <dc:subject>OSX</dc:subject>
      <dc:subject>Software</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Our engineering department has a project management philosophy they describe as GSD. I aspire to GSD.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2010/01/25/gsd">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Our engineering department has a project management philosophy they describe as GSD. I aspire to GSD.</p>
            </div>
            <p id="p1">For me, the part of GSD<sup class="footnote">[<a name="p1.1" href="#ftn.p1.1" id="p1.1" shape="rect">1</a>]</sup> that I most often have difficulty with is keeping track of what needs doing. My todo (or want-todo) list is absurdly long. If I feel like castigating myself, I can always find a few things on my list that <em>should</em> have been done by now. It's not that I don't work hard or get a lot done, it's that I don't always prioritize perfectly and sometimes things slip through the cracks.</p>
            <p id="p3">I've been trying to get better at this. Having an online calendar sync'd with my phone keeps me from accidentally missing meetings and phone calls, so it seems to follow that some sort of online system should be able to help me with my todo list.</p>
            <p id="p4">My requirements are pretty simple: I want something that's easy to use and I want something that syncs with my mobile device. An online tool is almost, but not quite, as good as something that I can use offline on my PDA.</p>
            <p id="p5">I don't subscribe to any particular <a href="http://en.wikipedia.org/wiki/Getting%20Things%20Done"
                  title="Wikipedia: Getting Things Done"
                  shape="rect">Getting Things Done</a> methodology. Maybe I'll get there someday, but that's not my immediate goal.</p>
            <p id="p6">I played with <a href="http://www.rememberthemilk.com/" shape="rect">Remember The Milk</a> on-and-off last year. It seemed to work pretty well for simple lists, but I wasn't using it consistently because, I think, it wasn't quite powerful enough.</p>
            <p id="p7">This month, I took a few different systems for a test drive: <a href="http://www.2doapp.com/en/2Do/overview.html" shape="rect">2Do</a>, <a href="http://www.toodledo.com/" shape="rect">Toodledo</a>, <a href="http://culturedcode.com/things/" shape="rect">Things</a>, and <a href="http://www.omnigroup.com/applications/omnifocus/" shape="rect">OmniFocus</a>.</p>
            <p id="p8">Unfortunately, 2Do is only an iPhone app. It appears that there are plans for the next version to support syncing with Toodledo, but that doesn't exist today. Toodledo is a web-based app and is quite nice, probably plenty sufficient for my needs. On the desktop front, both <a href="http://en.wikipedia.org/wiki/Things_%28application%29"
                  title="Wikipedia: Things (application)"
                  shape="rect">Things</a> and <a href="http://en.wikipedia.org/wiki/OmniFocus" title="Wikipedia: OmniFocus"
                  shape="rect">OmniFocus</a> are probably plenty sufficient as well. (There are no doubt other similar applications; those are just the ones I happened to try. I didn't attempt an exhaustive survey; I've GStD!)</p>
            <p id="p9">And the winner is: OmniFocus, by a narrow margin. I like the project/context duality that OmniFocus uses (Toodledo has contexts too, if you turn them on). Mostly it boiled down to the UI: I liked the “feel” of OmniFocus best.</p>
            <p id="p10">This is an app I plan to <em>force myself to use</em>, so I figured I'd best pick one that felt good. It's also the most expensive, by a pretty wide margin, but c'est la vie.</p>
            <p id="p11">Will this really work for me? Time will tell. But so far, so good. And I'm already learning to use it in ways I hadn't planned: maintaining shopping lists and travel checklists. Those aren't the sorts of things for which I would have actively sought out software (sometimes a pencil and a piece of paper really is enough), but it's encouraging to me that I have other reasons to be paying attention to my GSD tool.</p>
            <div class="footnotes">
               <hr width="100" align="left" class="footnotes-divider"/>
               <div class="footnote">
                  <p id="p2">
                     <sup>[<a href="#p1.1" name="ftn.p1.1" id="ftn.p1.1" shape="rect">1</a>]</sup>Getting Shi␈␈␈Stuff Done!</p>
               </div>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>XML Prague 2010</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2010/01/24/xmlprague"/>
      <id>http://norman.walsh.name/2010/01/24/xmlprague</id>
      <published>2010-01-24T18:27:40Z</published>
      <updated>2010-01-24T18:47:00Z</updated>
      <category term="docbook" scheme="http://technorati.com/tag/"/>
      <dc:subject>DocBook</dc:subject>
      <category term="xmlprague" scheme="http://technorati.com/tag/"/>
      <dc:subject>XMLPrague2010</dc:subject>
      <category term="xproc" scheme="http://technorati.com/tag/"/>
      <dc:subject>XProc</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>See you at XML Prague! And a chance to plug some really excellent training.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2010/01/24/xmlprague">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>See you at XML Prague! And a chance to plug some really excellent training.</p>
            </div>
            <p id="p1">I'm delighted that my paper proposal for <a href="http://www.xmlprague.cz/2010/" shape="rect">XML Prague</a> was accepted. I'm a little less delighted that the final paper deadline was last week, but I guess that's encouragement to finish it, eh? (I will, I promise.)</p>
            <p id="p2">I'm going to speak about modular documentation in DocBook, both about the proposal for “assemblies” being developed in the <a href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=docbook"
                  shape="rect">DocBook Technical Committee</a> and about my <a href="http://xproc.org/" shape="rect">XProc</a>-based implementation.</p>
            <p id="p3">I mention these things for only two reasons<sup class="footnote">[<a name="p3.1" href="#ftn.p3.1" id="p3.1" shape="rect">1</a>]</sup>: first, to point out that <a href="http://www.xmlprague.cz/" shape="rect">XML Prague</a> is a conference you need to go to if you're interested in XML technologies. It's that good.</p>
            <p id="p5">Second, to plug <span class="personname">
                  <span class="firstname">G. Ken</span> 
                  <span class="surname">Holman</span>
               </span>’s <a href="http://www.cranesoftwrights.com/index.html#Crane201003CZ" shape="rect">XSLT/XPath 1.0 &amp; 2.0 and XQuery 1.0 Hands-on Training</a> class. If you're looking for someone to teach you XSLT and XQuery, you'd be hard pressed to do better than Ken. And if you're interested in XML, these are technologies you <em>need</em> to know. The maximum class size is an astonishing
<em>six</em>, so it's practically 1:1. Yet another reason to be in Prague in March.</p>
            <p id="p6">See you there!</p>
            <div class="footnotes">
               <hr width="100" align="left" class="footnotes-divider"/>
               <div class="footnote">
                  <p id="p4">
                     <sup>[<a href="#p3.1" name="ftn.p3.1" id="ftn.p3.1" shape="rect">1</a>]</sup>In the interest of full disclosure, I should point out that <a href="http://www.marklogic.com/" shape="rect">Mark Logic</a> is a gold sponsor of the conference and Ken's course is being delivered in partnership with our own <a href="http://www.marklogic.com/services/training.html" shape="rect">training services</a>. I don't think that makes me biased, but I guess I wouldn't, would I?</p>
               </div>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>NYMUG: Cloud deployment options</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2010/01/24/nymug"/>
      <id>http://norman.walsh.name/2010/01/24/nymug</id>
      <published>2010-01-24T18:16:47Z</published>
      <updated>2010-01-24T18:24:50Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Denise Miura, Sr. Director of Product Management, will be speaking about Mark Logic's new offering for the Cloud at our upcoming User Group in New York this Wednesday.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2010/01/24/nymug">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Denise Miura, Sr. Director of Product Management, will be speaking about Mark Logic's new offering for the Cloud at our upcoming User Group in New York this Wednesday.</p>
            </div>
            <p id="p1">The second Mark Logic New York User Group meeting will be held on Wednesday evening, 27 January 2010, hosted by <span class="personname">
                  <span class="firstname">Steve</span> 
                  <span class="surname">Kotrch</span>
               </span> from Simon &amp; Schuster!</p>
            <div class="variablelist">
               <dl>
                  <dt id="R.1.3.1.1">What</dt>
                  <dd>
                     <p id="p2">An opportunity to learn more about <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> and collaborate with other MarkLogic users.</p>
                  </dd>
                  <dt id="R.1.3.2.1">When</dt>
                  <dd>
                     <p id="p3">Wednesday, 27 January 2010, at 6:00pm EST.</p>
                  </dd>
                  <dt id="R.1.3.3.1">Where</dt>
                  <dd>
                     <p id="p4">
                        <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=1230+Avenue+of+the+Americas,+New+York,+NY&amp;sll=37.0625,-95.677068&amp;sspn=54.005807,51.416016&amp;ie=UTF8&amp;hq=&amp;hnear=1230+Avenue+of+the+Americas,+New+York,+10020&amp;z=17"
                           shape="rect">1230 Avenue of the Americas</a>, New York, NY between 48th and 49th streets on Sixth Avenue.</p>
                  </dd>
                  <dt id="R.1.3.4.1">Who</dt>
                  <dd>
                     <p id="p5">Everyone who shows up, of course! The featured speaker this time is Denise Miura, who will explain Mark Logic's new cloud deployment options and describe how they are being used today within the Mark Logic development community. She will demonstrate instantiating a MarkLogic AMI live on Amazon EC2 and talk about best practices for using MarkLogic on the EC2 platform. Finally, she will preview the planned cloud-related product enhancements. This is a great opportunity to provide feedback and influence the cloud computing initiative at Mark Logic.</p>
                  </dd>
                  <dt id="R.1.3.5.1">How</dt>
                  <dd>
                     <p id="p6">If you plan to attend, please join the <a href="http://developer.marklogic.com/mailman/listinfo/nymug" shape="rect">mailing list</a> and send your first and last name to cleo dot saab at marklogic dot com.</p>
                  </dd>
               </dl>
            </div>
            <p id="p8">If you're in New York, please stop by (and please let Cleo know if you plan to stop by).</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>XProc: Back to Last Call</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/12/28/xproc-lc"/>
      <id>http://norman.walsh.name/2009/12/28/xproc-lc</id>
      <published>2009-12-28T14:20:52Z</published>
      <updated>2009-12-28T15:24:41Z</updated>
      <category term="w3c" scheme="http://technorati.com/tag/"/>
      <dc:subject>W3C</dc:subject>
      <category term="xproc" scheme="http://technorati.com/tag/"/>
      <dc:subject>XProc</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Early in January, a new XProc draft will appear. It will be a Last Call Working Draft, a step backwards in the process, or maybe just a half-step. The reason is important though: versioning.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/12/28/xproc-lc">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Early in January, a new XProc draft will appear. It will be a Last Call Working Draft, a step backwards in the process, or maybe just a half-step. The reason is important though: versioning.</p>
            </div>
            <p id="p1">The <a href="http://www.w3.org/XML/Processing/" shape="rect">XProc WG</a> has been making steady progress on <a href="http://www.w3.org/TR/xproc/" shape="rect">XProc: An XML Pipeline Language</a>. We saw the start of wide adoption of <a href="http://en.wikipedia.org/wiki/XML_pipeline"
                  title="Wikipedia: XML pipeline"
                  shape="rect">XProc</a>
               <a href="/knows/what/xproc" shape="rect">
                  <img border="0" alt="[L]" src="/graphics/linkgroup.gif"/>
               </a> in 2009 and I think there's every reason to expect more of the same in 2010.</p>
            <p id="p2">This makes it all the more disappointing to report that we're going back to <a href="http://www.w3.org/2005/10/Process-20051014/tr#last-call" shape="rect">Last Call</a>. On a personal note, as a <a href="http://en.wikipedia.org/wiki/Technical_Architecture_Group"
                  title="Wikipedia: Technical Architecture Group"
                  shape="rect">TAG</a> alum, it's a bit embarrassing to admit why: versioning.</p>
            <p id="p3">We received significant and persuasive criticism of our <a href="http://www.w3.org/TR/2009/CR-xproc-20090528/#versioning-considerations"
                  shape="rect">versioning story</a>. In particular, we were persuaded that requiring a processor to download the declarations for <em>V.next</em> steps in order to process them in a “forwards compatible” manner was too burdensome.</p>
            <p id="p4">In redrafting the story, we added a <tt class="tag-attribute">version</tt> attribute, “compile-time” <tt class="tag-attribute">use-when</tt> functionality <a href="http://www.w3.org/TR/xslt20/#conditional-inclusion" shape="rect">à la XSLT</a>, and extension functions for more precisely identifying the environment in which the pipeline is running.</p>
            <p id="p5">We also took the opportunity to fix decisions that were, in retrospect, mistakes, but not in and of themselves sufficient to motivate us to return to last call: we changed the rules for connections in option, parameter, and variable bindings so that an explicit <tt class="tag-starttag">&lt;p:empty&gt;</tt> isn't required when there's no default readable port, and we changed the rules for parameter input ports so that a binding isn't required when there's at least one explicit <tt class="tag-starttag">&lt;p:with-param&gt;</tt>. Users are already thanking us.</p>
            <p id="p6">I didn't get the document through the publication process before the end-of-year publishing moratorium, but you can read <a href="http://www.w3.org/XML/XProc/docs/WD-xproc-20091222/" shape="rect">the staged draft</a>, if you wish.</p>
            <p id="p7">Will this really be our last Last Call? I sincerely hope so! I also hope that we can move directly from Last Call to Proposed Recommendation without the formality of another Candidate Recommendation phase in between. There's precedent for doing this, and we've got active implementors.</p>
            <p id="p8">I'm sometimes frustrated by how long the process takes, but I console myself with the observation that the language is better for our efforts. Rising usage suggests the early adopters, at least, agree with us.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>David Alfred Walsh</title>
      <link rel="alternate" type="text/html" href="http://norman.walsh.name/2009/12/26/dad"/>
      <id>http://norman.walsh.name/2009/12/26/dad</id>
      <published>2009-12-26T20:38:58Z</published>
      <updated>2009-12-31T14:40:33Z</updated>
      <dc:subject>People</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>9 June 1923 — 26 November 2009.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/12/26/dad">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>9 June 1923 — 26 November 2009.</p>
            </div>
            <div class="epigraph">
               <p id="p2">A man may by custom fortify himself against pain, shame, and suchlike accidents; but as to death, we can experience it but once, and are all apprentices when we come to it.</p>
               <div class="attribution">
                  <span class="mdash">—</span>
                  <span class="personname">
                     <span class="surname">Montaigne</span>
                  </span>
               </div>
            </div>
            <p id="p1">My father was born in 1923 in Babylon, NY.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 331px">
                     <a href="http://www.flickr.com/photos/ndw/4193489909/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2556/4193489909_9968b98dc2.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 141px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>David Walsh, age 10</h3>
               </div>
            </div>
            <p id="p3">He survived the <a href="http://en.wikipedia.org/wiki/Great%20Depression"
                  title="Wikipedia: Great Depression"
                  shape="rect">Great Depression</a>. An enormous tree blew over next to him as he walked home through <a href="http://en.wikipedia.org/wiki/New_England_Hurricane_of_1938"
                  title="Wikipedia: New England Hurricane of 1938"
                  shape="rect">The Great Hurricane of 1938</a>; he walked away without a scratch. The <a href="http://en.wikipedia.org/wiki/Glider_infantry"
                  title="Wikipedia: Glider infantry"
                  shape="rect">glider-borne infantry</a>
took him to the <a href="http://en.wikipedia.org/wiki/China_Burma_India_Theater_of_World_War_II"
                  title="Wikipedia: China Burma India Theater of World War II"
                  shape="rect">China-Burma-India</a> theater in WWII.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 315px">
                     <a href="http://www.flickr.com/photos/ndw/4193497853/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2678/4193497853_475bc45843.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 133px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Bombay c. 1945</h3>
                  <div class="description">
                     <p>Left to right: Charles Kuhn, York PA; Bill Bride, Beacon NY; Eddy Evans, Boston MA; David Walsh, Babylon, NY</p>
                  </div>
               </div>
            </div>
            <p id="p4">Shrapnel chipped a tooth, but he survived that too. After the war he went to Alaska.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4193493117/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2595/4193493117_a49982cafe.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Browerville from the tundra, Mar 1960</h3>
               </div>
            </div>
            <p id="p5">My dad taught in <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=barrow,+ak&amp;sll=64.501111,-165.406389&amp;sspn=32.580803,52.119141&amp;ie=UTF8&amp;hq=&amp;hnear=Barrow,+North+Slope,+Alaska&amp;ll=63.194018,-157.587891&amp;spn=34.054271,52.119141&amp;z=4"
                  shape="rect">Barrow</a> and <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=nome,+ak&amp;sll=64.997939,-155.478516&amp;sspn=32.028433,52.119141&amp;ie=UTF8&amp;hq=&amp;hnear=Nome,+Alaska&amp;ll=64.501111,-165.406389&amp;spn=32.580803,52.119141&amp;z=4"
                  shape="rect">Nome</a>. After putting out a chimney fire, he walked away from a two-story fall off a frozen roof by the lucky stroke of landing feet-first on an oil drum.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4193491117/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2671/4193491117_bf16b02b6b.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Camping in Alaska, c. 1960</h3>
               </div>
            </div>
            <p id="p6">He single-handedly built a one-room cabin on a ¼-acre plot in Fairbanks. (I think I remember once seeing a photo showing the scaffolding he built to get the roof beam in place.) He worked for the <a href="http://en.wikipedia.org/wiki/United_States_Fish_and_Wildlife_Service"
                  title="Wikipedia: United States Fish and Wildlife Service"
                  shape="rect">Fish and Wildlife Service</a> in the summers.</p>
            <p id="p7">He used to practice orienteering by walking into the Alaskan wilderness on a compass bearing and then walking back out again. On one occasion he stumbled across a downed single-engine plane containing the skeleton of its pilot. His boss laughed when my dad offered to lead a team back to the crash, assuring him that he'd never find it again. Dad's boss was right. There is <em>a lot</em> of wilderness out there.</p>
            <p id="p8">On another occasion, my dad shot a caribou only to discover as he prepared to dress it that he'd left his knife back in the jeep. Leaning his rifle against a tree, he walked back and got his knife. An enormous brown bear greeted his return by standing on its hind legs and roaring. The bear got the caribou. And the rifle. And the knife, dropped during a hasty retreat.</p>
            <p id="p9">That wasn't the only caribou that nearly got him killed; on another occasion, one attempted, unsuccessfully, to jump over his jeep. He woke on the side of the road with a caribou hoof protruding into the cab and a nasty gash on his head.</p>
            <p id="p10">I'm lucky to be here.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 333px">
                     <a href="http://www.flickr.com/photos/ndw/4194262042/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2743/4194262042_47c8e66da1.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 142px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Sleeping, June 1970</h3>
               </div>
            </div>
            <p id="p11">When my dad left Alaska, he gave the keys to his cabin to a friend. Those keys passed from friend to friend for more than twenty years. In the eighties, the current occupant persuaded my dad to let him buy the cabin. My father signed the deed and mailed it, asking the occupant to please mail the check back. The check came back a couple of weeks later. And it cleared. Luck of the Irish, or something.</p>
            <p id="p12">From Alaska, my dad traveled to Australia. My mom and dad met in Tasmania. They married in 1961.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 357px">
                     <a href="http://www.flickr.com/photos/ndw/4194255572/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2678/4194255572_933c2385e9.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 154px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Mom and dad, August 1961</h3>
               </div>
            </div>
            <p id="p13">I came along a few years later.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4193489085/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2739/4193489085_feb699e381.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Mom, dad, and I</h3>
                  <div class="description">
                     <p>June 1968</p>
                  </div>
               </div>
            </div>
            <p id="p14">I remember my dad singing sea shanties when I was a small boy.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4194245458/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm5.static.flickr.com/4046/4194245458_5db392377f.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Playing guitar</h3>
                  <div class="description">
                     <p>Time and place unknown</p>
                  </div>
               </div>
            </div>
            <p id="p15">Dad was a naturalist, hunter, trapper, fisherman, scientist, teacher, draftsman, and surveyor. He made beautiful wood carvings. He tied knots. At one time or another, <a href="http://en.wikipedia.org/wiki/List_of_knots"
                  title="Wikipedia: List of knots"
                  shape="rect">all of them</a>. I have his leatherworking tools. The old sewing machine on which he made sleeping bags, tents, parkas, rain slickers, and bicycle panniers got lost somewhere along the way. He built two boats.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 335px">
                     <a href="http://www.flickr.com/photos/ndw/4194244816/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2517/4194244816_68b9aa6e45.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 128px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=52.8406583333333,1.25690833333333&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>David Walsh, Sep 2009</h3>
               </div>
            </div>
            <p id="p16">After 86 years, entropy won. Entropy always wins. My dad taught me that. And the first and third <a href="http://en.wikipedia.org/wiki/Laws_of_thermodynamics"
                  title="Wikipedia: Laws of thermodynamics"
                  shape="rect">laws</a> as well.</p>
            <p id="p17">My father died in 2009 in Norwich, England.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4155535952/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2766/4155535952_b95b328e86.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>'tis himself</h3>
                  <div class="description">
                     <p>David Alfred Walsh 9 June 1923 - 26 November 2009</p>
                  </div>
               </div>
            </div>
            <p id="p18">Goodbye, dad.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>NYMUG Summary</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/11/12/nymug"/>
      <id>http://norman.walsh.name/2009/11/12/nymug</id>
      <published>2009-11-12T16:16:03Z</published>
      <updated>2009-11-12T21:56:04Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Last night, I spoke at the inaugural New York Mark Logic User Group meeting. I think it was a crowd pleaser, or at least, the punchline at the end was.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/11/12/nymug">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Last night, I spoke at the inaugural New York Mark Logic User Group meeting. I think it was a crowd pleaser, or at least, the punchline at the end was.</p>
            </div>
            <p id="p1">The real purpose of a user group is to bring together <em>users</em> (and prospective users). There was much lively discussion after my presentation, which I won't attempt to recapitulate here. The next NYMUG meeting will (most likely) be sometime in January, so please plan to come. I'll post more concrete details here when they're available, and we'll send them to the <a href="http://developer.marklogic.com/mailman/listinfo/nymug" shape="rect">mailing list</a>, of course.</p>
            <p id="p2">My biggest problem in preparing for speaking events is figuring out what to talk about, and then figuring out how to stretch that topic to fit in the time allotted.</p>
            <p id="p3">When I was suggested as the speaker for our inaugural New York meeting, I had to figure out what to talk about. I quickly thought of a topic, but never imagined that it'd be possible. To my delight and surprise, when I shopped the idea around engineering and marketing, there was universal support for the idea. So after I picked my jaw up off the floor, I had to turn my attention to stretching the topic.</p>
            <p id="p4">The topic I had in mind would easily fit on a couple of slides. That would make for a pretty short talk. At a user group meeting, maybe that wouldn't be all bad, but I felt I needed to say a bit more. I stretched things out by talking about three new and cool features that I thought some folks might not have seen or used yet.</p>
            <div class="section">
               <h2 class="runin">Support for https: and URI rewriting </h2>
               <p class="runin" id="p5">
                  <a id="https" name="https" shape="rect"/>Support for <tt class="systemitem">https:</tt> is pretty self-explanatory. Lots of sites have private information (user profiles and passwords, for example) that <em>should not</em> (many would say “must not”) be sent over an insecure communications channel. Furthermore, most users have been trained to look for a secure connection before sending credit card or other financial information over the web.</p>
               <p id="p6">Support for URI rewriting allows application authors to make cleaner interfaces. I'm a fan of clean URI interfaces. Call me picky, but I think it's a lot better to expose the 4th slide in your presentation as <tt class="uri">http://example.com/slides/nymug/4</tt> than as <tt class="uri">http://example.com/slides.xqy?deck=nymug.xml&amp;foil=4&amp;format=html</tt>.</p>
               <p id="p7">Until recently, if you wanted to do this with a web site built on top of MarkLogic Server, you had to put up a proxy of some sort (often an Apache web server) to provide <tt class="systemitem">https:</tt> support and URI rewriting.</p>
               <p id="p8">Starting with MarkLogic Server V4.1, this is no longer necessary. MarkLogic Server now supports https (with your own certificate or a generated, self-signed one) out of the box. The server also supports URI rewriting by allowing you to designate an arbitrary query module to rewrite URIs. Here's the example I used in the presentation:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0-ml";

declare variable $url as xs:string
        := xdmp:get-request-url();

if (matches($url, "^/slides/([^/]+)/([0-9]+)$"))
then
  replace($url,
          "^/slides/([^/]+)/([0-9]+)$",
          "/slides.xqy?deck=$1.xml&amp;amp;foil=$2")
else
  $url
</pre>
               </div>
               <p id="p9">It's an incomplete, toy example taken from the real code I used on the server behind my presentation, but it gives you a flavor for how it works. Your module starts with the URL that was used (and access to the headers and other parts of the request), performs any sort of computation that you'd like, and returns the new URI. The new URI then goes into the server and is processed normally.</p>
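               <p>As a rough sketch of the “access to the headers” point, a rewriter could also consult the request before building the new URI. The variant below is illustrative only and is not the code behind the presentation: it assumes the slide module accepts the <tt class="systemitem">format</tt> parameter shown in the long URI earlier, and the mapping from the <tt class="systemitem">Accept</tt> header to a format name is invented for the example.</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0-ml";

(: Illustrative variant; the Accept-to-format mapping is an assumption. :)
declare variable $url as xs:string
        := xdmp:get-request-url();

declare variable $pattern as xs:string
        := "^/slides/([^/]+)/([0-9]+)$";

(: The header may be absent, so fall back to an empty string. :)
declare variable $accept as xs:string
        := (xdmp:get-request-header("Accept"), "")[1];

(: Clients that ask for text/html get HTML; everything else gets XML. :)
declare variable $format as xs:string
        := if (contains($accept, "text/html"))
           then "html"
           else "xml";

if (matches($url, $pattern))
then
  concat(replace($url, $pattern,
                 "/slides.xqy?deck=$1.xml&amp;amp;foil=$2"),
         "&amp;amp;format=", $format)
else
  $url
</pre>
               </div>
               <p>For a request to <tt class="uri">/slides/nymug/4</tt> from a client whose <tt class="systemitem">Accept</tt> header mentions <tt class="systemitem">text/html</tt>, this module would be expected to return <tt class="uri">/slides.xqy?deck=nymug.xml&amp;foil=4&amp;format=html</tt>; any URI that doesn't match the pattern passes through unchanged.</p>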
               <p id="p10">There may still be good reasons to put a proxy in front of MarkLogic Server (load balancing, etc.), but you no longer need one just to get these two common and simple features.</p>
            </div>
            <div class="section">
               <h2 class="runin">Office Toolkits </h2>
               <p class="runin" id="p11">
                  <a id="toolkits" name="toolkits" shape="rect"/>For reasons that will become clear later on (and not only because I find “office applications” to be an inefficient, frustrating, pointless time-suck), I wanted to give this presentation using ordinary web technologies. (Specifically, HTML+CSS+JavaScript served up by MarkLogic Server.)</p>
               <p id="p12">At the same time, because I was going to talk about new server features, I was required to present a disclaimer:</p>
               <div class="admonition" id="disclaimer">
                  <table border="0" cellspacing="0" cellpadding="4"
                         summary="Presentation of a admonition">
                     <tbody>
                        <tr>
                           <td valign="top" rowspan="1" colspan="1">
                              <span class="admon-graphic">
                                 <img alt="Important" src="/graphics/important.png"/>
                              </span>
                           </td>
                           <td rowspan="1" colspan="1">
                              <div class="admon-title-text">Disclaimer</div>
                              <div class="admon-text">
                                 <p id="p13">All statements describing future releases, estimated release dates and content are plans only, and Mark Logic is under no obligation to develop, include or make available, commercially or otherwise, any specific feature or functionality in any Mark Logic product.</p>
                                 <p id="p14">Information is provided for general understanding and informational purposes only, and is subject to change at the sole discretion of Mark Logic in response to changing customer requirements, market conditions, delivery schedules and other factors.</p>
                              </div>
                           </td>
                        </tr>
                     </tbody>
                  </table>
               </div>
               <p id="p15">(A disclaimer that applies as much to this weblog essay as it did to my presentation last night, I might add.)</p>
               <p id="p16">Trouble is, this slide was sent to me in Powerpoint. To use it, I'd have to switch to Powerpoint for the rest of my presentation (yuck!), copy and paste the text (where's the fun in that?), or find some way to incorporate the slide into my DocBook-based slide deck (now that sounds interesting).</p>
               <p id="p17">Luckily, one of our engineers, <span class="personname">
                     <span class="firstname">Pete</span> 
                     <span class="surname">Aven</span>
                  </span> has already done all the heavy lifting. Pete's the primary developer of Mark Logic's open source toolkits for Word, Excel, and Powerpoint. Each toolkit provides a pipeline for ingesting office documents into MarkLogic Server, an office plugin for using the server from the application, and some XQuery code to work with the files in the server.</p>
               <p id="p18">With that framework in place, it was pretty easy to write a little bit of XQuery code that would extract a slide from a deck and transform it into DocBook. (That's about 20 lines of code, nothing fancy: it extracts paragraphs and bulleted lists from slides, no more, no less.)</p>
               <p id="p19">The source for my final presentation looks like this:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
&lt;slides xmlns="http://docbook.org/ns/docbook"&gt;
  &lt;info&gt;
    &lt;title&gt;Transforming XML Development &lt;?lb?&gt;with MarkLogic&lt;/title&gt;
    …
  &lt;/info&gt;
  &lt;foilgroup&gt;

    &lt;foil&gt;
      &lt;title&gt;NYMUG!&lt;/title&gt;
      …
    &lt;/foil&gt;

    &lt;foil pptx="/pptx/disclaimer" foil="1"/&gt;

    …
  &lt;/foilgroup&gt;
&lt;/slides&gt;
</pre>
               </div>
               <p id="p20">It's a straightforward mixture of hand-authored slides and references to slides from a couple of Powerpoint decks. For the presentation, I edited a Powerpoint slide and ran it back through the process in real time, but that doesn't translate very well to a weblog essay.</p>
            </div>
            <div class="section">
               <h2 class="runin">Presentation </h2>
               <p class="runin" id="p21">
                  <a id="presentation" name="presentation" shape="rect"/>That left just the last part of my talk, “the big reveal” as it were. Having set this all up so that I can author in DocBook, even including Powerpoint slides, serve it up over <tt class="systemitem">https:</tt>, and use nice looking URIs, I still have to go the last mile and get the content into the browser.</p>
               <p id="p22">A couple of obvious ways present themselves. I could serve it up as XML with a stylesheet and let the browser do the work. Could do, but I didn't. I could translate the DocBook markup into (X)HTML using XQuery in MarkLogic Server. Could do, but I didn't.</p>
               <p id="p23">What I really want, but haven't been able to do, is to transform the DocBook in MarkLogic Server with XSLT. And <tt class="tag-starttag">&lt;cue&gt;</tt>drum roll<tt class="tag-endtag">&lt;/cue&gt;</tt> … I can haz! <tt class="tag-starttag">&lt;cue&gt;</tt>cymbal crash<tt class="tag-endtag">&lt;/cue&gt;</tt>
               </p>
               <div class="programlisting">
                  <pre xml:space="preserve">
let $doc := xdmp:document-get(concat($ROOT, $xml))/*
let $expanded := local:expand-powerpoint($doc)
let $map := map:map()
let $put := map:put($map, ...(xs:QName("dbp:foil")), $foil)
let $put := map:put($map, ...(xs:QName("dbp:deck")), $deck)
let $put := map:put($map, ...(xs:QName("dbp:format")),
                    $format)
return
  xdmp:xslt-invoke($xslt, $expanded, $map)
</pre>
               </div>
               <p id="p24">Running an internal build, I can demonstrate support for XSLT 2.0 in MarkLogic Server. (Go back and read that disclaimer again now.)</p>
               <p id="p25">There was some rejoicing at the user group meeting; I hope there's some rejoicing out there over the intertubes too. I'm certainly giddy with delight over the prospects of high-performance XSLT processing in MarkLogic Server V.future.</p>
               <p id="p26">What more can I say? When you've done your best trick, it's time to get off the stage.</p>
               <p id="p27">Thanks again to <span class="personname">
                     <span class="firstname">Steve</span> 
                     <span class="surname">Kotrch</span>
                  </span> and Simon &amp; Schuster for hosting. Hope to see you all in January!</p>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>NYMUG: New York Mark Logic Users Group!</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/11/04/nymug"/>
      <id>http://norman.walsh.name/2009/11/04/nymug</id>
      <published>2009-11-04T19:32:47Z</published>
      <updated>2009-11-05T20:49:36Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>The inaugural meeting of the New York Mark Logic User Group will take place on Wednesday evening, 11 November 2009.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/11/04/nymug">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>The inaugural meeting of the New York Mark Logic User Group will take place on Wednesday evening, 11 November 2009.</p>
            </div>
            <p id="p1">Please come join me at the inaugural meeting of the New York Mark Logic User Group on Wednesday evening, 11 November 2009, hosted by <span class="personname">
                  <span class="firstname">Steve</span> 
                  <span class="surname">Kotrch</span>
               </span> from Simon &amp; Schuster!</p>
            <div class="variablelist">
               <dl>
                  <dt>What</dt>
                  <dd>
                     <p id="p2">An opportunity to learn more about <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> and collaborate with other MarkLogic users. Pizza and soft drinks will be provided and there will be drawings for prizes!</p>
                  </dd>
                  <dt>When</dt>
                  <dd>
                     <p id="p3">Wednesday, 11 November 2009, at 6:30pm EST.</p>
                  </dd>
                  <dt>Where</dt>
                  <dd>
                     <p id="p4">
                        <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=1230+Avenue+of+the+Americas,+New+York,+NY&amp;sll=37.0625,-95.677068&amp;sspn=54.005807,51.416016&amp;ie=UTF8&amp;hq=&amp;hnear=1230+Avenue+of+the+Americas,+New+York,+10020&amp;z=17"
                           shape="rect">1230 Avenue of the Americas</a>, New York, NY between 48th and 49th streets on Sixth Avenue.</p>
                  </dd>
                  <dt>Who</dt>
                  <dd>
                     <p id="p5">Everyone who shows up, of course! For better or worse, I'm the headline speaker, if that's what you meant.</p>
                  </dd>
                  <dt>How</dt>
                  <dd>
                     <p id="p6">If you plan to attend, please join the <a href="http://developer.marklogic.com/mailman/listinfo/nymug" shape="rect">mailing list</a> and send your first and last name to cleo dot saab at marklogic dot com.</p>
                  </dd>
               </dl>
            </div>
            <p id="p7">I'd say more about what I'm going to say if I'd figured out more of what I'm going to say. The title of my presentation is <em class="citetitle">Transforming XML Development with MarkLogic</em>. I think it'll be a fairly interactive show. Without any spoilers, I'll admit to having some DocBook, <a href="http://norman.walsh.name/2009/11/04/docbook50" title="DocBook V5.0"
                  shape="rect">5.0 of course</a>, some Powerpoint, some HTML, and I'm pulling them together in some pretty damn cool ways, if I do say so myself.</p>
            <p id="p8">If you're in New York, please stop by (and please let Cleo know if you plan to stop by).</p>
            <p id="p9">See you there!</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>DocBook V5.0</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/11/04/docbook50"/>
      <id>http://norman.walsh.name/2009/11/04/docbook50</id>
      <published>2009-11-04T18:55:29Z</published>
      <updated>2009-11-04T19:27:13Z</updated>
      <category term="docbook" scheme="http://technorati.com/tag/"/>
      <dc:subject>DocBook</dc:subject>
      <category term="oasis" scheme="http://technorati.com/tag/"/>
      <dc:subject>OASIS</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>DocBook V5.0 is an OASIS Standard!</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/11/04/docbook50">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>DocBook V5.0 is an OASIS Standard!</p>
            </div>
            <p id="p1">I <a href="http://norman.walsh.name/2009/10/19/docbook50"
                  title="Call for Vote - DocBook V5.0"
                  shape="rect">reported</a> a couple of weeks ago that the ballot to make DocBook V5.0 an OASIS Standard was open. I can now report that the ballot is closed; it closed on 31 October.</p>
            <p id="p2">To my great satisfaction, I can also report that we easily cleared the <a href="http://www.oasis-open.org/committees/process-2009-07-30.php#OASISstandard"
                  shape="rect">process hurdles</a>.</p>
            <p id="p3">DocBook V5.0 is <a href="http://lists.oasis-open.org/archives/docbook-tc/200911/msg00001.html"
                  shape="rect">officially</a> an OASIS Standard.</p>
            <p id="p4">I'd like to thank everyone who took the time to help us get here: users, reviewers, members of the Technical Committee (past and present), and the OASIS members who voted for us, of course.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Evernote</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/11/01/evernote"/>
      <id>http://norman.walsh.name/2009/11/01/evernote</id>
      <published>2009-11-01T21:51:13Z</published>
      <updated>2009-11-02T13:58:39Z</updated>
      <category term="evernote" scheme="http://technorati.com/tag/"/>
      <dc:subject>Evernote</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>With a scanner and some Python, I'm an enthusiastic convert to Evernote.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/11/01/evernote">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>With a scanner and some Python, I'm an enthusiastic convert to Evernote.</p>
            </div>
            <p id="p1">I first tried <a href="http://www.evernote.com/" shape="rect">Evernote</a> more than a year ago. It seemed interesting, especially the ability to extract text from uploaded notes, even handwriting. It's not performing traditional OCR, but it does build a list of likely words in each note. That's enough to make search quite useful.</p>
            <p id="p2">Here we see a match for “America” in my barely legible scrawl captured by <a href="http://www.jotnot.com/" shape="rect">JotNot</a>:</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4067864012/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2631/4067864012_e9601362f9_d.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Evernote search</h3>
               </div>
            </div>
            <p id="p3">Even though this all seems pretty cool, I never really used it very much. Part of the problem was that I didn't seem to be putting very much into it. If you've only got six notes, you don't need folders or tagging or clever search to find them.</p>
            <p id="p4">A few weeks ago, <span class="personname">
                  <span class="firstname">Michael</span> 
                  <span class="surname">Mealling</span>
               </span> 
               <a href="http://twitter.com/mmealling/status/4844492645" shape="rect">pointed out</a> that one way to fix the data problem was to scan documents directly into Evernote. <em>That</em> turned out to be an <em>excellent</em> suggestion.</p>
            <p id="p5">For example, I've had a manilla folder of articles torn from magazines in a drawer in my desk for ages. The threshold for getting torn out and shoved in that folder was pretty high because, let's be honest, if I put too many things in that folder, I'll neither remember what's there nor be able to find it if I do remember.</p>
            <p id="p6">Twenty minutes with a scanner and suddenly I've got something that I can tag, categorize, and search painlessly. What's more, it's something I <em>have with me</em> on my laptop or my phone or anywhere I can get to a web browser: it's not in a folder in a drawer thousands of miles away (as my desk is this week, for example).</p>
            <p id="p7">Of course, then I had a <em>much bigger</em> problem. I am very, very reluctant to put my data in your application if I can't get it back again. Repeat after me: no roach motels. I wish Evernote every success in the world (I'm even doing my part, to the tune of $45/year, to keep them around), but it's not hard to imagine a future in which I've come to rely on them as a repository of important information only to discover some Thursday afternoon that they've gone away or <a href="http://www.wired.com/epicenter/2009/01/magnolia-suffer/" shape="rect">lost all my data</a> or otherwise left me high and dry.</p>
            <p id="p8">To their credit and my relief, they have an API. It's not the sort of RESTful Web API I've come to expect from these sorts of services, but that's ok. It's published and documented and claims to support Java, Perl, PHP, Python and Ruby out of the box.</p>
            <p id="p9">A few short hours of hacking and a few hundred lines of Python and I had a backup tool that gets back everything I put into Evernote <em>and</em> an XML representation of the search data. What more could I ask for?</p>
            <p id="p10">(If you're interested in trying <a href="examples/backup-evernote.py" shape="rect">my backup script</a>, you'll need to go through Evernote support to get your own API key and configure a few lines to indicate where you want the files stored, but after that it should work for you too. YMMV, of course.)</p>
            <p id="p11">My script creates an XML representation (natch!) of the details returned by the Evernote APIs. Eventually, I'll probably decide to fix things so it doesn't create a single potentially enormous file, but it doesn't much matter to me at the moment because I turn around and drop this into the <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> instance that I use for my local PIM data.</p>
            <p id="p12">Now, instead of collecting only the <em>very most</em> interesting articles I see and losing them in a folder, I collect anything that interests me even slightly, confident in the knowledge that I'll always be able to find it, and everything else, with ease. Alongside those articles, you'll find scanned business cards, recipes, specifications, notes, photographs of whiteboards and napkins, receipts, and a host of other interesting words and ideas that I've captured as the universe pushed them past me.</p>
            <p id="p13">Pretty sweet.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Call for Vote - DocBook V5.0</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/19/docbook50"/>
      <id>http://norman.walsh.name/2009/10/19/docbook50</id>
      <published>2009-10-19T19:02:35Z</published>
      <updated>2009-10-19T19:15:32Z</updated>
      <category term="docbook" scheme="http://technorati.com/tag/"/>
      <dc:subject>DocBook</dc:subject>
      <category term="oasis" scheme="http://technorati.com/tag/"/>
      <dc:subject>OASIS</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>DocBook V5.0 is ready to become an OASIS Standard!</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/19/docbook50">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>DocBook V5.0 is ready to become an OASIS Standard!</p>
            </div>
            <p id="p1">It's been a long, and sometimes winding, path from DocBook V4.5 to DocBook V5.0, but we're finally approaching the finish line to make it official. The OASIS <a href="http://lists.oasis-open.org/archives/docbook-tc/200910/msg00009.html"
                  shape="rect">Call for Votes</a> went out on Friday!</p>
            <p id="p2">The <a href="http://www.oasis-open.org/apps/org/workgroup/voting/ballot.php?id=1785"
                  shape="rect">ballot</a> is open now through 31 October, 2009.</p>
            <p id="p3">If you belong to an <a href="http://www.oasis-open.org/about/foundational_sponsors.php" shape="rect">OASIS Member</a> organization, please encourage your representative to vote “Yes”. (The full list of eligible voting members is included on the ballot.)</p>
            <p id="p4">DocBook V5.0 is a significant milestone in the evolution of the standard. It is based on RELAX NG and designed with extensibility and flexibility in mind; making DocBook V5.0 an OASIS Standard will provide the solid foundation that we need to continue improving <em>The Source for Documentation</em>™.</p>
            <p id="p5">(We really do have some cool stuff coming down the pipe, so as they say in Chicago, vote early and vote often!)</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Micro-blogging Backup, part the fifth</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/18/mbb05"/>
      <id>http://norman.walsh.name/2009/10/18/mbb05</id>
      <published>2009-10-18T20:14:49Z</published>
      <updated>2009-10-19T15:47:02Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="microblogging" scheme="http://technorati.com/tag/"/>
      <dc:subject>Microblogging</dc:subject>
      <category term="www" scheme="http://technorati.com/tag/"/>
      <dc:subject>TheWeb</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In which we clean things up.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/18/mbb05">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>In which we clean things up.</p>
            </div>
            <p id="p1">If you've been using one of the micro-blogging services for a while, you're probably familiar with a set of conventions that have evolved for adding metadata to your status messages. The ones I'm familiar with are:</p>
            <div class="itemizedlist">
               <ul>
                  <li>
                     <p id="p2">
                        <em>@user</em> to identify another user.</p>
                  </li>
                  <li>
                     <p id="p3">
                        <em>#tag</em> to add a “tag” to your message.</p>
                  </li>
                  <li>
                     <p id="p4">
                        <em>!group</em> to identify a group (at the moment, this seems only to be an <a href="http://identi.ca/" shape="rect">Identi.ca</a> convention).</p>
                  </li>
               </ul>
            </div>
            <p id="p5">In addition to those conventions, the use of “URL shorteners” (<a href="http://tinyurl.com/" shape="rect">http://tinyurl.com/</a>, <a href="http://bit.ly/" shape="rect">http://bit.ly/</a>, <a href="http://is.gd/" shape="rect">http://is.gd/</a>, etc.) is common. And, finally, although it may not be apparent in the client you use, at the API level, individual status messages may indicate that they are “in-reply-to” some other message.</p>
            <p id="p6">So far, our micro-blogging backup system doesn't take advantage of any of this extra information.</p>
            <p id="p7">One of the first things I decided to do was expand shortened URIs. There's no 140-character limit in the database and if you link to something on <a href="http://youtube.com/" shape="rect">http://youtube.com/</a>, the odds that I want to follow that link are within <a href="http://en.wikipedia.org/wiki/Limit_%2528mathematics%2529"
                  title="Wikipedia: Limit (mathematics)"
                  shape="rect">ε</a> of zero. I'd like to know before I click.</p>
            <p id="p8">As long as we're grovelling through the text of each message, it makes sense to expand the other conventions, turning them into the appropriate links.</p>
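            <p>As a rough illustration of that expansion, here is a minimal Java sketch. It is not the XQuery module itself; the class name, the regular expression, and the URL layout under <tt class="literal">http://identi.ca/</tt> are all assumptions made for the example.</p>
            <div class="programlisting">
               <pre xml:space="preserve"><![CDATA[
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: turn @user, #tag, and !group markers into XHTML links. */
class MarkupLinker {
    // Hypothetical base URL; the real service and its URL layout may differ.
    static final String BASE = "http://identi.ca/";

    // A marker must start the text or follow whitespace, so e-mail
    // addresses such as ndw@example.com are left alone.
    private static final Pattern MARKER =
        Pattern.compile("(^|\\s)([@#!])([A-Za-z0-9_]+)");

    static String linkify(String text) {
        Matcher m = MARKER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String sigil = m.group(2);
            String name = m.group(3);
            String path;
            switch (sigil) {
                case "@": path = name; break;             // user page
                case "#": path = "tag/" + name; break;    // tag page
                default:  path = "group/" + name; break;  // "!" group page
            }
            String link = m.group(1) + sigil
                + "<a href=\"" + BASE + path + "\">" + name + "</a>";
            // quoteReplacement keeps "$" and "\" in the link literal
            m.appendReplacement(out, Matcher.quoteReplacement(link));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify("@ndw see #xproc in !xml"));
    }
}
```
]]></pre>
            </div>
            <p>The sketch assumes the message text has already been escaped for XML; doing the escaping afterwards would mangle the links it adds.</p>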
            <p id="p9">It also makes sense to download any messages that an existing message is “in-reply-to”. If those messages are also replies, we'll follow them too until the trail ends. This allows us to display whole conversations, even if they involve participants that we don't follow.</p>
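            <p>Following the trail amounts to a loop with a budget. The Java sketch below shows the idea only: the <tt class="literal">Map</tt> is a stand-in for calls to the service's API, and the class, method, and parameter names are hypothetical.</p>
            <div class="programlisting">
               <pre xml:space="preserve"><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: walk an "in-reply-to" trail until it ends or a budget runs out. */
class ReplyTrail {
    /**
     * @param startId  id of a message we already have
     * @param replyTo  stand-in for the API: message id mapped to the id it
     *                 replies to (no entry when the message is not a reply)
     * @param maxFetch upper bound on lookups, since each real lookup would
     *                 count against the rate limit
     * @return ids of the earlier messages, nearest first
     */
    static List<Long> follow(long startId, Map<Long, Long> replyTo, int maxFetch) {
        List<Long> trail = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        seen.add(startId);
        Long next = replyTo.get(startId);
        // seen.add() returns false on a repeat, which guards against loops
        while (next != null && trail.size() < maxFetch && seen.add(next)) {
            trail.add(next);
            next = replyTo.get(next);
        }
        return trail;
    }

    public static void main(String[] args) {
        Map<Long, Long> replyTo = new HashMap<>();
        replyTo.put(30L, 20L);   // message 30 replies to 20
        replyTo.put(20L, 10L);   // message 20 replies to 10
        System.out.println(follow(30L, replyTo, 50));
    }
}
```
]]></pre>
            </div>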
            <p id="p10">All of this can be accomplished with one new module, <a href="examples/cleanup.xqy" shape="rect">/modules/cleanup.xqy</a>, and a new top-level query to drive it, <a href="examples/clean-tweets.xqy" shape="rect">/clean-tweets.xqy</a>. The interesting bits are in the <tt class="filename">cleanup.xqy</tt> module:</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p11">The actual work is just string manipulation: regular expressions and tokenize, mostly.</p>
                  </li>
                  <li>
                     <p id="p12">Following replies counts against your rate-limit, so we do at most 50 at a time.</p>
                  </li>
                  <li>
                     <p id="p13">To expand URIs, we perform HTTP HEAD requests against the URIs we find in the status messages. In the worst case, some of those may time out, so we do at most 500 at a time. That way we're unlikely to perform a query that takes so long that <em>it</em> times out.</p>
                  </li>
                  <li>
                     <p id="p14">If you look closely, you'll see that in addition to doing the expansions, we also add new elements to the status document: <tt class="tag-starttag">&lt;t:mention&gt;</tt> for mentions of another user, <tt class="tag-starttag">&lt;t:tag&gt;</tt> for tags, <tt class="tag-starttag">&lt;t:group&gt;</tt> for groups, and <tt class="tag-starttag">&lt;t:host&gt;</tt> for the host names of expanded URIs.</p>
                     <p id="p15">We'll come back in some future installment and use those for faceted searches (e.g., “find all the messages by <tt class="literal">@xmlcalabash</tt> that include links to <tt class="literal">tests.xproc.org</tt>”).</p>
                  </li>
               </ol>
            </div>
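            <p>For readers who don't speak XQuery, the URI expansion step might look something like this in Java. It is a sketch under stated assumptions, not a port of <tt class="filename">cleanup.xqy</tt>: the class and method names, the five-second timeouts, and the five-hop limit are all invented for the example, and relative <tt class="literal">Location</tt> headers are not resolved.</p>
            <div class="programlisting">
               <pre xml:space="preserve"><![CDATA[
```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Sketch: expand shortened URIs with HEAD requests, a batch at a time. */
class UriExpander {
    /** One HEAD request; returns the redirect target, or null if none. */
    static String headLocation(String uri) {
        try {
            HttpURLConnection c =
                (HttpURLConnection) URI.create(uri).toURL().openConnection();
            c.setRequestMethod("HEAD");
            c.setInstanceFollowRedirects(false); // we want to see the redirect
            c.setConnectTimeout(5000);           // assumed timeouts, in ms
            c.setReadTimeout(5000);
            int status = c.getResponseCode();
            String location = c.getHeaderField("Location");
            c.disconnect();
            return (status >= 300 && status < 400) ? location : null;
        } catch (IOException | RuntimeException e) {
            return null; // timeouts and bad URIs simply stay unexpanded
        }
    }

    /** Follow redirects reported by lookup until none remain or maxHops is hit. */
    static String expand(String uri, UnaryOperator<String> lookup, int maxHops) {
        String current = uri;
        for (int hop = 0; hop < maxHops; hop++) {
            String next = lookup.apply(current);
            if (next == null || next.equals(current)) break;
            current = next;
        }
        return current;
    }

    /** Expand at most batchSize URIs; the rest wait for a later run. */
    static Map<String, String> expandBatch(List<String> uris,
                                           UnaryOperator<String> lookup,
                                           int batchSize) {
        Map<String, String> expanded = new LinkedHashMap<>();
        for (String uri : uris) {
            if (expanded.size() >= batchSize) break;
            expanded.put(uri, expand(uri, lookup, 5));
        }
        return expanded;
    }

    /** Host name of a URI, the kind of value a host element could hold. */
    static String host(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        // Offline stand-in for the network, with made-up URIs. Pass
        // UriExpander::headLocation instead to make real requests.
        Map<String, String> fake = new HashMap<>();
        fake.put("http://bit.ly/abc", "http://youtube.com/watch?v=1");
        String full = expand("http://bit.ly/abc", fake::get, 5);
        System.out.println(full + " (host: " + host(full) + ")");
    }
}
```
]]></pre>
            </div>
            <p>Passing the lookup in as a function keeps the redirect-following logic testable without touching the network.</p>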
            <p id="p16">Pop the two files mentioned above into your setup (if this is your first encounter with my micro-blogging backup series, make sure you start at <a href="http://norman.walsh.name/2009/08/27/mbb01"
                  title="Micro-blogging Backup, part the first"
                  shape="rect">the beginning</a>).</p>
            <p id="p17">After you've installed the files, running <a href="http://localhost:8330/clean-tweets.xqy" shape="rect">http://localhost:8330/clean-tweets.xqy</a> will start cleaning up your database. If you've downloaded a lot of messages, you'll have to run it several times. If you have a lot of replies, you'll have to spread the runs over a few hours.</p>
            <p id="p18">The fact that you sometimes have to run the cleanup scripts several times is a bit inconvenient. I'm experimenting with some JavaScript to improve that, but I'm still looking for better solutions.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Built my own...</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/15/builtMyOwn"/>
      <id>http://norman.walsh.name/2009/10/15/builtMyOwn</id>
      <published>2009-10-16T01:31:15Z</published>
      <updated>2009-10-16T02:18:12Z</updated>
      <category term="linux" scheme="http://technorati.com/tag/"/>
      <dc:subject>Linux</dc:subject>
      <dc:subject>SelfReference</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Another geekdom rite of passage: building my own box.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/15/builtMyOwn">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Another geekdom rite of passage: building my own box.</p>
            </div>
            <p id="p1">As I noted <a href="http://norman.walsh.name/2009/06/03/buildingMyOwn"
                  title="Building my own…"
                  shape="rect">back in June</a>, I've always wanted to build my own box. Now I have.</p>
            <p id="p2">I don't, in retrospect, claim to have learned what “the right answer” is, but here's the answer I arrived at:</p>
            <div class="itemizedlist">
               <ul>
                  <li>
                     <p id="p3">AMD Phenom II X4 955 Black Edition CPU</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4014549981/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm3.static.flickr.com/2425/4014549981_1949998479.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>AMD Phenom II X4 Processor Black Edition</h3>
                        </div>
                     </div>
                     <p id="p4">I was more than a little amused at the discrepancy between the size of the CPU and the size of the aluminum and copper monstrosity that sits on top of it to keep it cool:</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4014551791/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm4.static.flickr.com/3492/4014551791_6f26f86fc8.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Cooling apparatus for the CPU</h3>
                        </div>
                     </div>
                  </li>
                  <li>
                     <p id="p5">ASUS M4A78T-E AM3 790GX HDMI ATX AMD motherboard</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4014572189/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm4.static.flickr.com/3512/4014572189_0ecfc8aecf.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Motherboard in place</h3>
                        </div>
                     </div>
                  </li>
                  <li>
                     <p id="p6">4 x G.Skill 2GB 240-pin DDR3 1600 memory</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4015324442/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm3.static.flickr.com/2528/4015324442_a11180f982.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Four of 8Gb</h3>
                        </div>
                     </div>
                  </li>
                  <li>
                     <p id="p7">4 x WD Caviar Green 1TB SATA hard drives</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 375px">
                              <a href="http://www.flickr.com/photos/ndw/4014565113/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm4.static.flickr.com/3479/4014565113_856bee4655.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 163px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Gobs of space</h3>
                        </div>
                     </div>
                  </li>
                  <li>
                     <p id="p8">LG Black 8x BD-ROM 16x DVD-ROM 40x CD-ROM SATA LightScribe burner</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4014568137/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm3.static.flickr.com/2478/4014568137_a5bc32ac25.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Optical drive</h3>
                        </div>
                     </div>
                  </li>
                  <li>
                     <p id="p9">APEVIA ATX-AS680W-BL 680W ATX12V/EPS12V SLI power supply</p>
                  </li>
                  <li>
                     <p id="p10">XCLIO Windtunnel ATX Full Tower case</p>
                     <div class="artwork">
                        <div class="flickr-photo">
                           <div class="photo" style="width: 500px">
                              <a href="http://www.flickr.com/photos/ndw/4015325928/" shape="rect">
                                 <img border="0" alt="[Photo]"
                                      src="http://farm3.static.flickr.com/2477/4015325928_3b1656f55d.jpg"/>
                              </a>
                           </div>
                           <div class="link" style="left: 225px;">
                              <a href="http://www.flickr.com/" shape="rect">
                                 <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                              </a>
                           </div>
                           <h3>Case with power supply in place</h3>
                        </div>
                     </div>
                  </li>
               </ul>
            </div>
            <p id="p11">I was pleasantly surprised at how easy it was to assemble. The documentation was clear and complete, the parts all well labelled; it went together without a hitch.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4015338008/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2483/4015338008_691a78251e.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>All systems go!</h3>
               </div>
            </div>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/4014575611/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2475/4014575611_c98205a670.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Glow in the dark</h3>
               </div>
            </div>
            <p id="p12">(For a few more pictures, see <a href="http://www.flickr.com/photos/ndw/sets/72157622593140610/" shape="rect">Custom Build 2009</a> on <a href="http://www.flickr.com/" shape="rect">Flickr</a>.)</p>
            <p id="p13">I decided to run bleeding-edge <a href="http://en.wikipedia.org/wiki/Ubuntu_%28operating_system%29"
                  title="Wikipedia: Ubuntu (operating system)"
                  shape="rect">Ubuntu</a> 9.10 Server beta on it. It's not that I wouldn't like to try <a href="http://en.wikipedia.org/wiki/Solaris_%28operating_system%29"
                  title="Wikipedia: Solaris (operating system)"
                  shape="rect">Solaris</a> and <a href="http://en.wikipedia.org/wiki/ZFS" title="Wikipedia: ZFS" shape="rect">ZFS</a>, but…the days are short and the list of projects is long. Installing Ubuntu
is just that little bit easier. Ubuntu 9.10 installed flawlessly.</p>
            <p id="p14">The ASUS motherboard includes a video controller and assorted other controllers. It claims to do <a href="http://en.wikipedia.org/wiki/RAID" title="Wikipedia: RAID" shape="rect">RAID</a>, but a little investigation reveals that it's <a href="http://en.wikipedia.org/wiki/RAID%23Firmware.2Fdriver-based_RAID"
                  title="Wikipedia: RAID#Firmware.2Fdriver-based RAID"
                  shape="rect">fakeraid</a>, so I abandoned it. Instead, I set up software RAID. The first disk is the boot disk; the remaining three provide 2TB of disk in a RAID5
configuration.</p>
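            <p>The 2TB figure follows from how RAID5 works: one disk's worth of space in the array goes to parity. A tiny Java sketch of the arithmetic, with a hypothetical class and method name and sizes in decimal gigabytes:</p>
            <div class="programlisting">
               <pre xml:space="preserve"><![CDATA[
```java
/** Sketch: usable capacity of a RAID5 array of equally sized disks. */
class Raid5Capacity {
    static long usableGb(int disks, long diskSizeGb) {
        if (disks < 3) {
            throw new IllegalArgumentException("RAID5 needs at least three disks");
        }
        // One disk's worth of capacity is spent on parity.
        return (disks - 1) * diskSizeGb;
    }

    public static void main(String[] args) {
        // Three 1TB (1000GB) drives, as in the build described above.
        System.out.println(usableGb(3, 1000) + "GB usable");
    }
}
```
]]></pre>
            </div>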
            <p id="p15">The case, for all those monstrous fans, is pretty quiet, but not silent. I still think the server might get relocated to the basement, except that I worry about the higher humidity.</p>
            <p id="p16">Anyway, <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> is up-and-running and as soon as I get my <a href="http://norman.walsh.name/2009/08/27/mbb01"
                  title="Micro-blogging Backup, part the first"
                  shape="rect">Micro-blogging backup</a> configuration ported over, I promise I'll write that next installment…</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>XML Calabash 0.9.15</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/05/xmlcalabash"/>
      <id>http://norman.walsh.name/2009/10/05/xmlcalabash</id>
      <published>2009-10-06T01:04:11Z</published>
      <updated>2009-10-06T13:19:03Z</updated>
      <category term="calabash" scheme="http://technorati.com/tag/"/>
      <dc:subject>Calabash</dc:subject>
      <category term="java" scheme="http://technorati.com/tag/"/>
      <dc:subject>Java</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>A new release at last. New features, fewer bugs, and test suite clean again.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/05/xmlcalabash">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>A new release at last. New features, fewer bugs, and test suite clean again.</p>
            </div>
            <p id="p1">Speaking <a href="dominicanrepublic#p1" shape="rect">of work</a> (letting the non sequiturs pile up), when I wasn't on the beach, I was hacking <a href="http://xmlcalabash.com/" shape="rect">XML Calabash</a>. I always update the <a href="http://norman.walsh.name/2008/projects/calabash"
                  title="XML Calabash: an XProc implementation"
                  shape="rect">project status page</a>, but in case that's too subtle, perhaps this essay will catch your attention. (You're reading this, so I guess that should be “caught” without the “perhaps”, but
never mind.)</p>
            <p id="p2">Next in the task queue: supporting Saxon 9.2. I fear this is going to be a bit of a b*tch as I reach down into the guts of Saxon in a few places. Fair warning: I don't plan to attempt to support previous versions of Saxon after I make this switch.</p>
            <div class="section">
               <h2 class="runin">XML Calabash Logging </h2>
               <p class="runin" id="p3">
                  <a id="logging" name="logging" shape="rect"/>One of the changes in 0.9.15 is the switch from my own crufty, home-grown logging infrastructure to the Java core logging facilities. On the plus side: this gives you a lot more control over the logging. On the minus side: you have to deal with the core logging facilities.</p>
               <p id="p4">Near as I can tell, this has to be done with a properties file. Here's what I do: I set the system property <span class="property">java.util.logging.config.file</span> to point to my own logging configuration file, <tt class="filename">/Users/ndw/java/logging.properties</tt>.</p>
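               <p>On the command line that means something like <tt class="literal">java -Djava.util.logging.config.file=/path/to/logging.properties</tt>. The same property can also be set from code and the configuration re-read, as in the following sketch. The class name, the helper methods, and the <tt class="literal">com.example.demo</tt> logger are hypothetical, and the temporary file merely stands in for a real properties file.</p>
               <div class="programlisting">
                  <pre xml:space="preserve"><![CDATA[
```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.LogManager;
import java.util.logging.Logger;

/** Sketch: point java.util.logging at a properties file at run time. */
class LoggingConfigDemo {
    /** Write properties text to a temporary file (a stand-in for your own file). */
    static Path writeTempConfig(String properties) {
        try {
            Path file = Files.createTempFile("logging", ".properties");
            Files.write(file, properties.getBytes(StandardCharsets.ISO_8859_1));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Set the system property, then ask the LogManager to re-read its configuration. */
    static void useConfigFile(Path file) {
        System.setProperty("java.util.logging.config.file", file.toString());
        try {
            LogManager.getLogManager().readConfiguration();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempConfig("com.example.demo.level=FINE\n");
        useConfigFile(file);
        Logger logger = Logger.getLogger("com.example.demo");
        System.out.println(logger.getName() + " level: " + logger.getLevel());
    }
}
```
]]></pre>
               </div>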
               <p id="p5">My logging properties file, constructed mostly through trial and error, is shown below. The important bits are:</p>
               <div class="calloutlist">
                  <table border="0" summary="Callout list">
                     <tr class="callout-row">
                        <td class="callout-bug" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p>
                              <a href="#conslevel" shape="rect">
                                 <img alt="1" border="0" src="/graphics/callouts/1.png"/>
                              </a>  </p>
                        </td>
                        <td class="callout-body" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p id="p6">There are two places to control the amount of detail in the logs. By default, the console handler won't print any messages more detailed than “<tt class="literal">INFO</tt>”. If you want to get more detail, you have to turn this knob appropriately.</p>
                        </td>
                     </tr>
                     <tr class="callout-row">
                        <td class="callout-bug" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p>
                              <a href="#consfmt" shape="rect">
                                 <img alt="2" border="0" src="/graphics/callouts/2.png"/>
                              </a>  </p>
                        </td>
                        <td class="callout-body" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p id="p7">I really dislike the default message format, so I wrote a formatter that produces slightly more compact output. You might like it too.</p>
                        </td>
                     </tr>
                     <tr class="callout-row">
                        <td class="callout-bug" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p>
                              <a href="#calabashlevel" shape="rect">
                                 <img alt="3" border="0" src="/graphics/callouts/3.png"/>
                              </a>  </p>
                        </td>
                        <td class="callout-body" valign="baseline" align="left" rowspan="1" colspan="1">
                           <p id="p8">This is the other place where you can control the amount of detail. “<tt class="literal">ALL</tt>” gives you all the messages. You might prefer “<tt class="literal">SEVERE</tt>”, which gives you only the fatal errors.</p>
                           <p id="p9">You can be selective here. If, for some reason, you want to see the gory detail of the XInclude step, but ignore everything else, you could set:</p>
                           <div class="programlisting">
                              <pre xml:space="preserve">
com.xmlcalabash.level=SEVERE
com.xmlcalabash.XInclude.level=FINEST
</pre>
                           </div>
                        </td>
                     </tr>
                  </table>
               </div>
               <p id="p10">I'll try to provide better documentation for the available names.</p>
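               <p>To see how the two knobs interact, here is a small self-contained Java sketch that sets the levels in code rather than in a properties file. The logger names under <tt class="literal">demo.</tt> and the collecting handler are invented for the example; they are not XML Calabash's.</p>
               <div class="programlisting">
                  <pre xml:space="preserve"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Sketch: a message must pass the logger's level and the handler's level. */
class TwoKnobsDemo {
    /** Remembers what it would have printed, honouring its own level. */
    static class CollectingHandler extends Handler {
        final List<String> seen = new ArrayList<>();
        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) {
                seen.add(record.getLevel() + " " + record.getMessage());
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static List<String> run() {
        Logger parent = Logger.getLogger("demo.calabash");          // hypothetical
        Logger child = Logger.getLogger("demo.calabash.XInclude");  // hypothetical
        CollectingHandler handler = new CollectingHandler();
        parent.setUseParentHandlers(false); // keep the demo off the console
        parent.addHandler(handler);

        parent.setLevel(Level.SEVERE);  // logger knob, general setting
        child.setLevel(Level.FINEST);   // logger knob, selective override
        handler.setLevel(Level.INFO);   // handler knob

        child.finest("first");    // passes the logger, dropped by the handler
        handler.setLevel(Level.ALL);
        child.finest("second");   // passes both
        parent.info("third");     // dropped by the logger
        parent.severe("fourth");  // passes both

        parent.removeHandler(handler);
        return handler.seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```
]]></pre>
               </div>
               <p>Only the second and fourth messages survive: the first is stopped by the handler, the third by the logger.</p>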
               <p id="p11">Here's my current logging properties file:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
############################################################
# Logging Configuration File
#
# You can use a different file by specifying a filename
# with the java.util.logging.config.file system property.  
# For example java -Djava.util.logging.config.file=myfile
############################################################

############################################################
#       Global properties
############################################################

# "handlers" specifies a comma separated list of log Handler 
# classes.  These handlers will be installed during VM startup.
# Note that these classes must be on the system classpath.
# By default we only configure a ConsoleHandler, which will only
# show messages at the INFO and above levels.
handlers=java.util.logging.ConsoleHandler

# To also add the FileHandler, use the following line instead.
#handlers= java.util.logging.FileHandler, java.util.logging.ConsoleHandler

# Default global logging level.
# This specifies which kinds of events are logged across
# all loggers.  For any given facility this global level
# can be overriden by a facility specific level
# Note that the ConsoleHandler also has a separate level
# setting to limit messages printed to the console.
.level = ALL

############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################

# default file output is in user's home directory.
java.util.logging.FileHandler.pattern = %h/java%u.log
java.util.logging.FileHandler.limit = 50000
java.util.logging.FileHandler.count = 1
java.util.logging.FileHandler.formatter = java.util.logging.XMLFormatter

# Limit the messages that are printed on the console to FINE and above.
java.util.logging.ConsoleHandler.level=FINE<a name="conslevel" id="conslevel" shape="rect"/><img alt="1" border="0" src="/graphics/callouts/1.png"/>
java.util.logging.ConsoleHandler.formatter=com.xmlcalabash.util.LogFormatter<a name="consfmt" id="consfmt" shape="rect"/><img alt="2" border="0" src="/graphics/callouts/2.png"/>

############################################################
# Facility specific properties.
# Provides extra control for each logger.
############################################################

com.xmlcalabash.level=ALL<a name="calabashlevel" id="calabashlevel" shape="rect"/><img alt="3" border="0" src="/graphics/callouts/3.png"/>
</pre>
               </div>
               <p id="p12">I'm fully open to suggestions for better approaches.</p>
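               <p>If you'd rather roll your own formatter, the core API makes that easy. The following is a minimal sketch of a one-line formatter, not the actual <tt class="literal">com.xmlcalabash.util.LogFormatter</tt>; the class name and the output layout are assumptions.</p>
               <div class="programlisting">
                  <pre xml:space="preserve"><![CDATA[
```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Sketch: one compact line per record, e.g. "INFO Demo: message". */
class CompactFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
        String name = record.getLoggerName();
        // Keep only the last segment of the logger name.
        String shortName = (name == null)
            ? "" : name.substring(name.lastIndexOf('.') + 1);
        return record.getLevel().getName() + " " + shortName + ": "
            + formatMessage(record) + System.lineSeparator();
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("com.example.Demo"); // hypothetical name
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        handler.setFormatter(new CompactFormatter());
        logger.setUseParentHandlers(false); // avoid the default handler's output
        logger.addHandler(handler);
        logger.info("pipeline finished");
    }
}
```
]]></pre>
               </div>
               <p>To name a formatter like this in a properties file, it would need to be a public class with a public no-argument constructor on the system classpath.</p>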
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>XML Summer School ’09</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/05/xmlss09"/>
      <id>http://norman.walsh.name/2009/10/05/xmlss09</id>
      <published>2009-10-05T23:36:02Z</published>
      <updated>2009-10-06T13:19:03Z</updated>
      <category term="photographs" scheme="http://technorati.com/tag/"/>
      <dc:subject>Photography</dc:subject>
      <category term="travel" scheme="http://technorati.com/tag/"/>
      <dc:subject>Travel</dc:subject>
      <category term="xml" scheme="http://technorati.com/tag/"/>
      <dc:subject>XML</dc:subject>
      <category term="xmlss09" scheme="http://technorati.com/tag/"/>
      <dc:subject>XMLSummerSchool2009</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Open source and web technologies at XML Summer School.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/05/xmlss09">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Open source and web technologies at XML Summer School.</p>
            </div>
            <p id="p1">Speaking of travelling, a bit of a non sequitur if you aren't reading these essays sequentially, I had just returned from <a href="http://xmlsummerschool.com/" shape="rect">XML Summer School</a> the day before my <a href="http://norman.walsh.name/2009/10/05/dominicanrepublic"
                  title="Dominican Republic"
                  shape="rect">Caribbean jaunt</a>.</p>
            <p id="p2">I was delighted to see the return of XML Summer School (September is summer somewhere, I'm sure) and even more delighted to be invited to speak.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3987058416/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2506/3987058416_3dcf5519ee.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 210px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=51.7535027777778,-1.25336666666667&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Radcliffe Camera</h3>
               </div>
            </div>
            <p id="p3">I taught in the “Open Source” and “Web 2.0” tracks, presenting <a href="examples/opensource.pdf" shape="rect">Open Source Application Development</a> and <a href="examples/web20.pdf" shape="rect">Building Dynamic Web Applications with XML</a>, respectively. I think the sessions went well. (After more than ten years of doing it, I have finally (and suddenly, this year) reached the point where it doesn't make me gut-knottingly nervous to give presentations. That has to have improved my stage presence.)</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 334px">
                     <a href="http://www.flickr.com/photos/ndw/3986299811/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2513/3986299811_f42defb144.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 127px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=51.7527305555556,-1.24996111111111&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Oxford sunset</h3>
               </div>
            </div>
            <p id="p4">Beyond the specific courses, what sets Summer School apart is the staggering array of talent lined up to teach. If you've got a question or a problem about something even tangentially related to markup, there's someone at XML Summer School who's thought about it, built it, or deployed it. Probably all three. That we're all good friends, engaging students not just in the classroom but also punting and pub crawling, is just gravy for everyone.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3987049542/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2659/3987049542_5323b9ca45.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 210px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=51.7482111111111,-1.24675555555556&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Punters</h3>
               </div>
            </div>
            <p id="p5">A bargain at twice the price, I promise.</p>
            <p id="p6">After Summer School, I snagged a weekend with my folks.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 334px">
                     <a href="http://www.flickr.com/photos/ndw/3987064130/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2631/3987064130_3dc00cb314.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 142px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Got seeds?</h3>
               </div>
            </div>
            <p id="p7">A relaxing end to a great week, though not one without hard work and long days.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Dominican Republic</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/10/05/dominicanrepublic"/>
      <id>http://norman.walsh.name/2009/10/05/dominicanrepublic</id>
      <published>2009-10-05T22:51:30Z</published>
      <updated>2009-10-06T13:19:03Z</updated>
      <category term="dominicanrepublic" scheme="http://technorati.com/tag/"/>
      <dc:subject>DominicanRepublic</dc:subject>
      <category term="photographs" scheme="http://technorati.com/tag/"/>
      <dc:subject>Photography</dc:subject>
      <category term="travel" scheme="http://technorati.com/tag/"/>
      <dc:subject>Travel</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>A long weekend in the Dominican Republic brings me to country number 15.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/10/05/dominicanrepublic">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>A long weekend in the Dominican Republic brings me to country number 15.</p>
            </div>
            <p id="p1">The Dominican Republic is <a href="http://norman.walsh.name/2008/09/14/antigua" title="Antigua and Barbuda"
                  shape="rect">country number fifteen</a> for me. Deb had to work, and I did <a href="http://norman.walsh.name/2008/projects/calabash"
                  title="XML Calabash: an XProc implementation"
                  shape="rect">a little work</a> too, but mostly I was free to enjoy the sun and the sea and the sand.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3986321471/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2575/3986321471_1c16a5df51.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 210px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=18.7383638888889,-68.4796722222222&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Punta Cana Beach</h3>
               </div>
            </div>
            <p id="p2">We don't live in the Caribbean; remind me why.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3986319793/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3496/3986319793_06d61d6530.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 210px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=18.7365611111111,-68.4788416666667&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Moon Palace Pool</h3>
               </div>
            </div>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 334px">
                     <a href="http://www.flickr.com/photos/ndw/3987077580/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2533/3987077580_b1cac4c6e4.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 127px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a> 
                     <a href="http://maps.google.com/maps?ll=18.7393833333333,-68.4817111111111&amp;z=16&amp;t=k"
                        shape="rect">
                        <img border="0" alt="[Google maps]" src="/graphics/map.png"/>
                     </a>
                  </div>
                  <h3>Footprints</h3>
               </div>
            </div>
            <p id="p3">The Moon Palace resort was lovely (in a big “cruise ship on land” sort of way), and this is not a criticism, just an observation, but clearly there are no <a href="http://en.wikipedia.org/wiki/Occupational_Safety_and_Health_Administration"
                  title="Wikipedia: Occupational Safety and Health Administration"
                  shape="rect">OSHA</a> inspectors in the Dominican Republic.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3987081082/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2537/3987081082_686872a273.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>OSHA? What's that?</h3>
               </div>
            </div>
            <p id="p4">Or maybe there are no personal injury lawyers, I don't know. The tripping hazard in our room was not the only structural feature that made me do a double take.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>SQL to XML</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/09/26/sqltoxml"/>
      <id>http://norman.walsh.name/2009/09/26/sqltoxml</id>
      <published>2009-09-26T12:01:23Z</published>
      <updated>2009-09-26T16:05:00Z</updated>
      <category term="osx" scheme="http://technorati.com/tag/"/>
      <dc:subject>OSX</dc:subject>
      <category term="xml" scheme="http://technorati.com/tag/"/>
      <dc:subject>XML</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>A number of Mac applications store information in SQLite databases. Step one to do something useful with that data is to get it into XML.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/09/26/sqltoxml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>A number of Mac applications store information in SQLite databases. Step one to do something useful with that data is to get it into XML.</p>
            </div>
            <p id="p1">I hate having my data squirreled away in proprietary or quasi-proprietary ways. If I can't get my information back out of an app, I'd rather not use it. When I started using the Mac, I switched to <span class="application">iCal</span> and <span class="application">AddressBook</span>: both can export data in standard text formats which I can easily convert to XML.</p>
            <p id="p2">But exporting the data is a manual process (I could probably automate it with some clever AppleScript or something, though I've never tried). I build a number of views of my address book and calendar data automatically, so manual processes don't fit well into my workflow.</p>
            <p id="p3">It didn't take too long to figure out where <span class="application">iCal</span> stores my appointments or how to pull them together. With that worked out, I turned my attention to <span class="application">AddressBook</span>.</p>
            <p id="p4">Long story short: the address data is saved in a database in <tt class="filename">~/Library/Application Support/AddressBook/AddressBook-v22.abcddb</tt>. After installing the <span class="application">sqlite3</span> application from MacPorts, I was able to extract a text dump. So far so good.</p>
            <p id="p5">Here, for example, is a table definition and the first row of data in that table:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
CREATE TABLE ZABCDRECORD ( Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER,
Z_OPT INTEGER, ZDISPLAYFLAGS INTEGER, ZMODIFICATIONDATEYEAR INTEGER,
ZCREATIONDATEYEAR INTEGER, ZADDRESSBOOKSOURCE INTEGER, ZISALL INTEGER,
ZME INTEGER, Z19_ME INTEGER, ZINFO INTEGER, ZBIRTHDAYYEAR INTEGER,
ZPRIVACYFLAGS INTEGER, ZNOTE INTEGER, ZADDRESSBOOKSOURCE1 INTEGER,
ZCONTACTINDEX INTEGER, ZSOURCEWHERECONTACTISME INTEGER, ZVERSION
INTEGER, ZSYNCCOUNT INTEGER, ZSHARECOUNT INTEGER, ZADDRESSBOOKSOURCE2
INTEGER, ZMODIFICATIONDATE TIMESTAMP, ZCREATIONDATE TIMESTAMP,
ZMODIFICATIONDATEYEARLESS FLOAT, ZCREATIONDATEYEARLESS FLOAT,
ZBIRTHDAY TIMESTAMP, ZBIRTHDAYYEARLESS FLOAT, ZUNIQUEID VARCHAR, ZNAME
VARCHAR, ZNAMENORMALIZED VARCHAR, ZTMPREMOTELOCATION VARCHAR, ZNAME1
VARCHAR, ZREMOTELOCATION VARCHAR, ZSERIALNUMBER VARCHAR, ZSUFFIX
VARCHAR, ZTITLE VARCHAR, ZTMPHOMEPAGE VARCHAR, ZNICKNAME VARCHAR,
ZORGANIZATION VARCHAR, ZMAIDENNAME VARCHAR, ZIDENTITYUNIQUEID VARCHAR,
ZPHONETICFIRSTNAME VARCHAR, ZDEPARTMENT VARCHAR, ZPHONETICLASTNAME
VARCHAR, ZMIDDLENAME VARCHAR, ZFIRSTNAME VARCHAR, ZIMAGEREFERENCE
VARCHAR, ZJOBTITLE VARCHAR, ZPHONETICMIDDLENAME VARCHAR, ZLASTNAME
VARCHAR, ZSORTINGFIRSTNAME VARCHAR, ZSORTINGLASTNAME VARCHAR,
ZCREATEDVERSION VARCHAR, ZLASTDOTMACACCOUNT VARCHAR, ZLASTSAVEDVERSION
VARCHAR, ZSYNCANCHOR VARCHAR, ZSEARCHELEMENTDATA BLOB,
ZMODIFIEDUNIQUEIDSDATA BLOB );
INSERT INTO "ZABCDRECORD" VALUES(1,18,287,NULL,NULL,2008,NULL,1,NULL,NULL,
3,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
239119683.670433,NULL,18281283.670433,NULL,NULL,
'93973926-7EF6-40F0-ADBD-8C7BBFC30FA1:ABSubscriptionRecord',NULL,
NULL,NULL,NULL,'local','B7303AAD-DA79-46E6-BC7D-91DAD82AEFB8',
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);
</pre>
            </div>
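            <p>A dump like the one above doesn't have to come from the <span class="application">sqlite3</span> command line tool. As a minimal sketch, assuming Python's standard <tt class="literal">sqlite3</tt> module stands in for the MacPorts tool (that isn't what I used), something like this produces the same kind of SQL text. The <tt class="literal">dump_sql</tt> name is made up for the example:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
```python
import sqlite3


def dump_sql(conn):
    """Return the database behind conn as a list of SQL statements."""
    # Connection.iterdump() yields the CREATE TABLE and INSERT statements
    # one at a time, much like the .dump command of the sqlite3 shell.
    return list(conn.iterdump())
```
</pre>
            </div>
            <p>For the address book, the connection would come from <tt class="literal">sqlite3.connect()</tt> pointed at the <tt class="filename">.abcddb</tt> file named above; working on a copy of the file seems prudent.</p>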
            <p id="p6">Next, I wrote <a href="examples/sqltoxml" shape="rect">150 or so lines</a> of <a href="http://en.wikipedia.org/wiki/Perl" title="Wikipedia: Perl" shape="rect">Perl</a> to convert the text into XML.</p>
            <p id="p7">The XML is designed to be a totally straightforward representation of the table structure of the database. From the preceding SQL statements, <span class="application">sqltoxml</span> produces:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;table name='ZABCDRECORD'&gt;
&lt;columns&gt;
  &lt;column name='Z_PK' type='INTEGER'/&gt;
  &lt;column name='Z_ENT' type='INTEGER'/&gt;
  &lt;column name='Z_OPT' type='INTEGER'/&gt;
  &lt;column name='ZDISPLAYFLAGS' type='INTEGER'/&gt;
  &lt;column name='ZMODIFICATIONDATEYEAR' type='INTEGER'/&gt;
  &lt;column name='ZCREATIONDATEYEAR' type='INTEGER'/&gt;
  &lt;column name='ZADDRESSBOOKSOURCE' type='INTEGER'/&gt;
  &lt;column name='ZISALL' type='INTEGER'/&gt;
  &lt;column name='ZME' type='INTEGER'/&gt;
  &lt;column name='Z19_ME' type='INTEGER'/&gt;
  &lt;column name='ZINFO' type='INTEGER'/&gt;
  &lt;column name='ZBIRTHDAYYEAR' type='INTEGER'/&gt;
  &lt;column name='ZPRIVACYFLAGS' type='INTEGER'/&gt;
  &lt;column name='ZNOTE' type='INTEGER'/&gt;
  &lt;column name='ZADDRESSBOOKSOURCE1' type='INTEGER'/&gt;
  &lt;column name='ZCONTACTINDEX' type='INTEGER'/&gt;
  &lt;column name='ZSOURCEWHERECONTACTISME' type='INTEGER'/&gt;
  &lt;column name='ZVERSION' type='INTEGER'/&gt;
  &lt;column name='ZSYNCCOUNT' type='INTEGER'/&gt;
  &lt;column name='ZSHARECOUNT' type='INTEGER'/&gt;
  &lt;column name='ZADDRESSBOOKSOURCE2' type='INTEGER'/&gt;
  &lt;column name='ZMODIFICATIONDATE' type='TIMESTAMP'/&gt;
  &lt;column name='ZCREATIONDATE' type='TIMESTAMP'/&gt;
  &lt;column name='ZMODIFICATIONDATEYEARLESS' type='FLOAT'/&gt;
  &lt;column name='ZCREATIONDATEYEARLESS' type='FLOAT'/&gt;
  &lt;column name='ZBIRTHDAY' type='TIMESTAMP'/&gt;
  &lt;column name='ZBIRTHDAYYEARLESS' type='FLOAT'/&gt;
  &lt;column name='ZUNIQUEID' type='VARCHAR'/&gt;
  &lt;column name='ZNAME' type='VARCHAR'/&gt;
  &lt;column name='ZNAMENORMALIZED' type='VARCHAR'/&gt;
  &lt;column name='ZTMPREMOTELOCATION' type='VARCHAR'/&gt;
  &lt;column name='ZNAME1' type='VARCHAR'/&gt;
  &lt;column name='ZREMOTELOCATION' type='VARCHAR'/&gt;
  &lt;column name='ZSERIALNUMBER' type='VARCHAR'/&gt;
  &lt;column name='ZSUFFIX' type='VARCHAR'/&gt;
  &lt;column name='ZTITLE' type='VARCHAR'/&gt;
  &lt;column name='ZTMPHOMEPAGE' type='VARCHAR'/&gt;
  &lt;column name='ZNICKNAME' type='VARCHAR'/&gt;
  &lt;column name='ZORGANIZATION' type='VARCHAR'/&gt;
  &lt;column name='ZMAIDENNAME' type='VARCHAR'/&gt;
  &lt;column name='ZIDENTITYUNIQUEID' type='VARCHAR'/&gt;
  &lt;column name='ZPHONETICFIRSTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZDEPARTMENT' type='VARCHAR'/&gt;
  &lt;column name='ZPHONETICLASTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZMIDDLENAME' type='VARCHAR'/&gt;
  &lt;column name='ZFIRSTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZIMAGEREFERENCE' type='VARCHAR'/&gt;
  &lt;column name='ZJOBTITLE' type='VARCHAR'/&gt;
  &lt;column name='ZPHONETICMIDDLENAME' type='VARCHAR'/&gt;
  &lt;column name='ZLASTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZSORTINGFIRSTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZSORTINGLASTNAME' type='VARCHAR'/&gt;
  &lt;column name='ZCREATEDVERSION' type='VARCHAR'/&gt;
  &lt;column name='ZLASTDOTMACACCOUNT' type='VARCHAR'/&gt;
  &lt;column name='ZLASTSAVEDVERSION' type='VARCHAR'/&gt;
  &lt;column name='ZSYNCANCHOR' type='VARCHAR'/&gt;
  &lt;column name='ZSEARCHELEMENTDATA' type='BLOB'/&gt;
  &lt;column name='ZMODIFIEDUNIQUEIDSDATA' type='BLOB'/&gt;
&lt;/columns&gt;
&lt;rows&gt;
  &lt;row&gt;
    &lt;Z_PK&gt;1&lt;/Z_PK&gt;
    &lt;Z_ENT&gt;18&lt;/Z_ENT&gt;
    &lt;Z_OPT&gt;287&lt;/Z_OPT&gt;
    &lt;ZCREATIONDATEYEAR&gt;2008&lt;/ZCREATIONDATEYEAR&gt;
    &lt;ZISALL&gt;1&lt;/ZISALL&gt;
    &lt;ZINFO&gt;3&lt;/ZINFO&gt;
    &lt;ZCREATIONDATE&gt;239119683.670433&lt;/ZCREATIONDATE&gt;
    &lt;ZCREATIONDATEYEARLESS&gt;18281283.670433&lt;/ZCREATIONDATEYEARLESS&gt;
    &lt;ZUNIQUEID&gt;93973926-7EF6-40F0-ADBD-8C7BBFC30FA1:ABSubscriptionRecord&lt;/ZUNIQUEID&gt;
    &lt;ZREMOTELOCATION&gt;local&lt;/ZREMOTELOCATION&gt;
    &lt;ZSERIALNUMBER&gt;B7303AAD-DA79-46E6-BC7D-91DAD82AEFB8&lt;/ZSERIALNUMBER&gt;
  &lt;/row&gt;
  &lt;!-- ... --&gt;
&lt;/rows&gt;
&lt;/table&gt;
</pre>
            </div>
            <p id="p8">As you can see, I've made no effort to maintain some aspects of the database (like the primary key), I've simply dropped NULL fields, and I'm relying on the field names to be valid XML NCNames. There are clearly other, equally reasonable, design choices that I could have made.</p>
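            <p>To make those design choices concrete, here is a hedged sketch of the same mapping. It is not the Perl script: it's a Python illustration that reads the database directly with the standard <tt class="literal">sqlite3</tt> module instead of parsing the text dump, and the <tt class="literal">table_to_xml</tt> name is hypothetical. Rendering BLOBs as hex text is also just an assumption made for the example.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table_name):
    """Build a table element with a columns list and one row per record."""
    table = ET.Element("table", {"name": table_name})
    columns = ET.SubElement(table, "columns")
    # A table name cannot be bound as a parameter, so quote it by hand.
    quoted = '"' + table_name.replace('"', '""') + '"'
    names = []
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    for _cid, name, col_type, *_rest in conn.execute(f"PRAGMA table_info({quoted})"):
        ET.SubElement(columns, "column", {"name": name, "type": col_type})
        names.append(name)
    rows = ET.SubElement(table, "rows")
    for record in conn.execute(f"SELECT * FROM {quoted}"):
        row = ET.SubElement(rows, "row")
        for name, value in zip(names, record):
            if value is None:
                continue  # NULL fields are simply dropped
            if isinstance(value, bytes):
                value = value.hex()  # assumption: show BLOBs as hex text
            # Relies on the column name being a valid XML name.
            ET.SubElement(row, name).text = str(value)
    return table
```
</pre>
            </div>
            <p>Passing the result to <tt class="literal">ET.tostring()</tt> serializes it. As in the listing above, the primary key constraint and other schema details are lost along the way.</p>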
            <p id="p9">The resulting XML is the bare minimum needed to switch to XML tools for subsequent downstream processing (turning address book tables into vCards, for example). But it gets the job done.</p>
            <p id="p10">And maybe it'll come in handy for someone else.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>RDFa for DocBook?</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/09/22/RDFaForDocBook"/>
      <id>http://norman.walsh.name/2009/09/22/RDFaForDocBook</id>
      <published>2009-09-22T13:20:39Z</published>
      <updated>2009-09-26T16:04:15Z</updated>
      <category term="docbook" scheme="http://technorati.com/tag/"/>
      <dc:subject>DocBook</dc:subject>
      <category term="rdf" scheme="http://technorati.com/tag/"/>
      <dc:subject>RDF</dc:subject>
      <category term="xmlss09" scheme="http://technorati.com/tag/"/>
      <dc:subject>XMLSummerSchool2009</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Adding RDFa to DocBook would make it possible to add a class of semantic annotations to DocBook without changing the schema. But is that a good idea?</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/09/22/RDFaForDocBook">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Adding RDFa to DocBook would make it possible to add a class of semantic annotations to DocBook without changing the schema. But is that a good idea?</p>
            </div>
            <div class="epigraph">
               <p id="p2">Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information on it.</p>
               <div class="attribution">
                  <span class="mdash">—</span>
                  <span class="personname">
                     <span class="firstname">Samuel</span> 
                     <span class="surname">Johnson</span>
                  </span>
               </div>
            </div>
            <p id="p1">When <span class="personname">
                  <span class="firstname">Bob</span> 
                  <span class="surname">DuCharme</span>
               </span> introduced the semantic web track at <a href="http://www.xmlsummerschool.com/" shape="rect">XML Summer School</a> this morning, he mentioned briefly the idea of adding <a href="http://en.wikipedia.org/wiki/RDFa" title="Wikipedia: RDFa" shape="rect">RDFa</a> to vocabularies other than (X)HTML. In particular, he's investigated how to <a href="http://www.devx.com/semantic/Article/42543/0/page/3" shape="rect">do it in
DocBook</a>.</p>
            <p id="p3">The DocBook TC gets periodic requests to add new inline elements and attributes for bits of metadata. Sometimes the requests are entirely legitimate, in the sense that they're clearly about technical documentation, but seem to apply to such a small audience that the TC is reluctant to add them to all of DocBook.</p>
            <p id="p4">With this in mind, the idea of adding RDFa has some appeal: we add a few new attributes and henceforth users will be able to add new bits of metadata without having to change the DocBook schema.</p>
            <p id="p5">But I'm not sure.</p>
            <p id="p6">First, lots of DocBook elements have more discrete semantics than HTML elements. We don't need to say</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;phrase property="dc:title"&gt;Beautiful Sunset&lt;/phrase&gt;
</pre>
            </div>
            <p id="p7">because we have <tt class="tag-starttag">&lt;citetitle&gt;</tt>. We don't need to say:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;info&gt;
  &lt;bibliomisc&gt;
    &lt;phrase rel="mpc:editor" href="http://mypubco.com/empid/53234"/&gt;
  &lt;/bibliomisc&gt;
&lt;/info&gt;
</pre>
            </div>
            <p id="p8">because we have</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;info&gt;
  &lt;editor role="mpc:editor"&gt;
    &lt;personname&gt;Some Name&lt;/personname&gt;
    &lt;uri&gt;http://mypubco.com/empid/53234&lt;/uri&gt;
  &lt;/editor&gt;
&lt;/info&gt;
</pre>
            </div>
            <p id="p9">I'm not suggesting those are <em>exactly</em> the same, they're clearly not, but I'm comfortable that existing DocBook elements are sufficient for the task.</p>
            <p id="p10">(Yes, you'd need a DocBook-specific tool to extract the metadata, which is a disadvantage, but you probably want one anyway for the existing DocBook semantics.)</p>
            <p id="p11">Second, RDFa would allow you to construct statements with conflicting or, at best, odd semantics:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;section&gt;
  &lt;title property="dc:creator"&gt;Alice1&lt;/title&gt;
  &lt;para xml:id='p12'&gt;This is from section 2.2.&lt;/para&gt;
&lt;/section&gt;
</pre>
            </div>
            <p id="p13">I can just about imagine a sense in which “Alice1” can be both the title of a section and the <a href="http://en.wikipedia.org/wiki/Dublin%20Core"
                  title="Wikipedia: Dublin Core"
                  shape="rect">Dublin Core</a> creator of the section, but it doesn't make a lot of sense.</p>
            <p id="p14">Third, Bob's example seems to suggest that it would encourage markup like this:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;para about="/alice/posts/trouble_with_bob" xml:id='p15'&gt;
  &lt;phrase property="dc:title"&gt;The trouble with Bob2&lt;/phrase&gt;
  &lt;phrase property="dc:creator"&gt;Alice2&lt;/phrase&gt;
&lt;/para&gt;
</pre>
            </div>
            <p id="p16">which seems like a bad idea to me.</p>
            <p id="p17">On the other hand, some of the examples do seem useful for exactly the sort of thing that I said motivated my interest:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;bibliomisc property="mpc:lastScreenShotDate" content="2009-08-01T15:31:00"/&gt;
&lt;bibliomisc property="mpc:softwareRelease"    content="3.1"/&gt;
</pre>
            </div>
            <p id="p18">In fairness, Bob set out to recreate the triples from the original tutorial, so some of the markup choices were forced upon him.</p>
            <p id="p19">So I'm not sure.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Micro-blogging Backup, part the fourth</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/09/09/mbb04"/>
      <id>http://norman.walsh.name/2009/09/09/mbb04</id>
      <published>2009-09-09T15:11:49Z</published>
      <updated>2009-09-09T16:32:34Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="microblogging" scheme="http://technorati.com/tag/"/>
      <dc:subject>Microblogging</dc:subject>
      <category term="www" scheme="http://technorati.com/tag/"/>
      <dc:subject>TheWeb</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In which we get to see what our tweets and ’dents look like.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/09/09/mbb04">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>In which we get to see what our tweets and ’dents look like.</p>
            </div>
            <p id="p1">If you haven't been following along, go back and read parts <a href="http://norman.walsh.name/2009/08/27/mbb01"
                  title="Micro-blogging Backup, part the first"
                  shape="rect">one</a>, <a href="http://norman.walsh.name/2009/08/28/mbb02"
                  title="Micro-blogging Backup, part the second"
                  shape="rect">two</a>, and for a little background, <a href="http://norman.walsh.name/2009/09/03/mbb03"
                  title="Micro-blogging Backup, part the third"
                  shape="rect">three</a> first. Now you've got <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> up and running and you've been able to download your tweets and the tweets of those you follow. (Tweets or ’dents depending on which microblogging service you prefer; either, actually both, work for me.)</p>
            <p id="p2">Next, download <a href="examples/mbb04.zip" shape="rect">mbb04.zip</a> and unpack it in the same place where you unpacked <a href="/2009/08/28/examples/mbb02.zip" shape="rect">mbb02.zip</a>. If you were following the instructions, you've edited some of the files in the “<tt class="filename">mbb/inst</tt>” directory and you may have some sessions saved in CQ, so I <em>have not</em> included those directories in <tt class="filename">mbb04.zip</tt>.</p>
            <p id="p3">If you've been tinkering with other files, then you want to unpack this zip with some care or you may overwrite your changes. But you won't overwrite changes made to the installation or CQ areas. (By the same token, this distribution is incomplete without <tt class="filename">mbb02.zip</tt>.)</p>
            <p id="p4">With that installed, you now have a CSS file, four modules, and a new “top level” script, <tt class="filename">show-tweets.xqy</tt>. That's the fun one. Point your browser at <a href="http://localhost:8330/show-tweets.xqy" shape="rect">http://localhost:8330/show-tweets.xqy</a> and you should be rewarded with a list of your status messages from today. (As before, adjust the port number as necessary if you installed the application server on a different port.)</p>
            <p id="p5">If you don't have any status messages from today, load <a href="http://localhost:8330/get-tweets.xqy" shape="rect">http://localhost:8330/get-tweets.xqy</a> to download your most recent messages, then try <a href="http://localhost:8330/show-tweets.xqy" shape="rect">http://localhost:8330/show-tweets.xqy</a> again.</p>
            <p id="p6">The <tt class="filename">show-tweets.xqy</tt> script accepts query parameters. You can pass:</p>
            <div class="variablelist">
               <dl>
                  <dt>
                     <tt class="literal">sdate</tt>
                  </dt>
                  <dd>
                     <p id="p7">To specify the starting date in “YYYY-MM-DD” format. If unspecified, defaults to the ending date.</p>
                  </dd>
                  <dt>
                     <tt class="literal">edate</tt>
                  </dt>
                  <dd>
                     <p id="p8">To specify the ending date. If unspecified, defaults to today.</p>
                  </dd>
                  <dt>
                     <tt class="literal">users</tt>
                  </dt>
                  <dd>
                     <p id="p9">To specify one or more users separated by spaces (<tt class="literal">+</tt> signs or <tt class="literal">%20</tt>’s in URI-speak). These should match the <tt class="literal">screen_name</tt> values in your account configuration. The value “<tt class="literal">ALL</tt>” is special: it lists tweets for <em>every</em> user, including all your friends.</p>
                  </dd>
                  <dt>
                     <tt class="literal">service</tt>
                  </dt>
                  <dd>
                     <p id="p10">To specify the service. If you set up accounts on multiple services, this will let you limit the result to only those messages on a single service.</p>
                  </dd>
               </dl>
            </div>
            <p id="p11">Go ahead and give it a try: <a href="http://localhost:8330/show-tweets.xqy?sdate=2009-08-01&amp;edate=2009-08-31&amp;users=ALL"
                  shape="rect">http://localhost:8330/show-tweets.xqy?sdate=2009-08-01&amp;edate=2009-08-31&amp;users=ALL</a> will show you all the messages by you and those you follow posted in the month of August.</p>
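            <p>If you'd rather build those URIs from a script than type them, the parameters listed above map onto a query string in the obvious way. This is only a sketch in Python: the <tt class="literal">show_tweets_url</tt> helper is hypothetical, and the base address assumes the default port used in this series.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
```python
from urllib.parse import urlencode

# Assumption: the application server is on the port used in the text.
BASE = "http://localhost:8330/show-tweets.xqy"


def show_tweets_url(sdate=None, edate=None, users=None, service=None):
    """Build a show-tweets.xqy URI from the optional query parameters."""
    params = {}
    if sdate:
        params["sdate"] = sdate  # starting date, YYYY-MM-DD
    if edate:
        params["edate"] = edate  # ending date, YYYY-MM-DD
    if users:
        # Several users go in one value, separated by spaces;
        # urlencode() turns each space into a plus sign.
        params["users"] = " ".join(users)
    if service:
        params["service"] = service
    query = urlencode(params)
    return BASE + "?" + query if query else BASE
```
</pre>
            </div>
            <p>Calling it with no arguments returns the bare script address, which lists your own messages from today.</p>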
            <div class="section">
               <h2 class="runin">Message Formatting </h2>
               <p class="runin" id="p12">
                  <a id="display" name="display" shape="rect"/>The messages are sorted in ascending order by date, so that messages read chronologically “down” the page as you'd expect. However, conversations are handled a little bit specially.</p>
               <p id="p13">Whenever a message is encountered that is part of a conversation (either because it's a reply to another message, or another message exists that is in reply to it), the whole thread is collected together and presented as a unit, like this:</p>
               <div class="artwork">
                  <div class="flickr-photo">
                     <div class="photo" style="width: 500px">
                        <a href="http://www.flickr.com/photos/ndw/3904233170/" shape="rect">
                           <img border="0" alt="[Photo]"
                                src="http://farm3.static.flickr.com/2643/3904233170_f112bd175e.jpg"/>
                        </a>
                     </div>
                     <div class="link" style="left: 225px;">
                        <a href="http://www.flickr.com/" shape="rect">
                           <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                        </a>
                     </div>
                     <h3>Message threading</h3>
                  </div>
               </div>
               <p id="p14">I think that makes the results much easier to follow. We'll come back to what to do about threads involving users other than you or your followers later.</p>
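               <p>One way to collect threads like this, sketched in plain XQuery over a few made-up messages: follow each message's reply link up to the first message of its thread, then group on that. The sample data, the unqualified element names, and the function name are assumptions; the real messages are in a namespace and live in the database. A bad “in-reply-to” value that formed a loop would make this recurse forever, so real code would need a guard.</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0";

(: Hypothetical sample data standing in for stored status messages. :)
declare variable $messages as element(status)* := (
  &lt;status&gt;
    &lt;id&gt;1&lt;/id&gt;
    &lt;created_at&gt;2009-08-01T10:00:00Z&lt;/created_at&gt;
    &lt;text&gt;A question&lt;/text&gt;
  &lt;/status&gt;,
  &lt;status&gt;
    &lt;id&gt;2&lt;/id&gt;
    &lt;created_at&gt;2009-08-01T10:05:00Z&lt;/created_at&gt;
    &lt;text&gt;Something unrelated&lt;/text&gt;
  &lt;/status&gt;,
  &lt;status&gt;
    &lt;id&gt;3&lt;/id&gt;
    &lt;in_reply_to_status_id&gt;1&lt;/in_reply_to_status_id&gt;
    &lt;created_at&gt;2009-08-01T10:10:00Z&lt;/created_at&gt;
    &lt;text&gt;An answer&lt;/text&gt;
  &lt;/status&gt;
);

(: Follow in_reply_to_status_id links up to the first message of the thread. :)
declare function local:thread-root($msg as element(status)) as element(status) {
  let $parent := $messages[id = $msg/in_reply_to_status_id][1]
  return if (fn:exists($parent)) then local:thread-root($parent) else $msg
};

(: One thread per distinct root; threads and their messages in date order. :)
for $root-id in fn:distinct-values(for $m in $messages return local:thread-root($m)/id)
let $thread :=
  for $m in $messages
  where local:thread-root($m)/id = $root-id
  order by xs:dateTime($m/created_at)
  return $m
order by xs:dateTime($thread[1]/created_at)
return &lt;thread&gt;{ $thread }&lt;/thread&gt;
</pre>
               </div>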
               <div class="admonition">
                  <table border="0" cellspacing="0" cellpadding="4"
                         summary="Presentation of a admonition">
                     <tbody>
                        <tr>
                           <td valign="top" rowspan="1" colspan="1">
                              <span class="admon-graphic">
                                 <img alt="Note" src="/graphics/note.png"/>
                              </span>
                           </td>
                           <td rowspan="1" colspan="1">
                              <div class="admon-text">
                                 <p id="p15">There were some bugs in the Identi.ca server that occasionally caused incorrect “in-reply-to” values to be inserted into the data. I think those <a href="http://identi.ca/notice/9260164" shape="rect">have been fixed</a> in the server, but there's nothing I can do about the values that are wrong.</p>
                              </div>
                           </td>
                        </tr>
                     </tbody>
                  </table>
               </div>
            </div>
            <div class="section">
               <h2 class="runin">URL Rewriting </h2>
               <p class="runin" id="p16">
                  <a id="rewriting" name="rewriting" shape="rect"/>Those <tt class="filename">show-tweets.xqy</tt> URLs may be sufficient, but they're hardly elegant. I'd be much happier if they were better organized for human consumption.</p>
               <p id="p17">Luckily, with the URL rewriting features of MarkLogic Server V4.1, this is easily achieved. To begin with, go back into the admin console (<a href="http://localhost:8001/" shape="rect">http://localhost:8001/</a>) and navigate down through Groups→Default→App Servers in the tree control, then select the server you set up for this project.</p>
               <p id="p18">Near the bottom of that page, you'll find a “url rewriter” field. Specify <tt class="literal">modules/url-rewriter.xqy</tt> as the value for that field and then click “Ok” at either the top or bottom of the page.</p>
               <p id="p19">The one I provided supports a range of values designed to be more readable:</p>
               <div class="variablelist">
                  <dl>
                     <dt>
                        <tt class="uri">/my-tweets/<em class="replaceable">
                              <tt class="replaceable">service</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">users</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">end-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p20">Returns the messages associated with “users” on “service” between the specified start and end dates, e.g., <tt class="uri">/my-tweets/identica/ndw/2009-08-01/2009-08-15</tt>.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/my-tweets/<em class="replaceable">
                              <tt class="replaceable">users</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">end-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p21">Returns the messages associated with “users” on any service between the specified start and end dates, e.g., <tt class="uri">/my-tweets/ndw/2009-08-01/2009-08-15</tt>.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/my-tweets/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">end-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p22">Returns the messages associated with any user on any service between the specified start and end dates, e.g., <tt class="uri">/my-tweets/2009-08-01/2009-08-15</tt>.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/my-tweets/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p23">Returns the messages associated with any user on any service between the specified start date and today, e.g., <tt class="uri">/my-tweets/2009-09-01</tt>.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/my-tweets</tt>
                     </dt>
                     <dd>
                        <p id="p24">Returns the messages associated with any user on any service posted today.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/all-tweets/<em class="replaceable">
                              <tt class="replaceable">service</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">end-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p25">Returns the messages posted by anyone on “service” between the specified start and end dates.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/all-tweets/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">end-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p26">Returns the messages posted by anyone on any service between the specified start and end dates.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/all-tweets/<em class="replaceable">
                              <tt class="replaceable">service</tt>
                           </em>/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p27">Returns the messages posted by anyone on “service” between the specified start date and today.</p>
                     </dd>
                     <dt>
                        <tt class="uri">/all-tweets/<em class="replaceable">
                              <tt class="replaceable">start-date</tt>
                           </em>
                        </tt>
                     </dt>
                     <dd>
                        <p id="p28">Returns the messages posted by anyone on any service between the specified start date and today.</p>
                     </dd>
                  </dl>
               </div>
               <p id="p29">The URL rewriter is just an XQuery that you can change, so it can support any kind of values that you'd like. It receives the URL that the user entered. Whatever URL it returns is what the server actually responds to. The one I've provided turns</p>
               <div class="screen">
                  <pre xml:space="preserve">
/all-tweets/2009-08-15/2009-08-31
</pre>
               </div>
               <p id="p30">into</p>
               <div class="screen">
                  <pre xml:space="preserve">
/show-tweets.xqy?users=ALL&amp;sdate=2009-08-15&amp;edate=2009-08-31
</pre>
               </div>
               <p id="p31">So nothing else about the code has to change; it all just works.</p>
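               <p>To give a feel for what such a rewriter looks like, here is a minimal sketch that handles only the one pattern shown above. It is not the provided <tt class="filename">url-rewriter.xqy</tt>, which supports many more forms; it assumes the rewriter reads the incoming URL with <tt class="function">xdmp:get-request-url</tt> and returns the rewritten URL as a string.</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0-ml";

(: Hypothetical sketch of a single rewriting rule. :)
let $url  := xdmp:get-request-url()
let $date := "\d{4}-\d{2}-\d{2}"
(: Matches /all-tweets/start-date/end-date and captures both dates. :)
let $both := fn:concat("^/all-tweets/(", $date, ")/(", $date, ")$")
return
  if (fn:matches($url, $both))
  then fn:replace($url, $both,
                  "/show-tweets.xqy?users=ALL&amp;amp;sdate=$1&amp;amp;edate=$2")
  else $url  (: anything else passes through unchanged :)
</pre>
               </div>
               <p>Each additional URL form would be another pattern tested in the same way, with the most specific patterns tried first.</p>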
               <p id="p32">I'm not sure there's much else that's new or interesting about the other modules provided in this part. If you have any questions, feel free to ask.</p>
               <p id="p33">Next time, we'll look at dealing with all those ugly shortened URLs and pesky replies that extend beyond our followers.</p>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Micro-blogging Backup, part the third</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/09/03/mbb03"/>
      <id>http://norman.walsh.name/2009/09/03/mbb03</id>
      <published>2009-09-03T20:12:40Z</published>
      <updated>2009-09-05T17:47:11Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="microblogging" scheme="http://technorati.com/tag/"/>
      <dc:subject>Microblogging</dc:subject>
      <category term="www" scheme="http://technorati.com/tag/"/>
      <dc:subject>TheWeb</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In which we peel back the covers on what's been built so far.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/09/03/mbb03">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>In which we peel back the covers on what's been built so far.</p>
            </div>
            <p id="p1">There's more functionality to come, but first, I thought it might be useful to spend a few minutes looking at what we've got so far.</p>
            <p id="p2">The setup code in <tt class="filename">/mbb/init</tt> isn't very interesting, and I'm not going to attempt to explain how CQ works, so we'll begin in the <tt class="filename">/mbb/modules</tt> directory.</p>
            <div class="variablelist">
               <dl>
                  <dt>
                     <tt class="filename">accounts.xqy</tt>
                  </dt>
                  <dd>
                     <p id="p3">This module contains some utility and convenience functions for dealing with account data. I changed my mind about how to store the data a couple of times early on, so these functions were supposed to protect me a little bit from that. I didn't follow through all the way so the account abstraction is pretty leaky, but I left this module in place anyway.</p>
                  </dd>
                  <dt>
                     <tt class="filename">twitter.xqy</tt>
                  </dt>
                  <dd>
                     <p id="p4">This module is a thin skin over the actual <a href="http://apiwiki.twitter.com/Twitter-API-Documentation" shape="rect">Twitter API</a>. Ideally, I'd flesh this module out to support the rest of the endpoints, but I haven't bothered yet.</p>
                     <p id="p5">One school of thought on this kind of API module is that it should be as thin as possible, providing only the thinnest skin over the underlying API. I mostly agree, but I did take a few liberties. If you wanted to adapt this module for some other purpose, you might have reason to carve it a little closer to the bone.</p>
                     <p id="p6">One decision I made was to have the <tt class="methodname">account/rate_limit_status</tt> method return the number of calls remaining directly as a number, rather than returning the XML response. That's pretty simple. The other changes I made are a bit deeper.</p>
                     <p id="p7">The Twitter timeline methods are designed to be “paged”; the caller can select the page size and the page they want to retrieve. I decided that what I really want is <em>all</em> the pages; so my versions of the timeline methods always request all the pages and return all of the results in a single call (by performing the requisite paging for you, behind the scenes). Twitter limits you to 16 pages, but Identi.ca servers seem to offer more pages. In order to avoid recursing beyond the size of the call stack, I placed an arbitrary limit on the number of pages.</p>
                     <p id="p8">Finally, I decided to protect the caller from exceptions that can occur if the underlying HTTP requests fail. Most of the public methods in <tt class="filename">twitter.xqy</tt> return an element, either the Twitter API response, or a <tt class="tag-starttag">&lt;t:error&gt;</tt> element containing the HTTP error code if an error occurred.</p>
                     <p id="p9">I think an argument could be made for <em>not</em> doing this, for letting the lowest-level API calls throw the exception, but I decided not to. You're free to change that, of course.</p>
                  </dd>
                  <dt>
                     <tt class="filename">twitproc.xqy</tt>
                  </dt>
                  <dd>
                     <p id="p10">This module is mostly responsible for taking Twitter <tt class="tag-starttag">&lt;status&gt;</tt> and <tt class="tag-starttag">&lt;user&gt;</tt> elements and inserting them into the database. Along the way, we transform them just a little:</p>
                     <div class="orderedlist">
                        <ol style="list-style: decimal;">
                           <li>
                              <p id="p11">I move them from no namespace into the “t:” namespace. First, I subscribe to the position that XML vocabularies <a href="http://www.w3.org/TR/webarch/#use-namespaces" shape="rect">should place elements in a namespace</a>. I'm aware that there are people who believe otherwise. They're wrong. Second, <a href="http://en.wikipedia.org/wiki/XQuery" title="Wikipedia: XQuery"
                                    shape="rect">XQuery</a>’s interpretation of unqualified names <a href="http://norman.walsh.name/2008/07/02/xquery#p11" shape="rect">exacerbates the problem</a>. So
you could look at this as patching a bug in the Twitter API.</p>
                           </li>
                           <li>
                              <p id="p12">I transform the contents of the <tt class="tag-starttag">&lt;created_at&gt;</tt> element into ISO 8601 format (so it fits more naturally into the data model).</p>
                           </li>
                           <li>
                              <p id="p13">The Twitter APIs return a <tt class="tag-starttag">&lt;user&gt;</tt> element embedded in each <tt class="tag-starttag">&lt;status&gt;</tt> message. This is probably a net win for limiting round-trip calls to the API, but it doesn't strike me as a very sensible way to store things in the database. I break out the users and store them separately.</p>
                           </li>
                           <li>
                              <p id="p14">I add a few more elements to each status message. These record information about subsequent processing to perform (more about that later), the screen name of the user who uttered the message, and information about who was logged in to retrieve this message.</p>
                              <p id="p15">This is a little lazy on my part. Arguably, I should introduce another namespace for these additional elements (so that some future Twitter API change doesn't walk all over them), or maybe not store them <em>in</em> the messages at all. I invite you to fix it if it bothers you.</p>
                              <p id="p16">If you know something about <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a>, this may sound like a job for document properties. That's a good idea, particularly for the downstream processing markers. However, document properties are associated, as the name suggests, with <em>documents</em> in the database. Later on in the code for displaying messages, we're sometimes going to make a copy of the message (giving it a new parent element). Doing that breaks the
association with document properties. I was trying to keep things simple, so I didn't use properties for one set of information and child nodes for another, I just pushed it all into child nodes. My bad.</p>
                           </li>
                        </ol>
                     </div>
                  </dd>
                  <dt>
                     <tt class="filename">update.xqy</tt>
                  </dt>
                  <dd>
                     <p id="p17">This module wraps up the functionality of the <tt class="filename">twitter.xqy</tt> and <tt class="filename">twitproc.xqy</tt> modules, getting all the tweets for a user and inserting them into the database. The code for finding the most recent messages by (and not by) a particular user might be interesting to you. Ignore the <tt class="varname">$tweet-collection</tt> variable; it's a holdover from an earlier approach, no longer used.</p>
                  </dd>
                  <dt>
                     <tt class="filename">get-new-tweets.xqy</tt>
                  </dt>
                  <dd>
                     <p id="p18">This module exists only to be invoked from another module. It declares an external variable that identifies a single account then simply calls the <tt class="function">get-tweets</tt> function from the <tt class="filename">update.xqy</tt> module for that account.</p>
                  </dd>
               </dl>
            </div>
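            <p>The <tt class="tag-starttag">&lt;created_at&gt;</tt> conversion mentioned for <tt class="filename">twitproc.xqy</tt> can be sketched in plain XQuery. This is not the code from the module; the function name is made up, and it assumes timestamps of the form “<tt class="literal">Sat Aug 01 12:34:56 +0000 2009</tt>”.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
xquery version "1.0";

(: Assumes input like "Sat Aug 01 12:34:56 +0000 2009". :)
declare function local:iso-date-time($created-at as xs:string) as xs:dateTime {
  let $parts  := fn:tokenize(fn:normalize-space($created-at), " ")
  let $months := ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
  let $month  := fn:index-of($months, $parts[2])
  (: zero-pad the month and day to two digits :)
  let $mm     := if ($month lt 10) then fn:concat("0", $month) else fn:string($month)
  let $dd     := if (fn:string-length($parts[3]) eq 1)
                 then fn:concat("0", $parts[3])
                 else $parts[3]
  (: "+0000" becomes "+00:00" :)
  let $zone   := fn:concat(fn:substring($parts[5], 1, 3), ":", fn:substring($parts[5], 4))
  return xs:dateTime(fn:concat($parts[6], "-", $mm, "-", $dd, "T", $parts[4], $zone))
};

local:iso-date-time("Sat Aug 01 12:34:56 +0000 2009")
</pre>
            </div>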
            <p id="p19">The last bit of code that we've got so far is <tt class="filename">get-tweets.xqy</tt> in the top level of the application server. This module loops over all the accounts that we've defined and, for each one, downloads and inserts any new status messages into the database. It does this by invoking the <tt class="filename">get-new-tweets.xqy</tt> module.</p>
            <div class="section">
               <h2 class="runin">What's all this invoking stuff about? </h2>
               <p class="runin" id="p20">
                  <a id="invoke" name="invoke" shape="rect"/>The server takes a completely safe, <a href="http://en.wikipedia.org/wiki/ACID" title="Wikipedia: ACID" shape="rect">transactional</a> approach to database updates. You are guaranteed that every query that updates the database either succeeds in its entirety or fails. One of the things that you aren't allowed to do is make conflicting updates to the same document in the same transaction. You can demonstrate this easily: just run the following expression in CQ:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
let $doc := &lt;foo&gt;some document&lt;/foo&gt;
return
  (xdmp:document-insert("/scratch/foo", $doc),
   xdmp:document-insert("/scratch/foo", $doc))
</pre>
               </div>
               <p id="p21">The server will bark “XDMP-CONFLICTINGUPDATES” and no inserts will be made to the database.</p>
               <p id="p22">Why does this matter to us? Well, imagine that you set up two Twitter accounts in our micro-blogging backup system. Imagine further that both of those accounts follow <a href="http://twitter.com/marklogic" shape="rect">marklogic</a>.</p>
               <p id="p23">What's going to happen when we run the backup? Both accounts are going to download all of the status messages on their “friends” timeline, so they're both going to download all of the recent <a href="http://twitter.com/marklogic" shape="rect">marklogic</a> tweets. And they're both going to try to insert them into the database. And that's going to generate a “conflicting updates” error.</p>
               <p id="p24">Using the two-step <tt class="function">xdmp:invoke</tt> dance as shown in <tt class="filename">get-tweets.xqy</tt> and <tt class="filename">get-new-tweets.xqy</tt> avoids this problem. The semantics of <tt class="function">xdmp:invoke</tt> are that it runs the specified module in a <em>separate</em> transaction.</p>
               <p id="p25">Since no single user is going to download the same message twice, each transaction will succeed. In fact, some messages will get updated twice in the database, but that doesn't do any harm because the content of the message will be the same in each case.</p>
               <p id="p26">An alternate approach to this problem is to manage the messages with greater care, identifying duplicates when they occur and not attempting to insert them in the database. This is the approach taken in <tt class="filename">twitproc.xqy</tt> for the simpler problem of dealing with duplicate <tt class="tag-starttag">&lt;user&gt;</tt>s.</p>
               <p id="p27">It would certainly be possible to refactor the code so that the <tt class="function">xdmp:invoke</tt> call could be avoided, but in this case splitting work into several transactions feels like the more elegant solution. And any performance penalties associated with a few calls to <tt class="function">xdmp:invoke</tt> are going to be totally swamped by the latency in the underlying HTTP requests, so there isn't really a downside.</p>
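               <p>In outline, the two-step dance looks something like the following. These are sketches, not the files from the download: the account names, the relative module path, and the module namespace URI are placeholders. The calling module loops over the accounts and invokes the second module once for each, passing the account as an external variable:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0-ml";

(: Hypothetical: the real account list comes from the stored account data. :)
for $account in ("account-one", "account-two")
return
  (: each invocation runs in its own transaction :)
  xdmp:invoke("modules/get-new-tweets.xqy",
              (xs:QName("account"), $account))
</pre>
               </div>
               <p>The invoked module declares the matching external variable and does the actual work:</p>
               <div class="programlisting">
                  <pre xml:space="preserve">
xquery version "1.0-ml";

(: The namespace URI and prefix are placeholders, not the real ones. :)
import module namespace upd = "http://example.com/mbb/update" at "update.xqy";

declare variable $account external;

upd:get-tweets($account)
</pre>
               </div>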
            </div>
            <div class="section">
               <h2 class="runin">What next? </h2>
               <p class="runin" id="p28">
                  <a id="next" name="next" shape="rect"/>In the next part, we'll push a little further forward, getting some code in place to display the messages we've downloaded. We'll also look at the subsequent processing I hinted at. Further down the road, we'll look at search, and then we'll add some <a href="http://en.wikipedia.org/wiki/JavaScript" title="Wikipedia: JavaScript"
                     shape="rect">JavaScript</a>
                  <a href="/knows/what/Javascript" shape="rect">
                     <img border="0" alt="[L]" src="/graphics/linkgroup.gif"/>
                  </a>, refactor
things a bit, and make an AJAXy/Web 2.0 UI for our application.</p>
               <p id="p29">I hope you're enjoying the ride.</p>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>XML+XQuery+Google Voice+Python=WIN!</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/09/01/gvcall"/>
      <id>http://norman.walsh.name/2009/09/01/gvcall</id>
      <published>2009-09-01T13:45:54Z</published>
      <updated>2009-09-01T14:38:35Z</updated>
      <category term="googlevoice" scheme="http://technorati.com/tag/"/>
      <dc:subject>GoogleVoice</dc:subject>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="rdf" scheme="http://technorati.com/tag/"/>
      <dc:subject>RDF</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>It's finally possible to put all the pieces together.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/09/01/gvcall">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>It's finally possible to put all the pieces together.</p>
            </div>
            <p id="p1">I've been storing my <a href="http://en.wikipedia.org/wiki/Personal_information_manager"
                  title="Wikipedia: Personal information manager"
                  shape="rect">PIM</a> data (contacts, appointments, etc.) in XML <a href="http://norman.walsh.name/2005/12/16/pimExample" title="PIM Example"
                  shape="rect">for ages</a> (and I do mean <a href="http://nwalsh.com/docs/presentations/extreme2002/" shape="rect">
                  <em>ages</em>
               </a>!). Since my <a href="http://en.wikipedia.org/wiki/Palm_%28PDA%29"
                  title="Wikipedia: Palm (PDA)"
                  shape="rect">Palm</a> days, I've been
translating whatever native format my PIM supports into XML. Where necessary (i.e. everywhere), I've used an RDF/N3-like annotation mechanism to support additional metadata.</p>
            <p id="p2">For example, the “notes” field for the XProc telcon looks like this:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
rdf:
p:class telcon
p:access public
p:phone #w3c-zakim
p:code 97762#
</pre>
            </div>
            <p id="p3">That gets parsed into the obvious XML by the conversion process:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;p:class&gt;telcon&lt;/p:class&gt;
&lt;p:access&gt;public&lt;/p:access&gt;
&lt;p:phone&gt;#w3c-zakim&lt;/p:phone&gt;
&lt;p:code&gt;97762#&lt;/p:code&gt;
</pre>
            </div>
            <p id="p4">(The phone number “<tt class="literal">#w3c-zakim</tt>” means that there's an entry in my address book with the ID “<tt class="literal">w3c-zakim</tt>”. Why do none of the PIM applications understand that appointments and contacts are related!?)</p>
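            <p>For illustration, that conversion could be sketched in XQuery along these lines. This is not the actual conversion code: the namespace URI bound to <tt class="literal">p</tt> is a placeholder, the function name is made up, and the sketch simply turns every “<tt class="literal">p:name value</tt>” line into an element and ignores everything else, including the <tt class="literal">rdf:</tt> marker.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
xquery version "1.0";

(: Placeholder namespace; the real URI for "p" is not given in the text. :)
declare namespace p = "http://example.com/ns/pim";

declare function local:parse-notes($notes as xs:string) as element()* {
  (: keep only lines of the form "p:name value" :)
  for $line in fn:tokenize($notes, "\n")[fn:matches(., "^p:\S+ ")]
  let $name  := fn:substring-before($line, " ")
  let $value := fn:normalize-space(fn:substring-after($line, " "))
  return element { fn:QName("http://example.com/ns/pim", $name) } { $value }
};

local:parse-notes("rdf:
p:class telcon
p:access public
p:phone #w3c-zakim
p:code 97762#")
</pre>
            </div>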
            <p id="p5">I've been doing this for years <em>because it's The Right Thing™</em>, even though I've only been able to wring small (er, tiny, perhaps minuscule) amounts of practical value from it.</p>
            <p id="p6">But no more!</p>
            <p id="p7">The XML data is stored in my own <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> instance (moving from a collection of <a href="http://en.wikipedia.org/wiki/Perl" title="Wikipedia: Perl" shape="rect">Perl</a> hacks to the server was one of my first personal projects after I joined <a href="http://www.marklogic.com" shape="rect">Mark Logic</a>). I now have a <a href="http://www.google.com/googlevoice/about.html" shape="rect">Google Voice</a> number. And <span class="personname">
                  <span class="firstname">Scott</span> 
                  <span class="surname">Hillman</span>
               </span>’s <a href="http://everydayscripting.blogspot.com/2009/08/python-google-voice-mass-sms-and-mass.html"
                  shape="rect">Python scripts</a> finally let me connect all the dots<sup class="footnote">[<a name="p7.7" href="#ftn.p7.7" id="p7.7" shape="rect">1</a>]</sup>!</p>
            <p id="p9">A little <a href="http://en.wikipedia.org/wiki/Python_%28programming_language%29"
                  title="Wikipedia: Python (programming language)"
                  shape="rect">Python</a> hacking, a quick <a href="http://en.wikipedia.org/wiki/XQuery" title="Wikipedia: XQuery"
                  shape="rect">XQuery</a> module, and I can make calls from a shell window.</p>
            <div class="screen">
               <pre xml:space="preserve">
$ call xproc
W3C XProc WG
+1-617-761-6200 97762#

Dialing +1-617-761-6200...
</pre>
            </div>
            <p id="p10">The call query searches both the appointments on today's calendar and the address book. That means “<strong class="command">call seth</strong>” works equally well, and calls my boss. For contacts with more than one phone number, I can add “<tt class="literal">-<em class="replaceable">
                     <tt class="replaceable">phone</tt>
                  </em>
               </tt>” on the end: “<strong class="command">call ndw@nwalsh.com -home</strong>” would call me at home, should I ever want to do that.</p>
            <p id="p11">It's a tiny little thing, but it feels <em>great</em>.</p>
            <p id="p12">I'm easily amused, I know.</p>
            <p id="p13">Hmm. I should add an option to send SMS messages, too…</p>
            <div class="footnotes">
               <hr width="100" align="left" class="footnotes-divider"/>
               <div class="footnote">
                  <p id="p8">
                     <sup>[<a href="#p7.7" name="ftn.p7.7" id="ftn.p7.7" shape="rect">1</a>]</sup>Well, technically, reconnect, but for all its coolness, I never got much use out of <a href="http://norman.walsh.name/2003/11/05/tel" title="Automatic Dialing"
                        shape="rect">the DTMF auto dialer</a>.</p>
               </div>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Micro-blogging Backup, part the second</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/08/28/mbb02"/>
      <id>http://norman.walsh.name/2009/08/28/mbb02</id>
      <published>2009-08-28T18:16:44Z</published>
      <updated>2009-08-28T19:50:25Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="microblogging" scheme="http://technorati.com/tag/"/>
      <dc:subject>Microblogging</dc:subject>
      <category term="www" scheme="http://technorati.com/tag/"/>
      <dc:subject>TheWeb</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>In which we set up the database one screen at a time and then import our first status messages.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/08/28/mbb02">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>In which we set up the database one screen at a time and then import our first status messages.</p>
            </div>
            <p id="p1">If you were following along <a href="http://norman.walsh.name/2009/08/27/mbb01"
                  title="Micro-blogging Backup, part the first"
                  shape="rect">yesterday</a>, you've got <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> up and running with the <a href="http://developer.marklogic.com/about/whatiscis.xqy#community"
                  shape="rect">Community License</a>. Now it's time to start putting it to work. (Cutting toothpicks with a chainsaw, but hey, you have to start somewhere.)</p>
            <p id="p2">Well, almost. First, we have to do a little setup.</p>
            <p id="p3">Download <a href="examples/mbb02.zip" shape="rect">mbb02.zip</a> and unpack it somewhere convenient. I chose <tt class="filename">/home/ndw/mbb</tt> for the purposes of this example, but I'm not sure your home directory is really the best place. Anywhere you'd like though, doesn't matter to me.</p>
            <p id="p4">Fire up your favorite web browser and connect to the admin interface on port 8001 (<a href="http://localhost:8001/" shape="rect">http://localhost:8001/</a>, probably); you'll need to login with whatever userid/password combination you selected at installation time.</p>
            <p id="p5">Once you're there, click “Forests” in the “Configure” tree control in the left hand column and then select the “Create” tab. Enter any name you'd like for the forest and click “ok”. I named mine “mbb”.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865078512/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3186/3865078512_ee28a5cdb4.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Create a forest</h3>
               </div>
            </div>
            <p id="p6">Forests are where the server stores XML documents. Trees, as it were. Clever, eh?</p>
            <p id="p7">Next, choose “Databases” in the tree control and select the “Create” tab again. Enter any name you'd like for the database and click “ok”. I named mine “mbb”. I can't think of a compelling reason to give them different names, but suit yourself.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864296197/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2666/3864296197_8b9abcd82d.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Create a database</h3>
               </div>
            </div>
            <p id="p8">Once you've created a database, you'll be reminded that you need to attach a forest to the database.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865078752/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2476/3865078752_c0be6ca29b.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>You must attach a forest to the database</h3>
               </div>
            </div>
            <p id="p9">Click on that link and do so. Remember to click “ok”.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864296397/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2576/3864296397_fc437a339d.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Attach the forest you created</h3>
               </div>
            </div>
            <p id="p10">Almost there. Choose “Groups”, “Default”, and “App Servers” in the tree control, then select the “Create HTTP” tab. Enter any name you'd like for the server name, I named mine “mbb”; enter the location where you unpacked the zip file for the root, I used <tt class="filename">/home/ndw/mbb</tt>; and enter an open port value for the port, I used “8330”.</p>
            <p id="p11">But <em>don't</em> click “ok” just yet. (If you already did, no worries, just click on the app server's name in the tree control.)</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865078948/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2533/3865078948_347c5a8d72.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Create an HTTP application server</h3>
               </div>
            </div>
            <p id="p12">Scroll about half way down the page to change the authentication and default user. Select “application level” for the authentication scheme and “admin” for the default user.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865079022/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2605/3865079022_9250372166.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Change the authentication to application-level</h3>
               </div>
            </div>
            <p id="p13">This gives your application complete access to the server without having to log in. There are lots of ways to make an application more secure, but let's leave all the security knobs for another day. Now scroll to the top or bottom and click “ok”.</p>
            <p id="p14">At this point, you have a real honest-to-goodness application running on your server. (And yeah, this should all be simpler and easier. I've heard tell of plans to improve it, but nothing I can swear to.)</p>
            <p id="p15">I included a copy of “CQ”, a browser-based, interactive XQuery environment in the distribution. You can see it if you navigate your browser to <a href="http://localhost:8330/cq" shape="rect">http://localhost:8330/cq</a>. (In this and all the following examples, if you chose a different port, use the port number you chose.)</p>
            <p id="p16">If you click on the “explore” link at the top of the CQ page, you'll see that you've got an empty database.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865079102/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3503/3865079102_6394379522.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>CQ shows the empty database</h3>
               </div>
            </div>
            <p id="p17">Now it's time to configure this particular database for our micro-blogging backup application. Later on, we're going to need some indexes. You could walk through the admin UI to create them, but that's tedious, you only have to do this once, and the admin UI is completely scriptable, so I created a little query to do the grunt work.</p>
            <p id="p18">Point your web browser at the database configuration script: <a href="http://localhost:8330/init/setup-database.xqy" shape="rect">http://localhost:8330/init/setup-database.xqy</a>. If everything is set up correctly, you'll quickly get a “database configured” message.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865079144/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2467/3865079144_9dabcc9ed1.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Configure the database</h3>
               </div>
            </div>
            <p id="p19">Next, we need to configure the microblogging accounts that you want to back up. Like database configuration, you're probably only going to do this once (or at least once in a great while), so I didn't create any sort of UI for it.</p>
            <p id="p20">In the directory where you unpacked <tt class="filename">mbb02.zip</tt>, open up <tt class="filename">init/setup-accounts.xqy</tt> with your favorite text editor. On lines 57 and 58 replace <tt class="literal">SCREEN_NAME</tt> and <tt class="literal">PASSWORD</tt> with the <a href="http://twitter.com/" shape="rect">Twitter</a> username and password that you want to back up.</p>
            <p id="p21">If you're using <a href="http://identi.ca/" shape="rect">Identi.ca</a> instead, you'll have to do a little more editing, but it should be pretty straightforward. If you're using your own install of the Laconica software, or you're using some other microblogging server, as long as it supports the Twitter API, you should be able to figure out what to do. Feel free to ask if you're not sure.</p>
            <p id="p22">When you've got all your accounts in place, save the file and point your web browser at it: <a href="http://localhost:8330/init/setup-accounts.xqy" shape="rect">http://localhost:8330/init/setup-accounts.xqy</a>.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3865079190/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3493/3865079190_077522b8dc.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Configure your accounts</h3>
               </div>
            </div>
            <p id="p23">If all goes well, you'll get an appropriate “Accounts initialized” message. If you get 500 errors, you messed up the XQuery syntax somewhere. It won't do any harm to run the setup account script more than once, so try making small changes, running the script after each change. If you get stuck, let me know.</p>
            <p id="p24">If you go back to CQ again and click the “explore” link, you'll see that there are documents in the database now, one for each account you added.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864296927/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3452/3864296927_3f17eaca53.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>CQ shows the database with one document</h3>
               </div>
            </div>
            <p id="p25">Now we're ready to <em>really</em> do something.</p>
            <p id="p26">Point your web browser at <a href="http://localhost:8330/get-tweets.xqy" shape="rect">http://localhost:8330/get-tweets.xqy</a> to download your status messages. This may take a while, especially the first time and especially if you entered several accounts.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864297111/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3239/3864297111_18b9b23031.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Download the status messages for your account(s)</h3>
               </div>
            </div>
            <p id="p27">If you get a message about “rate limit exceeded”, it means you've done too many interactions with the Twitter API this hour. Wait a bit and try again. Twitter threatens that they'll turn off your account if you flagrantly violate the rate limit, so the MBB queries are pretty careful not to.</p>
            <p id="p28">The “explore” link in CQ will now show a whole bunch of documents in the database.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864297285/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm3.static.flickr.com/2643/3864297285_3d60b8eba4.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>CQ shows a database full of documents</h3>
               </div>
            </div>
            <p id="p29">You can enter any arbitrary XQuery expressions you'd like into CQ. Here I've asked for a count of all the messages that I've “favorited”.</p>
            <div class="artwork">
               <div class="flickr-photo">
                  <div class="photo" style="width: 500px">
                     <a href="http://www.flickr.com/photos/ndw/3864297405/" shape="rect">
                        <img border="0" alt="[Photo]"
                             src="http://farm4.static.flickr.com/3251/3864297405_f7c7e9ea74.jpg"/>
                     </a>
                  </div>
                  <div class="link" style="left: 225px;">
                     <a href="http://www.flickr.com/" shape="rect">
                        <img border="0" alt="[Flickr]" src="/graphics/flickrt.png"/>
                     </a>
                  </div>
                  <h3>Arbitrary XQuery expressions evaluated by CQ</h3>
               </div>
            </div>
            <p id="p30">In the next parts, we'll look at some of the code behind this functionality in a little more detail, add some XQuery to display the messages, look at how we can augment the messages in useful ways, add searching, and finally pull the pieces together into a useful little app. Well, a little app I think is useful, anyway.</p>
            <div class="section">
               <h2 class="runin">What about my older messages? </h2>
               <p class="runin" id="p31">
                  <a id="old" name="old" shape="rect"/>Twitter only lets you get at the last 3,200 or so status messages with the Twitter API. If you've got older status messages that you've already backed up, or if you can find some other API to get at them, there are other ways to get them in the database.</p>
               <p id="p32">I left the skeleton of one of those ways in the <tt class="filename">init</tt> directory, an XProc pipeline that <a href="http://xmlcalabash.com/" shape="rect">XML Calabash</a> can run to load status messages from existing XML files.</p>
               <p id="p33">If you've got your old tweets archived in XML, drop me a line and I'll try to point you in the right direction.</p>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Micro-blogging Backup, part the first</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/08/27/mbb01"/>
      <id>http://norman.walsh.name/2009/08/27/mbb01</id>
      <published>2009-08-27T13:23:47Z</published>
      <updated>2009-08-27T16:02:44Z</updated>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <category term="microblogging" scheme="http://technorati.com/tag/"/>
      <dc:subject>Microblogging</dc:subject>
      <category term="www" scheme="http://technorati.com/tag/"/>
      <dc:subject>TheWeb</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>What started out as a trivial exercise in backing up my Twitter and Identi.ca posts turned into a little microcosm of XML Server application development. It's something you can deploy for free on your very own MarkLogic Server!</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/08/27/mbb01">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>What started out as a trivial exercise in backing up my Twitter and Identi.ca posts turned into a little microcosm of XML Server application development. It's something you can deploy for free on your very own MarkLogic Server!</p>
            </div>
            <p id="p1">This is the story of the intersection of two ideas:</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p2">First, almost no one that I spoke to at <a href="http://balisage.net/" shape="rect">Balisage</a> had heard of the <a href="http://developer.marklogic.com/about/whatiscis.xqy#community"
                           shape="rect">Community License</a> for <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a>, and those few who had thought that it was still limited to just 100MB of content.</p>
                     <p id="p3">The fact that you can download and play with the best XML server on the planet is something more people should know about! The community license is for non-commercial use only but it's free and it never expires. The previous 100MB content limit has been upped to 10GB so there's a lot more room in the sandbox now.</p>
                  </li>
                  <li>
                     <p id="p4">Second, at about the same time, there was a little spike of interest in backing up microblogging data, the status messages that you send to services like <a href="http://twitter.com/" shape="rect">Twitter</a> or <a href="http://identi.ca/" shape="rect">Identi.ca</a>.</p>
                     <p id="p5">Sturgeon's law applies, of course, to microblogging. And Sturgeon was an optimist. But there's still a lot of useful information out there and I don't want it to disappear under the waves just because some acquisition occurs and the terms of service shift under my feet.</p>
                  </li>
               </ol>
            </div>
            <p id="p6">Luckily, the APIs for getting your microblogging content return XML and I have an XML server, so… my first thought was to download the tweets (a “<strong class="command">for</strong>” loop in <a href="http://en.wikipedia.org/wiki/Bash" title="Wikipedia: Bash" shape="rect">Bash</a> and <a href="http://en.wikipedia.org/wiki/Wget" title="Wikipedia: Wget" shape="rect">wget</a> will do the trick) and store them in the server. Then I thought, that's silly, the server can download them for me…</p>
            <p id="p7">From there, my little ten minute exercise grew until I had a (still relatively small) application that handles oodles of documents from multiple services and accounts, has threaded conversations and account merging, uses indexes, has full-text and faceted search, employs web APIs, uses URI rewriting, and even has some AJAX.</p>
            <p id="p8">And because status messages are small, it'll run for <em>ages</em> under the community license.</p>
            <p id="p9">My plan, therefore, is to spin this out over a few essays, building the app from its barest bones to something I'm finding quite useful. If you want to play along, the first step is to go get a copy of MarkLogic Server and install it with the community license. The steps are roughly these:</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p10">
                        <a href="http://dev.marklogic.com/download/" shape="rect">Download</a> version 4.1 of the server. It runs on Windows, Linux, and Sparc boxes. (It isn't, alas, available for <a href="http://en.wikipedia.org/wiki/Mac_OS_X" title="Wikipedia: Mac OS X"
                           shape="rect">OS X</a>, but it runs just fine under virtualization.)</p>
                     <p id="p11">It also <a href="http://strangelylooping.wordpress.com/2009/06/14/marklogic-server-on-ubuntu-9-04/"
                           shape="rect">runs just fine</a> on the <a href="http://en.wikipedia.org/wiki/Debian" title="Wikipedia: Debian"
                           shape="rect">Debian</a> flavors of <a href="http://en.wikipedia.org/wiki/Linux" title="Wikipedia: Linux" shape="rect">Linux</a>, though that's not an officially supported platform. Just make sure you have the <a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=519817" shape="rect">bugfixed version</a> of <span class="package">lsb-base</span>.</p>
                  </li>
                  <li>
                     <p id="p12">After it's installed and running, point your web browser at <tt class="uri">http://localhost:8001/</tt> on the machine where you installed it.</p>
                  </li>
                  <li>
                     <p id="p13">Click on the “free” license button, choose the community license, and click your way through the rest of the install screens.</p>
                  </li>
               </ol>
            </div>
            <p id="p14">Congratulations! You now have the most powerful XML chainsaw imaginable at your fingertips. Exactly what to do with it is the subject of part the second and beyond.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Using XML Catalogs and XProc together</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/07/22/xmlCatalogsandXProc"/>
      <id>http://norman.walsh.name/2009/07/22/xmlCatalogsandXProc</id>
      <published>2009-07-22T20:15:27Z</published>
      <updated>2009-07-22T20:51:31Z</updated>
      <category term="calabash" scheme="http://technorati.com/tag/"/>
      <dc:subject>Calabash</dc:subject>
      <category term="xmlcatalogs" scheme="http://technorati.com/tag/"/>
      <dc:subject>XMLCatalogs</dc:subject>
      <category term="xproc" scheme="http://technorati.com/tag/"/>
      <dc:subject>XProc</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>XML Calabash, my implementation of XProc, is my go-to tool these days for manipulating XML documents. Adding XML Catalogs into the mix just makes it sweeter.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/07/22/xmlCatalogsandXProc">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>XML Calabash, my implementation of XProc, is my go-to tool these days for manipulating XML documents. Adding XML Catalogs into the mix just makes it sweeter.</p>
            </div>
            <p id="p1">Recently, I was presented with several hundred books comprised of many thousands of chapters. My goal: load them into the server so that they could become part of a larger application. Easy peasy.</p>
            <p id="p2">Two snags: all the chapters contained references to named entities declared in an external subset and none of the metadata in each file was actually reliable.</p>
            <p id="p3">Still pretty straightforward. Parse the document to expand the entity references, do a little cleanup, and push them into the database. The details of the pipeline aren't that important; the bit I want to highlight today is the parsing.</p>
            <p id="p4">Everything remained pretty easy until I discovered that there were a half-dozen or more flavors of DTD in use across this corpus. And naturally, every external subset was referenced <em>only</em> by a system identifier with some random, absolute path:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;!DOCTYPE chapter SYSTEM "/path/to/dtd10.dtd"&gt;
</pre>
            </div>
            <p id="p5">Where “10” was “10”, “11”, “21”, “25”, etc. for some substantial enough set of versions to go well beyond my limit for tedium.</p>
            <p id="p6">Luckily, all of them were including a standard suite of ISO entities and (as far as I could easily tell), that's all the entity references ever were.</p>
            <p id="p7">XML Catalogs to the rescue.</p>
            <p id="p8">First, grab a recent version of the DTD and stick it somewhere local, then construct the following catalog:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
  &lt;systemSuffix systemIdSuffix=".dtd" uri="local/dtd21.dtd"/&gt;
&lt;/catalog&gt;
</pre>
            </div>
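            <p>As an aside, if the versions ever did need to be kept apart, a catalog along the following lines might do it. This is only a sketch: the <tt class="filename">local/dtd10.dtd</tt> and <tt class="filename">local/dtd11.dtd</tt> paths are hypothetical. When more than one <tt class="literal">systemSuffix</tt> entry matches, the resolver is supposed to pick the one with the longest suffix, so the generic “<tt class="literal">.dtd</tt>” entry acts as a fallback.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
  &lt;!-- Hypothetical local copies of two specific versions --&gt;
  &lt;systemSuffix systemIdSuffix="dtd10.dtd" uri="local/dtd10.dtd"/&gt;
  &lt;systemSuffix systemIdSuffix="dtd11.dtd" uri="local/dtd11.dtd"/&gt;
  &lt;!-- Everything else falls back to the recent version --&gt;
  &lt;systemSuffix systemIdSuffix=".dtd" uri="local/dtd21.dtd"/&gt;
&lt;/catalog&gt;
</pre>
            </div>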
            <p id="p9">Next, tell <a href="http://norman.walsh.name/2008/projects/calabash"
                  title="XML Calabash: an XProc implementation"
                  shape="rect">XML Calabash</a> to use catalogs. You can do this from the command line, but I set it up in my configuration file, <tt class="filename">~/.calabash</tt>:</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;cc:xproc-config xmlns:cc="http://xmlcalabash.com/ns/configuration"&gt;
  &lt;cc:schema-aware&gt;false&lt;/cc:schema-aware&gt;
  &lt;cc:log-level level="warning"/&gt;
  &lt;cc:serialization
      omit-xml-declaration="false"/&gt;
  &lt;cc:entity-resolver class-name="org.xmlresolver.Resolver"/&gt; 
  &lt;cc:uri-resolver class-name="org.xmlresolver.Resolver"/&gt;
&lt;/cc:xproc-config&gt;
</pre>
            </div>
            <p id="p10">The first few lines just set some defaults I like, it's the last two that are relevant here. I tell <em class="citetitle">XML Calabash</em> to use my <a href="http://xmlresolver.org/" shape="rect">XML Resolver</a> catalog implementation for entity and URI resolution.</p>
            <p id="p11">Now my pipeline simply does The Right Thing™.</p>
            <p id="p12">When the parser attempts to load the external subset, the catalog resolver returns the local DTD (because all the system identifiers end with “<tt class="literal">.dtd</tt>”). The <tt class="tag-starttag">&lt;p:load&gt;</tt> step doesn't do validation by default, so the fact that some of the files aren't valid according to the particular version of the DTD that I have locally doesn't matter. The entities get expanded correctly. (If any of the documents had relied on other entities only present
in a particular version of the DTD, that would have been an error, so I know I didn't miss any.) I do a couple of lightweight transformations on the resulting document and shove it into the database FTW!</p>
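            <p>For the curious, the parsing half of a pipeline like that might look something like the following sketch. It is not the pipeline from this project: <tt class="filename">chapter.xml</tt> and <tt class="filename">cleanup.xsl</tt> are placeholder names, and the step that pushes the result into the database is left out entirely.</p>
            <div class="programlisting">
               <pre xml:space="preserve">
&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0"&gt;
  &lt;p:output port="result"/&gt;

  &lt;!-- Parse without validating; the catalog supplies the external subset,
       so the entity references are expanded. --&gt;
  &lt;p:load href="chapter.xml" dtd-validate="false"/&gt;

  &lt;!-- A placeholder cleanup transformation with no parameters --&gt;
  &lt;p:xslt&gt;
    &lt;p:input port="stylesheet"&gt;
      &lt;p:document href="cleanup.xsl"/&gt;
    &lt;/p:input&gt;
    &lt;p:input port="parameters"&gt;
      &lt;p:empty/&gt;
    &lt;/p:input&gt;
  &lt;/p:xslt&gt;
&lt;/p:declare-step&gt;
</pre>
            </div>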
            <p id="p13">Nothing earth shattering here, and not the only way to solve the problem, but one that looks like a nail to my particular hammer of choice at the moment.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Not exactly XProc</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/06/23/notXProc"/>
      <id>http://norman.walsh.name/2009/06/23/notXProc</id>
      <published>2009-06-23T22:27:55Z</published>
      <updated>2009-06-24T13:01:22Z</updated>
      <category term="calabash" scheme="http://technorati.com/tag/"/>
      <dc:subject>Calabash</dc:subject>
      <category term="w3c" scheme="http://technorati.com/tag/"/>
      <dc:subject>W3C</dc:subject>
      <category term="xproc" scheme="http://technorati.com/tag/"/>
      <dc:subject>XProc</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>One advantage of being an implementor is that I can play with languages that the Working Group didn't approve.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/06/23/notXProc">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>One advantage of being an implementor is that I can play with languages that the Working Group didn't approve.</p>
            </div>
            <p id="p1">I've implemented a number of <a href="http://en.wikipedia.org/wiki/XML_pipeline"
                  title="Wikipedia: XML pipeline"
                  shape="rect">XProc</a>
               <a href="/knows/what/xproc" shape="rect">
                  <img border="0" alt="[L]" src="/graphics/linkgroup.gif"/>
               </a> extensions, and have plans for at least a <a href="http://markmail.org/message/s7puhxcez2tmmq4f" shape="rect">few</a> 
               <a href="http://markmail.org/message/jnsqrrv5xazervre" shape="rect">more</a>, but so far they've all used standard extension mechanisms.</p>
            <p id="p2">On the train ride home Monday night, I decided to do something different. Implementor's prerogative.</p>
            <p id="p3">The <a href="http://www.w3.org/TR/xproc/" shape="rect">XProc</a> specification states that all variables, options, and parameters are string values. On the whole, I think this is a useful simplification:</p>
            <div class="itemizedlist">
               <ul>
                  <li>
                     <p id="p4">All of the options used by the standard atomic steps have convenient string representations: they don't need more complex structures.</p>
                  </li>
                  <li>
                     <p id="p5">In an XPath 1.0 implementation there are only a few data types anyway (remember, there was a time when we thought we might finish before the XSLT/XQuery WGs). [Ah, optimism! -ed ]</p>
                  </li>
                  <li>
                     <p id="p6">Using strings simplifies serialization issues for steps like <tt class="tag-starttag">&lt;p:parameters&gt;</tt>.</p>
                  </li>
               </ul>
            </div>
            <p id="p7">But it's frustrating in one particular area: XSLT parameters and XQuery external variables can have more complex values. Because XProc doesn't support this, there are some stylesheets and queries that it can't fully support.</p>
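            <p>To illustrate the sort of stylesheet in question, here is a minimal XSLT 2.0 sketch. It's hypothetical, not taken from any real project; the <tt class="literal">entries</tt> parameter and the element names are invented for the example. The parameter is declared as a sequence of elements, so a processor that can only pass strings has no way to supply a non-empty value for it:</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"&gt;

  &lt;!-- Hypothetical parameter: a sequence of elements, not a string. --&gt;
  &lt;xsl:param name="entries" as="element(entry)*"/&gt;

  &lt;xsl:template name="main"&gt;
    &lt;summary count="{count($entries)}"&gt;
      &lt;!-- Copies each supplied element into the result. --&gt;
      &lt;xsl:copy-of select="$entries"/&gt;
    &lt;/summary&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>
                  </div>
               </div>
            </div>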
            <p id="p8">Early on, I proposed that we allow parameters at least to contain either strings or documents, but I couldn't get working group support for the idea. (I think they'll come around, but not in 1.0.)</p>
            <p id="p9">I've wondered, ever since my idea got left on the cutting room floor, how hard it would be to support arbitrary <a href="http://www.w3.org/TR/xpath-datamodel/" shape="rect">XDM</a> values in XProc.</p>
            <p id="p10">So I implemented it.</p>
            <p id="p11">Turns out it's not very hard at all. I extended the <tt class="classname">RuntimeValue</tt> object to preserve the original XDM value of the expression instead of discarding it after computing its string value. In <tt class="tag-starttag">&lt;p:xslt&gt;</tt> and <tt class="tag-starttag">&lt;p:xquery&gt;</tt>, instead of using the string value for parameters and external variables, respectively, I use the XDM value. Everywhere else, I continue to use the string value, so this change has no impact on other atomic steps.</p>
            <p id="p12">In compound steps, I made a change analogous to the changes for <tt class="tag-starttag">&lt;p:xslt&gt;</tt> and <tt class="tag-starttag">&lt;p:xquery&gt;</tt>: when setting up the environment for evaluating XPath expressions, I use the XDM values of options and variables instead of the string values. This means that user-defined pipelines can accept and use XDM values.</p>
            <p id="p13">The hardest part, by far, was changing the <tt class="tag-starttag">&lt;p:parameters&gt;</tt> step and the interpretation of <tt class="tag-starttag">&lt;c:parameter-set&gt;</tt> documents to support an extended serialization for arbitrary XDM values.</p>
            <p id="p14">All of which means that you can do things like this:</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;p:declare-step name="main"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"&gt;
&lt;p:output port="result"/&gt;
&lt;p:serialization port="result" indent="true"/&gt;

&lt;p:input port="config" primary="false"&gt;
  &lt;p:inline&gt;
    &lt;config&gt;
      &lt;name&gt;value&lt;/name&gt;
      &lt;name2&gt;value2&lt;/name2&gt;
      &lt;fragment&gt;
        &lt;doc&gt;
          &lt;p&gt;Some fragment. How doc/p is useful
          in a configuration file, I don't know.
          &lt;/p&gt;
        &lt;/doc&gt;
      &lt;/fragment&gt;
    &lt;/config&gt;
  &lt;/p:inline&gt;
&lt;/p:input&gt;

&lt;p:declare-step type="cx:foo"&gt;
  &lt;p:output port="result"/&gt;

  &lt;!-- This is silly, never do this. --&gt;
  <a name="seq1" id="seq1" shape="rect"/><img alt="1" border="0" src="/graphics/callouts/1.png"/>&lt;p:option name="param-seq" required="true"/&gt;

  &lt;p:xslt template-name="cx:main"&gt;
    &lt;p:input port="source"&gt;
      &lt;p:empty/&gt;
    &lt;/p:input&gt;
    &lt;p:input port="stylesheet"&gt;
      &lt;p:inline&gt;
        &lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                        version="2.0"&gt;

          &lt;xsl:param name="name"/&gt;
          &lt;xsl:param name="name2"/&gt;
          &lt;xsl:param name="fragment"/&gt;

          &lt;xsl:template name="cx:main"&gt;
            &lt;cx:doc&gt;
              &lt;name&gt;&lt;xsl:copy-of select="$name"/&gt;&lt;/name&gt;
              &lt;name2&gt;&lt;xsl:copy-of select="$name2"/&gt;&lt;/name2&gt;
              &lt;frag&gt;&lt;xsl:copy-of select="$fragment"/&gt;&lt;/frag&gt;
            &lt;/cx:doc&gt;
          &lt;/xsl:template&gt;
        &lt;/xsl:stylesheet&gt;
      &lt;/p:inline&gt;
    &lt;/p:input&gt;
    &lt;p:input port="parameters"&gt;
      &lt;p:empty/&gt;
    &lt;/p:input&gt;
    &lt;p:with-param name="name" select="$param-seq[1]"&gt;<a name="seq2" id="seq2" shape="rect"/><img alt="2" border="0" src="/graphics/callouts/2.png"/>
      &lt;p:empty/&gt;
    &lt;/p:with-param&gt;
    &lt;p:with-param name="name2" select="$param-seq[2]"&gt;
      &lt;p:empty/&gt;
    &lt;/p:with-param&gt;
    &lt;p:with-param name="fragment" select="$param-seq[3]"&gt;
      &lt;p:empty/&gt;
    &lt;/p:with-param&gt;
  &lt;/p:xslt&gt;
&lt;/p:declare-step&gt;

&lt;p:variable name="cfg1" select="/config/name"&gt;<a name="node1" id="node1" shape="rect"/><img alt="3" border="0" src="/graphics/callouts/3.png"/>
  &lt;p:pipe step="main" port="config"/&gt;
&lt;/p:variable&gt;

&lt;p:variable name="cfg2" select="string(/config/name2)"&gt;<a name="string" id="string" shape="rect"/><img alt="4" border="0" src="/graphics/callouts/4.png"/>
  &lt;p:pipe step="main" port="config"/&gt;
&lt;/p:variable&gt;

&lt;p:variable name="cfgfrag" select="/config/fragment/*"&gt;<a name="node2" id="node2" shape="rect"/><img alt="5" border="0" src="/graphics/callouts/5.png"/>
  &lt;p:pipe step="main" port="config"/&gt;
&lt;/p:variable&gt;

&lt;cx:foo&gt;
  &lt;p:with-option name="param-seq"
                 select="($cfg1,$cfg2,$cfgfrag)"&gt;<a name="seq3" id="seq3" shape="rect"/><img alt="6" border="0" src="/graphics/callouts/6.png"/>
    &lt;p:empty/&gt;
  &lt;/p:with-option&gt;
&lt;/cx:foo&gt;

&lt;/p:declare-step&gt;
</pre>
                  </div>
               </div>
            </div>
            <p id="p15">The <tt class="option">param-seq</tt> option<a name="seq1" id="seq1" shape="rect"/>
               <img alt="1" border="0" src="/graphics/callouts/1.png"/> of our user-defined <tt class="literal">cx:foo</tt> step expects a sequence (even though this is a silly thing to do in this case).</p>
            <p id="p16">We extract items from this sequence<a name="seq2" id="seq2" shape="rect"/>
               <img alt="2" border="0" src="/graphics/callouts/2.png"/> to establish the values of the stylesheet parameters.</p>
            <p id="p17">Back out in our main pipeline, we extract values from the configuration file and store them in variables. (We don't have to do this, of course; we could have computed the sequence directly with XPath expressions.)</p>
            <p id="p18">Pay particular attention to the first value<a name="node1" id="node1" shape="rect"/>
               <img alt="3" border="0" src="/graphics/callouts/3.png"/>. This XPath expression selects a node; in standard XProc, this would automatically become a string. Using the general values extension, this will remain a node, which may not be what was intended.</p>
            <p id="p19">The second value<a name="string" id="string" shape="rect"/>
               <img alt="4" border="0" src="/graphics/callouts/4.png"/> uses <tt class="function">string()</tt> to explicitly make the parameter into a string. The third example<a name="node2" id="node2" shape="rect"/>
               <img alt="5" border="0" src="/graphics/callouts/5.png"/> also selects a node.</p>
            <p id="p20">Finally, we pass all of these values to the <tt class="literal">cx:foo</tt> step as a sequence<a name="seq3" id="seq3" shape="rect"/>
               <img alt="6" border="0" src="/graphics/callouts/6.png"/>. In standard XProc, this sequence would be collapsed into a single string value, but it will remain a sequence if we use the general values extension.</p>
            <p id="p21">Run through a standard XProc processor, here is the expected result:</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;cx:doc xmlns:cx="http://xmlcalabash.com/ns/extensions"&gt;
   &lt;name&gt;valuevalue2
          Some fragment. How doc/p is useful
          in a configuration file, I don't know.
          
        &lt;/name&gt;
   &lt;name2/&gt;
   &lt;frag/&gt;
&lt;/cx:doc&gt;
</pre>
                  </div>
               </div>
            </div>
            <p id="p22">We get the string value of all the variables, options, and parameters with the <tt class="option">param-seq</tt> option compressed to a single string value.</p>
            <p id="p23">But if we enable the general values extension (with <tt class="literal">-X general-values</tt> on the command line with <a href="http://norman.walsh.name/2008/projects/calabash"
                  title="XML Calabash: an XProc implementation"
                  shape="rect">XML Calabash</a> version 0.9.<em>12</em>), we get a different result:</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;cx:doc xmlns:cx="http://xmlcalabash.com/ns/extensions"&gt;
   &lt;name&gt;
      &lt;name&gt;value&lt;/name&gt;
   &lt;/name&gt;
   &lt;name2&gt;value2&lt;/name2&gt;
   &lt;frag&gt;
      &lt;doc&gt;
                &lt;p&gt;Some fragment. How doc/p is useful
          in a configuration file, I don't know.
          &lt;/p&gt;
             &lt;/doc&gt;
   &lt;/frag&gt;
&lt;/cx:doc&gt;
</pre>
                  </div>
               </div>
            </div>
            <p id="p24">Here our sequence has been passed successfully and each of the individual values has been preserved all the way through to XSLT.</p>
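            <p>As noted earlier, the intermediate variables aren't strictly necessary. A sketch of the same <tt class="literal">cx:foo</tt> call with the sequence computed directly might look like this. It's untested, and it assumes the general values extension is enabled and that the <tt class="literal">config</tt> port is declared as in the pipeline above:</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;cx:foo&gt;
  &lt;!-- The select expression is evaluated against the config document. --&gt;
  &lt;p:with-option name="param-seq"
                 select="(/config/name,
                          string(/config/name2),
                          /config/fragment/*)"&gt;
    &lt;p:pipe step="main" port="config"/&gt;
  &lt;/p:with-option&gt;
&lt;/cx:foo&gt;
</pre>
                  </div>
               </div>
            </div>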
            <div class="admonition">
               <table border="0" cellspacing="0" cellpadding="4"
                      summary="Presentation of a admonition">
                  <tbody>
                     <tr>
                        <td valign="top" rowspan="1" colspan="1">
                           <span class="admon-graphic">
                              <img alt="Important" src="/graphics/important.png"/>
                           </span>
                        </td>
                        <td rowspan="1" colspan="1">
                           <div class="admon-text">
                              <p id="p25">With the general values extension, XML Calabash <em>does not</em> implement XProc 1.0! It implements a closely related, but entirely non-standard language which you cannot expect to interoperate with other implementations.</p>
                           </div>
                        </td>
                     </tr>
                  </tbody>
               </table>
            </div>
            <p id="p26">There are still a few obvious weaknesses in this extension.</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p27">Implementing a non-standard extension is a bad thing. I probably should disable it completely.</p>
                  </li>
                  <li>
                     <p id="p28">There should be a mechanism (an <tt class="tag-attribute">as</tt> attribute, probably) to selectively enable this behavior. This would also allow for type-checking the values passed around.</p>
                  </li>
                  <li>
                     <p id="p29">The serialization used by <tt class="tag-starttag">&lt;p:parameters&gt;</tt> is incompletely supported. Although the serialization identifies the type of atomic values, the code which interprets this serialization ignores the types. Integers may go in, but strings come out.</p>
                  </li>
               </ol>
            </div>
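            <p>To make the second point concrete, a declaration along the following lines is one possibility. This is purely speculative: there is no <tt class="tag-attribute">as</tt> attribute in XProc 1.0 or in XML Calabash today, the <tt class="literal">label</tt> option is invented for contrast, and the syntax is simply borrowed from the sequence type notation in XSLT 2.0.</p>
            <div class="informalexample-wrapper">
               <div class="informalexample">
                  <div class="programlisting">
                     <pre xml:space="preserve">
&lt;!-- Hypothetical: the "as" attribute is not part of XProc 1.0. --&gt;
&lt;p:declare-step type="cx:foo"&gt;
  &lt;p:output port="result"/&gt;

  &lt;!-- Would keep its XDM value: a sequence of arbitrary items. --&gt;
  &lt;p:option name="param-seq" required="true" as="item()*"/&gt;

  &lt;!-- No "as": would behave as in standard XProc, always a string. --&gt;
  &lt;p:option name="label" select="'default'"/&gt;

  &lt;!-- Subpipeline omitted. --&gt;
&lt;/p:declare-step&gt;
</pre>
                  </div>
               </div>
            </div>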
            <p id="p30">This is an experimental feature. It may or may not survive over the long run. Comments most welcome.</p>
            <p id="p31">Remember: if you enable this extension, you are not running a conformant XProc processor. Your gun, your bullet, your foot.</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Java vs. AJAX</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/06/10/ajax"/>
      <id>http://norman.walsh.name/2009/06/10/ajax</id>
      <published>2009-06-10T20:22:10Z</published>
      <updated>2009-06-10T20:42:17Z</updated>
      <category term="ajax" scheme="http://technorati.com/tag/"/>
      <dc:subject>Ajax</dc:subject>
      <category term="java" scheme="http://technorati.com/tag/"/>
      <dc:subject>Java</dc:subject>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Watching the twitter stream from JavaOne go by, I was initially surprised by the apparent frontal assault on AJAX. It seemed like an odd target at first; on further reflection, not so much.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/06/10/ajax">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Watching the twitter stream from JavaOne go by, I was initially surprised by the apparent frontal assault on AJAX. It seemed like an odd target at first; on further reflection, not so much.</p>
            </div>
            <p id="p1">Pointing <a href="http://www.theregister.co.uk/2009/06/02/ellison_oracle_javafx/"
                  shape="rect">the big guns</a> of JavaOne at <a href="http://en.wikipedia.org/wiki/Ajax_%28programming%29"
                  title="Wikipedia: Ajax (programming)"
                  shape="rect">AJAX</a> seemed very strange when I first saw it go by. A bit like attacking tractors or escalators: it's a bit of infrastructure, a programming technique, not a competing vendor or product. But later, the possible significance dawned on me (maybe I'm slow on the uptake and this is
obvious to everyone else, I dunno).</p>
            <p id="p2">AJAX is a useful technique for building rich, information-centric applications, the sorts of applications that help readers get the most value from what they see and help companies get the most value out of the content they have.</p>
            <p id="p3">As it turns out, these are exactly the sorts of applications that are easy to build on top of <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a> (or, more broadly, with similar technologies; see also <a href="http://en.wikipedia.org/wiki/XRX_%28web_application_architecture%29"
                  title="Wikipedia: XRX (web application architecture)"
                  shape="rect">XRX</a>). The application runs in the browser (standard HTML+CSS+JavaScript) and uses AJAX (with XML or <a href="http://en.wikipedia.org/wiki/JSON" title="Wikipedia: JSON" shape="rect">JSON</a>, that's not really significant for the point I want to make) to provide a dynamic, interactive experience for the user. The server can very quickly search through large amounts of content and transform what it finds back into the nuggets needed to satisfy the AJAX requests. The result: easy to build, easy to deploy, responsive applications that make users happy: win, win, win.</p>
            <p id="p4">What's significant about this isn't how efficient it is, or how agile; what's significant is what's missing: there's no relational database (because that's a dumb way to store the rich content that drives applications like this) and there's no Java stack (because, well, because you just don't need anything as awkward and complex as a Java <a href="http://en.wikipedia.org/wiki/Application_server"
                  title="Wikipedia: Application server"
                  shape="rect">application server</a> to get the job done<sup class="footnote">[<a name="p4.2" href="#ftn.p4.2" id="p4.2" shape="rect">1</a>]</sup>).</p>
            <p id="p6">So, just maybe, AJAX is a kind of competitor to Oracle+Java. Credit where it's due, I never would have seen that.</p>
            <p id="p7">[ Yes, I'm coming late to the party. I put this in the “wait 24 hours” bucket and then forgot about it for a few days. –ed ]</p>
            <div class="footnotes">
               <hr width="100" align="left" class="footnotes-divider"/>
               <div class="footnote">
                  <p id="p5">
                     <sup>[<a href="#p4.2" name="ftn.p4.2" id="ftn.p4.2" shape="rect">1</a>]</sup>Java fans: please put down the flame throwers. I like Java. I'm a Java fan. It's my go-to language for building applications (like <a href="http://norman.walsh.name/2008/projects/calabash"
                        title="XML Calabash: an XProc implementation"
                        shape="rect">XML Calabash</a>). I just never drank the “application server” kool aid. If that was the answer, I always used to think, you must be asking the wrong question.</p>
               </div>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Building my own…</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/06/03/buildingMyOwn"/>
      <id>http://norman.walsh.name/2009/06/03/buildingMyOwn</id>
      <published>2009-06-03T16:55:11Z</published>
      <updated>2009-06-03T17:22:54Z</updated>
      <dc:subject>SelfReference</dc:subject>
      <category term="solaris" scheme="http://technorati.com/tag/"/>
      <dc:subject>Solaris</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>I've always wanted to pick out the components and build my own computer. Maybe the time has come.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/06/03/buildingMyOwn">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>I've always wanted to pick out the components and build my own computer. Maybe the time has come.</p>
            </div>
            <p id="p1">I've always wanted to build my own computer, but I've never done it. For years, my primary computer has been a laptop and it's not really practical to home-build one of those. For servers, I've just muddled along with spare boxes and cast-offs. Extra disk and memory in almost anything running <a href="http://en.wikipedia.org/wiki/Linux" title="Wikipedia: Linux" shape="rect">Linux</a> will get you a long way.</p>
            <p id="p2">My current cobbled together setup involves a slow, noisy desktop box in a closet and a handful of USB enclosures on my desk. Not making me happy. So what do I want?</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p3">I'd like a walloping big slug of disk space, in some sort of redundant/RAID configuration for storage and backups.</p>
                  </li>
                  <li>
                     <p id="p4">I'd like the system to function as a media server (for some vague definition of what media server means). It'd be nice if I could stream music to my laptop and video to the PS/3 in the living room.</p>
                  </li>
                  <li>
                     <p id="p5">I'd like to be able to build applications on it. That is, it should run a web server and ideally <a href="http://www.marklogic.com/product/marklogic-server.html" shape="rect">MarkLogic Server</a>.</p>
                  </li>
                  <li>
                     <p id="p6">I'd like it to <em>be quiet</em> and consume only reasonable amounts of power.</p>
                  </li>
                  <li>
                     <p id="p7">Naturally, I'd like it to be not too expensive.</p>
                  </li>
               </ol>
            </div>
            <p id="p8">That's probably the order of priority, too, though quiet is pretty important and I'm pretty cheap.</p>
            <p id="p9">I'm following what <span class="personname">
                  <span class="firstname">Adam</span> 
                  <span class="surname">Retter</span>
               </span> 
               <a href="http://www.adamretter.org.uk/blog/entries/diy-nas-software_and_hardware.xml"
                  shape="rect">is building</a>. <span class="personname">
                  <span class="firstname">Dave</span> 
                  <span class="surname">Pawson</span>
               </span> 
               <a href="http://www.dpawson.co.uk/nodesets/entries/090603.html" shape="rect">is building</a> one too (and pointed me to <a href="http://www.tomshardware.com/reviews/core-2-overclock,2146.html"
                  shape="rect">Tom's Hardware</a> for more ideas). Coincidentally, I see from this month's <a href="http://www.linuxjournal.com/" shape="rect">Linux Journal</a> that the <a href="http://en.wikipedia.org/wiki/Western%20Digital%20My%20Book"
                  title="Wikipedia: Western Digital My Book"
                  shape="rect">Western Digital My Book</a> 
               <a href="http://en.wikipedia.org/wiki/Western_Digital_My_Book#World_Edition"
                  shape="rect">World Edition</a> is a hackable NAS. That might be the least expensive way to go.</p>
            <p id="p10">But I still lean towards a Solaris box with four (or more) drive bays running ZFS.</p>
            <p id="p11">I wonder what the right answer is?</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Balisage 2009</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/06/03/balisage"/>
      <id>http://norman.walsh.name/2009/06/03/balisage</id>
      <published>2009-06-03T13:41:54Z</published>
      <updated>2009-06-03T14:04:42Z</updated>
      <category term="balisageConference" scheme="http://technorati.com/tag/"/>
      <dc:subject>Balisage</dc:subject>
      <category term="balisageConference09" scheme="http://technorati.com/tag/"/>
      <dc:subject>Balisage2009</dc:subject>
      <category term="marklogic" scheme="http://technorati.com/tag/"/>
      <dc:subject>MarkLogic</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Bring out yer demo for free beer!</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/06/03/balisage">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Bring out yer demo for free beer!</p>
            </div>
            <div class="epigraph">
               <p id="p2">Only passion, great passion, can elevate the soul to great things.</p>
               <div class="attribution">
                  <span class="mdash">—</span>
                  <span class="personname">
                     <span class="firstname">Denis</span> 
                     <span class="surname">Diderot</span>
                  </span>
               </div>
            </div>
            <p id="p1">I'm delighted to learn that the i's have been dotted, the t's crossed, and <a href="http://www.marklogic.com/" shape="rect">Mark Logic</a> is now an official sponsor of <a href="http://balisage.org/" shape="rect">Balisage 2009</a>. Good on us! You all know what a great conference I think it is, so I won't repeat myself. Much.</p>
            <p id="p3">In addition to straight-up sponsoring the conference, we're also going to run a “beer and demo jam” on Tuesday night. I'm not sure we've worked out all the details yet, but I know what we're using as a model.</p>
            <p id="p4">One evening of each of our company meetings is devoted to “Demo Jam” hosted with great panache by <span class="personname">
                  <span class="firstname">Matt</span> 
                  <span class="surname">Turner</span>
               </span> (author of <a href="http://xquery.typepad.com/" shape="rect">Discovering XQuery</a>). All the engineers and consultants are encouraged to demo something. A crowd gathers. Everyone has a few beers and then it's off to the races: five minutes to show anything you want. The winner is judged by the loudest
audience response (measured in true geek fashion, as carefully as one can under the circumstances, with a dB meter and everything). It's great fun all around.</p>
            <p id="p5">So, I don't know exactly what we'll do for beer and demos at Balisage, but if you're coming, bring along your favorite project. Obviously it doesn't have to be Mark Logic related, or XQuery related, or even, for that matter, probably, XML related, though I'm sorta guessing most of them will be.</p>
            <p id="p6">Good cheer and free beer, FTW!</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Miscellany</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/05/29/misc"/>
      <id>http://norman.walsh.name/2009/05/29/misc</id>
      <published>2009-05-29T17:12:26Z</published>
      <updated>2009-05-29T17:55:45Z</updated>
      <category term="balisageConference" scheme="http://technorati.com/tag/"/>
      <dc:subject>Balisage</dc:subject>
      <category term="balisageConference09" scheme="http://technorati.com/tag/"/>
      <dc:subject>Balisage2009</dc:subject>
      <dc:subject>SelfReference</dc:subject>
      <category term="voip" scheme="http://technorati.com/tag/"/>
      <dc:subject>VoIP</dc:subject>
      <category term="xmlss09" scheme="http://technorati.com/tag/"/>
      <dc:subject>XMLSummerSchool2009</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>On speaking engagements (two excellent conferences), anniversaries, and VoIP.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/05/29/misc">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>On speaking engagements (two excellent conferences), anniversaries, and VoIP.</p>
            </div>
            <div class="epigraph">
               <p id="p2">The years teach us much which the days never knew.</p>
               <div class="attribution">
                  <span class="mdash">—</span>
                  <span class="personname">
                     <span class="firstname">Ralph Waldo</span> 
                     <span class="surname">Emerson</span>
                  </span>
               </div>
            </div>
            <p id="p1">On the whole, I try not to travel more than I have to. I try to mitigate the travel by planning ahead. My plans this year now officially include (among other things) two marvelous events: <a href="http://balisage.org/" shape="rect">Balisage</a>, my absolute favorite XML gig, in <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=1240+Drummond,+Montr%C3%A9al,+Qu%C3%A9bec+H3G+1V7,+Canada&amp;sll=37.579413,-95.712891&amp;sspn=51.329563,57.832031&amp;ie=UTF8&amp;z=16&amp;iwloc=r1"
                  shape="rect">Montréal</a> in <a href="http://norman.walsh.name/2009/itinerary/08-10-balisage"
                  title="Balisage, 10-15 August"
                  shape="rect">August</a>, and <a href="http://xmlsummerschool.org/" shape="rect">XML Summer School</a> in <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=St+Edmunds+Hall,+Queen%27s+Lane+Oxford,+Oxford,+OX1+4,+uk&amp;sll=51.75374,-1.251491&amp;sspn=0.009949,0.014119&amp;ie=UTF8&amp;cd=1&amp;z=16"
                  shape="rect">Oxford</a> in <a href="http://norman.walsh.name/2009/itinerary/09-20-xmlss"
                  title="XML Summer School, 20-25 September"
                  shape="rect">September</a>. At Balisage, I'll be talking about XProc, at XML Summer School, open source XML applications and Web 2.0 technologies.</p>
            <p id="p3">Both events offer an opportunity to talk to some of the sharpest markup technologists in the world about the problems, peeves, or practices that you care most about. Highly recommended!</p>
            <p id="p4">In other news, my calendar reminds me that I decloaked this weblog six years ago today. My output has been down a bit of late, I know, but I still enjoy writing when I have the chance and I have (renewed) plans for tinkering with the technologies that underlie the site (more fun for me, perhaps, than you, but an important reason why I do it). Thanks for reading!</p>
            <p id="p5">Lastly, perhaps leastly in this mixed bag, I've decided to give <a href="http://en.wikipedia.org/wiki/Skype" title="Wikipedia: Skype" shape="rect">Skype</a> a try in a serious way, mostly because it's incredibly inexpensive but also because I've been using my cable company's VoIP technology for months now without any problems. (I tried Vonage a few years ago, but it was a total failure for me.)</p>
            <p id="p6">If you've a reason to call, my Skype ID is my first and last names (“norman” and “walsh”, respectively) all run together. The inbound phone number is <a href="tel:+1-413-284-4793" shape="rect">+1-413-284-4793</a>. That's a Palmer number, they tell me, basically one town over, because they didn't offer any numbers in Belchertown. (I've only paid for the number through August, 2009. So if it's September 2009 or after, you might want to ask before you call, in case this experiment fails.)</p>
            <p id="p7">Be seeing you! Or hearing you! Or both!</p>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
   <entry>
      <title>Perhaps the penultimate XProc draft</title>
      <link rel="alternate" type="text/html"
            href="http://norman.walsh.name/2009/05/28/xproc"/>
      <id>http://norman.walsh.name/2009/05/28/xproc</id>
      <published>2009-05-28T16:44:39Z</published>
      <updated>2009-05-28T17:58:08Z</updated>
      <category term="w3c" scheme="http://technorati.com/tag/"/>
      <dc:subject>W3C</dc:subject>
      <category term="xml" scheme="http://technorati.com/tag/"/>
      <dc:subject>XML</dc:subject>
      <category term="xproc" scheme="http://technorati.com/tag/"/>
      <dc:subject>XProc</dc:subject>
      <summary type="xhtml">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <p>Today, the XML Processing Model Working Group published a new working draft. Not the very last working draft, but possibly very close.</p>
         </div>
      </summary>
      <content type="xhtml" xml:base="http://norman.walsh.name/2009/05/28/xproc">
         <div xmlns="http://www.w3.org/1999/xhtml">
            <div class="abstract">
               <p>Today, the XML Processing Model Working Group published a new working draft. Not the very last working draft, but possibly very close.</p>
            </div>
            <p id="p1">For the past few months, the <a href="http://www.w3.org/XML/Processing/" shape="rect">XML Processing Model Working Group</a> has been busily resolving <a href="http://www.w3.org/XML/XProc/2008/11/cr-comments/" shape="rect">issues</a> raised during the <a href="http://www.w3.org/2005/10/Process-20051014/tr.html#cfi" shape="rect">CR</a>. So busy with <a href="http://en.wikipedia.org/wiki/XML_pipeline"
                  title="Wikipedia: XML pipeline"
                  shape="rect">XProc</a>
               <a href="/knows/what/xproc" shape="rect">
                  <img border="0" alt="[L]" src="/graphics/linkgroup.gif"/>
               </a>, in
fact, that we forgot about our <a href="http://www.w3.org/2005/10/Process-20051014/groups.html#three-month-rule"
                  shape="rect">heartbeat requirement</a>.</p>
            <p id="p2">Today, we published <a href="http://www.w3.org/TR/2009/CR-xproc-20090528/" shape="rect">a new working draft</a> of <em class="citetitle">
                  <a href="http://www.w3.org/TR/xproc/" shape="rect">XProc: An XML Pipeline Language</a>
               </em>. In addition to editorial improvements and clarifications, this draft contains a small number of significant changes:</p>
            <div class="orderedlist">
               <ol style="list-style: decimal;">
                  <li>
                     <p id="p3">Changed <tt class="tag-starttag">&lt;p:choose&gt;</tt> and <tt class="tag-starttag">&lt;p:try&gt;</tt>. An output port can now produce a sequence if that port can produce a sequence in any subpipeline.</p>
                  </li>
                  <li>
                     <p id="p4">Changed <tt class="tag-starttag">&lt;p:error&gt;</tt>. Added primary output port <tt class="literal">result</tt>.</p>
                  </li>
               </ol>
            </div>
            <p id="p5">Taken together, these two changes make it much easier for pipeline authors to write <tt class="tag-starttag">&lt;p:choose&gt;</tt> steps where one of the branches uses a <tt class="tag-starttag">&lt;p:error&gt;</tt>.</p>
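            <p>A minimal sketch of the kind of pipeline this is about, written against the syntax of the XProc 1.0 Recommendation rather than this draft, so details such as the <tt class="tag-attribute">version</tt> attribute may differ. The <tt class="literal">ex</tt> namespace, the error code, and the test expression are invented for illustration. Assuming <tt class="tag-starttag">&lt;p:error&gt;</tt> declares a primary <tt class="literal">result</tt> port, both branches end in a step with a primary output, so neither should need an explicit output declaration:</p>
            <pre class="programlisting">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns/errors"
                version="1.0"&gt;
  &lt;p:input port="source"/&gt;
  &lt;p:output port="result"/&gt;

  &lt;p:choose&gt;
    &lt;!-- Pass supported documents through unchanged. --&gt;
    &lt;p:when test="/doc/@version = '2'"&gt;
      &lt;p:identity/&gt;
    &lt;/p:when&gt;
    &lt;!-- Anything else fails the pipeline. p:error never actually
         produces a document, but its declared result port gives
         this branch a primary output as well. --&gt;
    &lt;p:otherwise&gt;
      &lt;p:error code="ex:unsupported-version"&gt;
        &lt;p:input port="source"&gt;
          &lt;p:inline&gt;
            &lt;message&gt;Unsupported document version.&lt;/message&gt;
          &lt;/p:inline&gt;
        &lt;/p:input&gt;
      &lt;/p:error&gt;
    &lt;/p:otherwise&gt;
  &lt;/p:choose&gt;
&lt;/p:declare-step&gt;</pre>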
            <div class="orderedlist">
               <ol start="3" style="list-style: decimal;">
                  <li>
                     <p id="p6">Changed <tt class="tag-starttag">&lt;p:exec&gt;</tt>. Added <tt class="option">arg-separator</tt>, <tt class="option">path-separator</tt>, and <tt class="option">failure-threshold</tt>. Input can be zero or one document only. Added support for a result code.</p>
                  </li>
               </ol>
            </div>
            <p id="p7">Implementor experience revealed that our design for <tt class="tag-starttag">&lt;p:exec&gt;</tt> was insufficient. These changes fix problems with platform-specific path separators and with arguments that contain spaces.</p>
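            <p>To illustrate the argument problem, here is a sketch that uses <tt class="option">arg-separator</tt> and <tt class="option">failure-threshold</tt>, again in XProc 1.0 Recommendation syntax, which may differ from this draft. The command, the file name, and the choice of <tt class="literal">|</tt> as separator are assumptions made for the example; <tt class="option">args</tt> and <tt class="option">result-is-xml</tt> are taken from the Recommendation's definition of the step, not from the list above.</p>
            <pre class="programlisting">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0"&gt;
  &lt;p:output port="result"/&gt;

  &lt;!-- Intended to run: cat -n "release notes.txt"
       With arg-separator="|", the space in the file name stays
       inside one argument instead of splitting it in two. --&gt;
  &lt;p:exec command="/bin/cat"
          args="-n|release notes.txt"
          arg-separator="|"
          result-is-xml="false"
          failure-threshold="0"&gt;
    &lt;!-- Nothing is written to the command's standard input. --&gt;
    &lt;p:input port="source"&gt;
      &lt;p:empty/&gt;
    &lt;/p:input&gt;
  &lt;/p:exec&gt;
&lt;/p:declare-step&gt;</pre>
            <p>With <tt class="option">failure-threshold</tt> set to <tt class="literal">0</tt>, the step is expected to fail if the command exits with any code greater than zero.</p>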
            <div class="orderedlist">
               <ol start="4" style="list-style: decimal;">
                  <li>
                     <p id="p8">Clarified <tt class="tag-starttag">&lt;p:http-request&gt;</tt>. Interaction with HTTP redirects and cookies is now explicit; the interaction of media types and serialization, several aspects of multipart messages, and a number of other areas have been clarified as well.</p>
                  </li>
               </ol>
            </div>
            <p id="p9">The <tt class="tag-starttag">&lt;p:http-request&gt;</tt> step was the subject of <em>a lot</em> of discussion. We made quite a few changes, almost exclusively clarifications.</p>
            <div class="orderedlist">
               <ol start="5" style="list-style: decimal;">
                  <li>
                     <p id="p10">Clarified the interpretation of base URIs and the <tt class="tag-attribute">xml:base</tt> attribute.</p>
                  </li>
               </ol>
            </div>
            <p id="p11">This last change clarifies that the inherent base URI of a node can exist independent of an <tt class="tag-attribute">xml:base</tt> attribute. In particular, removing the <tt class="tag-attribute">xml:base</tt> attribute does not change the inherent base URI. (Though adding one does change the base URI, of course, so perhaps “independent” wasn't exactly the right word.)</p>
            <div class="section">
               <h2 class="runin">We're almost done </h2>
               <p class="runin" id="p12">
                  <a id="almost" name="almost" shape="rect"/>The bottom line is that we really are almost finished. The test suite still needs to be fleshed out, and our implementors need to get to complete coverage, but I think that language evolution is coming to an end.</p>
               <p id="p13">Ironically, between requesting publication of this draft and its appearance today, the WG identified and corrected <em>one more</em> issue. We added a <tt class="function">p:value-available()</tt> function to allow pipeline authors to identify options that have no specified value.</p>
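               <p>A sketch of how such a function might be used, assuming the single-argument form <tt class="function">p:value-available('name')</tt> found in the XProc 1.0 Recommendation; the form the WG had in hand at this point may differ. The option name <tt class="option">match</tt> and the pipeline itself are hypothetical.</p>
               <pre class="programlisting">&lt;p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0"&gt;
  &lt;p:input port="source"/&gt;
  &lt;p:output port="result"/&gt;
  &lt;!-- Optional: no default, so it may have no value at all. --&gt;
  &lt;p:option name="match"/&gt;

  &lt;p:choose&gt;
    &lt;!-- Only refer to $match when the caller supplied it. --&gt;
    &lt;p:when test="p:value-available('match')"&gt;
      &lt;p:delete&gt;
        &lt;p:with-option name="match" select="$match"/&gt;
      &lt;/p:delete&gt;
    &lt;/p:when&gt;
    &lt;!-- Otherwise pass the document through untouched. --&gt;
    &lt;p:otherwise&gt;
      &lt;p:identity/&gt;
    &lt;/p:otherwise&gt;
  &lt;/p:choose&gt;
&lt;/p:declare-step&gt;</pre>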
               <p id="p14">Clearly, we can't absolutely promise nothing else will change, but the chair is setting the bar pretty high.</p>
            </div>
            <div id="newcomment"/>
            <div class="footer"/>
         </div>
      </content>
   </entry>
</feed>