NYMUG Summary

Volume 12, Issue 41; 12 Nov 2009; last modified 08 Oct 2010

Last night, I spoke at the inaugural New York Mark Logic User Group meeting. I think it was a crowd pleaser, or at least, the punchline at the end was.

The real purpose of a user group is to bring together users (and prospective users). There was much lively discussion after my presentation, which I won't attempt to recapitulate here. The next NYMUG meeting will (most likely) be sometime in January, so please plan to come. I'll post more concrete details here when they're available, and we'll send them to the mailing list, of course.

My biggest problem in preparing for speaking events is figuring out what to talk about, and then figuring out how to stretch that topic to fit in the time allotted.

When I was suggested as the speaker for our inaugural New York meeting, I had to figure out what to talk about. I quickly thought of a topic, but never imagined that it'd be possible. To my delight and surprise, when I shopped the idea around engineering and marketing, there was universal support for the idea. So after I picked my jaw up off the floor, I had to turn my attention to stretching the topic.

The topic I had in mind would easily fit on a couple of slides. That would make for a pretty short talk. At a user group meeting, maybe that wouldn't be all bad, but I felt I needed to say a bit more. I stretched things out by talking about three new and cool features that I thought some folks might not have seen or used yet.

Support for https: and URI rewriting

Support for https: is pretty self-explanatory. Lots of sites have private information (user profiles and passwords, for example) that should not (many would say “must not”) be sent over an insecure communications channel. Furthermore, most users have been trained to look for a secure connection before sending credit card or other financial information over the web.

Support for URI rewriting allows application authors to make cleaner interfaces. I'm a fan of clean URI interfaces. Call me picky, but I think it's a lot better to expose the 4th slide in your presentation as http://example.com/slides/nymug/4 than as http://example.com/slides.xqy?deck=nymug.xml&foil=4&format=html.

Until recently, if you wanted to do this with a web site built on top of MarkLogic Server, you had to put up a proxy of some sort (often an Apache web server) to provide https: support and URI rewriting.

Starting with MarkLogic Server V4.1, this is no longer necessary. MarkLogic Server now supports https: (with your own certificate or a generated, self-signed one) out of the box. The server also supports URI rewriting by allowing you to designate an arbitrary query module to rewrite URIs. Here's the example I used in the presentation:

xquery version "1.0-ml";

declare variable $url as xs:string
        := xdmp:get-request-url();

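(: The then/else branches below are a reconstruction: the target URI is
   assumed from the long-form slides.xqy example given earlier. :)
if (matches($url, "^/slides/([^/]+)/([0-9]+)$"))
then replace($url, "^/slides/([^/]+)/([0-9]+)$",
             "/slides.xqy?deck=$1.xml&amp;foil=$2&amp;format=html")
else $url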

It's an incomplete, toy example taken from the real code I used on the server behind my presentation, but it gives you a flavor of how it works. Your module starts with the URL that was used (and has access to the headers and other parts of the request), performs any sort of computation that you'd like, and returns the new URI. The new URI then goes into the server and is processed normally.
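To illustrate "processed normally", here is a minimal sketch of a module on the receiving end of a rewritten URI. It is not the code behind the presentation: the module name /slides.xqy and the field names deck, foil, and format are assumptions borrowed from the long-form URI shown earlier. It only reads the request fields and echoes them back; a real module would go on to render the slide.

```xquery
xquery version "1.0-ml";

(: Hypothetical /slides.xqy, the module a rewritten URI would point at.
   The field names deck, foil, and format are assumptions. :)
declare variable $deck   as xs:string := xdmp:get-request-field("deck", "")[1];
declare variable $foil   as xs:string := xdmp:get-request-field("foil", "1")[1];
declare variable $format as xs:string := xdmp:get-request-field("format", "html")[1];

if ($deck eq "" or not($foil castable as xs:positiveInteger))
then (
  (: No deck named, or a slide number that is not a positive integer. :)
  xdmp:set-response-code(404, "Not Found"),
  "No such slide"
)
else
  (: Placeholder result: echo the parsed request instead of rendering. :)
  <request deck="{$deck}" foil="{xs:integer($foil)}" format="{$format}"/>
```

The point of the split is that the rewriter only maps one URI onto another, so the receiving module never needs to know what the public URI looked like.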

There may still be good reasons to put a proxy in front of MarkLogic Server (load balancing, etc.), but you no longer have to do so just to get these two common and simple features.

Office Toolkits

For reasons that will become clear later on (and not only because I find “office applications” to be an inefficient, frustrating, pointless time-suck), I wanted to present this presentation using ordinary web technologies. (Specifically, HTML+CSS+JavaScript served up by MarkLogic Server.)

At the same time, because I was going to talk about new server features, I was required to present a disclaimer:


All statements describing future releases, estimated release dates and content are plans only, and Mark Logic is under no obligation to develop, include or make available, commercially or otherwise, any specific feature or functionality in any Mark Logic product.

Information is provided for general understanding and informational purposes only, and is subject to change at the sole discretion of Mark Logic in response to changing customer requirements, market conditions, delivery schedules and other factors.

(A disclaimer that applies as much to this weblog essay as it did to my presentation last night, I might add.)

Trouble is, this slide was sent to me in Powerpoint. To use it, I'd have to switch to Powerpoint for the rest of my presentation (yuck!), copy and paste the text (where's the fun in that?), or find some way to incorporate the slide into my DocBook-based slide deck (now that sounds interesting).

Luckily, one of our engineers, Pete Aven, has already done all the heavy lifting. Pete's the primary developer of Mark Logic's open source toolkits for Word, Excel, and Powerpoint. Each toolkit provides a pipeline for ingesting office documents into MarkLogic Server, an office plugin for using the server from the application, and some XQuery code to work with the files in the server.

With that framework in place, it was pretty easy to write a little bit of XQuery code that would extract a slide from a deck and transform it into DocBook. (That's about 20 lines of code, nothing fancy: it extracts paragraphs and bulleted lists from slides, no more, no less.)
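For a sense of what such an extraction might involve, here is a rough sketch. It is not the code from the toolkit or the presentation. The function names are hypothetical, and the bullet heuristic (paragraphs in a placeholder shape become list items, paragraphs in a plain text box become paras) is a simplifying assumption rather than a faithful reading of PresentationML bullet inheritance.

```xquery
xquery version "1.0-ml";

declare namespace p  = "http://schemas.openxmlformats.org/presentationml/2006/main";
declare namespace a  = "http://schemas.openxmlformats.org/drawingml/2006/main";
declare namespace db = "http://docbook.org/ns/docbook";

(: Text of one DrawingML paragraph: its a:t runs, concatenated. :)
declare function local:text($para as element(a:p)) as xs:string
{
  string-join($para//a:t/string(), "")
};

(: Hypothetical helper: one PresentationML slide in, one DocBook foil out. :)
declare function local:slide-to-foil($slide as element(p:sld)) as element(db:foil)
{
  let $shapes := $slide/p:cSld/p:spTree/p:sp
  let $title  := $shapes[p:nvSpPr/p:nvPr/p:ph/@type = ("title", "ctrTitle")][1]
  return
    <db:foil>
      <db:title>{ string-join($title//a:t/string(), "") }</db:title>
      {
        for $shape in $shapes except $title
        let $paras := $shape/p:txBody/a:p[normalize-space(.) ne ""]
        where exists($paras)
        return
          (: Assumption: placeholder shapes hold bullets, text boxes hold prose. :)
          if (exists($shape/p:nvSpPr/p:nvPr/p:ph))
          then
            <db:itemizedlist>{
              for $para in $paras
              return <db:listitem><db:para>{ local:text($para) }</db:para></db:listitem>
            }</db:itemizedlist>
          else
            for $para in $paras
            return <db:para>{ local:text($para) }</db:para>
      }
    </db:foil>
};

(: Tiny inline slide for illustration: a title plus a two-paragraph text box. :)
local:slide-to-foil(
  <p:sld>
    <p:cSld>
      <p:spTree>
        <p:sp>
          <p:nvSpPr><p:nvPr><p:ph type="title"/></p:nvPr></p:nvSpPr>
          <p:txBody><a:p><a:r><a:t>Disclaimer</a:t></a:r></a:p></p:txBody>
        </p:sp>
        <p:sp>
          <p:nvSpPr><p:nvPr/></p:nvSpPr>
          <p:txBody>
            <a:p><a:r><a:t>All statements are </a:t></a:r><a:r><a:t>plans only.</a:t></a:r></a:p>
            <a:p><a:r><a:t>Information is subject to change.</a:t></a:r></a:p>
          </p:txBody>
        </p:sp>
      </p:spTree>
    </p:cSld>
  </p:sld>
)
```

For the inline slide, this should come out as a foil with a title and two para elements. Real decks add complications (nested bullet levels, tables, pictures) that a sketch this size ignores.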

The source for my final presentation looks like this:

<slides xmlns="http://docbook.org/ns/docbook">
    <title>Transforming XML Development <?lb?>with MarkLogic</title>


    <foil pptx="/pptx/disclaimer" foil="1"/>


a straightforward mixture of hand-authored slides and references to slides from a couple of Powerpoint decks. For the presentation, I edited a Powerpoint slide and ran it back through the process in real time, but that doesn't translate very well to a weblog essay.
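One way to resolve those references is a recursive copy that swaps each foil carrying a pptx attribute for the converted slide and leaves everything else alone. The sketch below assumes exactly that. The function local:fetch-slide is a hypothetical stand-in that returns a placeholder foil instead of reading a stored deck, and the namespace used is the standard DocBook V5.0 one.

```xquery
xquery version "1.0-ml";

declare namespace db = "http://docbook.org/ns/docbook";

(: Hypothetical stand-in for the real lookup and conversion: it returns
   a placeholder foil instead of reading the deck from the database. :)
declare function local:fetch-slide($deck as xs:string, $number as xs:integer)
  as element(db:foil)
{
  <db:foil>
    <db:title>Slide { $number } from { $deck }</db:title>
  </db:foil>
};

(: Copy a tree, replacing each foil that points at a Powerpoint deck. :)
declare function local:expand($node as node()) as node()*
{
  typeswitch ($node)
    case element(db:foil) return
      if (exists($node/@pptx))
      then local:fetch-slide(string($node/@pptx),
                             xs:integer(($node/@foil, 1)[1]))
      else $node
    case element() return
      element { node-name($node) } {
        $node/@*,
        for $child in $node/node() return local:expand($child)
      }
    default return $node
};

local:expand(
  <db:slides>
    <db:title>Transforming XML Development</db:title>
    <db:foil pptx="/pptx/disclaimer" foil="1"/>
    <db:foil><db:title>A hand-authored slide</db:title></db:foil>
  </db:slides>
)
```

Because the expansion happens before any rendering, whatever turns the deck into HTML only ever sees ordinary DocBook foils.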


That left just the last part of my talk, “the big reveal” as it were. Having set this all up so that I can author in DocBook, even including Powerpoint slides, serve it up over https:, and use nice looking URIs, I still have to go the last mile and get the content into the browser.

A couple of obvious ways present themselves. I could serve it up as XML with a stylesheet and let the browser do the work. Could do, but I didn't. I could translate the DocBook markup into (X)HTML using XQuery in MarkLogic Server. Could do, but I didn't.

What I really want, but haven't been able to do, is to transform the DocBook in MarkLogic Server with XSLT. And <cue>drum roll</cue> … I can haz! <cue>cymbal crash</cue>

let $doc := xdmp:document-get(concat($ROOT, $xml))/*
let $expanded := local:expand-powerpoint($doc)
let $map := map:map()
let $put := map:put($map, ...(xs:QName("dbp:foil")), $foil)
let $put := map:put($map, ...(xs:QName("dbp:deck")), $deck)
let $put := map:put($map, ...(xs:QName("dbp:format")), $format) (: $format is assumed :)
return
  xdmp:xslt-invoke($xslt, $expanded, $map)

Running an internal build, I can demonstrate support for XSLT 2.0 in MarkLogic Server. (Go back and read that disclaimer again now.)

There was some rejoicing at the user group meeting; I hope there's some rejoicing out there over the intertubes too. I'm certainly giddy with delight over the prospects of high-performance XSLT processing in MarkLogic Server V.future.

What more can I say? When you've done your best trick, it's time to get off the stage.

Thanks again to Steve Kotrch and Simon & Schuster for hosting. Hope to see you all in January!


Hi, I'm a MarkLogic Server user wannabe, and I have to say I was a bit shocked to read that the current version doesn't support XSLT transformations. Or is it just that it doesn't support XSLT 2.0? (Although, it seems that the standard DocBook stylesheets are XSLT 1.0.)

But my real question is, what's up with this _slides_ document? I'm also new to DocBook, and am currently working my way through TDG, and I don't see any reference to slides. Google gives me a hodgepodge of links; the most authoritative I found is here: http://me.in-berlin.de/~miwie/presentations/html/dbslides.html , which indicates that this is an extension you wrote, but the link there is broken. Is there a definitive RelaxNG grammar for these? Does it exist as an extension to DocBook 5.0?


—Posted by Klortho on 13 Nov 2009 @ 05:16 UTC #

The current release of MarkLogic Server doesn't support XSLT, only XQuery. Hopefully that won't be true for very much longer :-)

Slides is one of a number of informal extensions to DocBook (Website and Simplified are the others that come immediately to mind). Now that DocBook V5.0 is official, I'll see about getting new releases of these customization layers out.

—Posted by Norman Walsh on 13 Nov 2009 @ 07:06 UTC #