DocBook Title Pages with XSLT 2.0

Volume 7, Issue 135; 27 Jul 2004; last modified 08 Oct 2010

XSLT 2.0 is around the corner and I’ve started to think about how the DocBook XSL Stylesheets might be improved in an XSLT 2.0 version. One of the things that I’d like to address is the clumsy way title pages are currently handled.

Note

This essay is a bit of a deep dive on XSLT 2.0 and handling a hard customization problem in the DocBook stylesheets. If your interests don’t run along those lines, well, you may get a little bored (more than usual, I mean :-)).

XSLT 2.0 is around the corner (it’s probably a little too early to describe it as “just around the corner”) and I’ve started to think about how the DocBook XSL Stylesheets might be improved in an XSLT 2.0 version. (No, the XSLT 1.0 version isn’t going away any time soon, have no fear.)

One of the things that I’d like to address is the clumsy way that title pages are currently handled. One goal of the stylesheets is to allow end users to customize them without becoming XSLT template-writing wizards. There’s no question that if you’re an XSLT hack, you can do almost anything: the trick is arranging things so that the average mortal can do stuff too.

Lots of customizations (double-sided printing, admonition graphics, section numbering, etc.) can be achieved with simple parameters, but title pages don’t fall into that category. There’s a lot of variation possible in title pages and no easy way to describe it in terms of scalar parameters.

The current solution is to write an XML document that contains a template that describes the title page you’d like, then process that template with a stylesheet to produce a stylesheet that you actually incorporate into your customization (yet another stylesheet) which you use to produce your output. That’s way too many steps, way too much complexity. The stylesheet produced from the template, by the way, is a horrifying thing.

I think we can do better now. Let’s begin by making three simplifying assumptions:

We’re using XSLT 2.0.
We’re processing DocBook NG (aka 5.0, we think, eventually). Specifically, we’re processing DocBook elements that can be identified by the fact that they’re in a DocBook namespace.
The customizer is going to have to write some templates; we can’t do this with scalar parameters. The goal is to keep things as simple as possible.

Let’s start with a parameter to describe the title page. How does this look?

<xsl:param name="preface.titlepage.recto">
  <div class="titlepage">
    <db:title/>
    <db:subtitle/>
    <db:pubdate/>
    <db:author/>
    <db:releaseinfo role="cvs"/>
    <db:copyright/>
    <db:abstract/>
  </div>
  <hr/>
</xsl:param>

That says that the title page for a Preface should be wrapped in a <div class="titlepage"> tag, should include title, subtitle, publication date, author, release info identified as “cvs” with the role attribute, copyright, and abstract elements, and should be separated from the rest of the document with an <hr/>.

The idea is that the structure of the title page (the div and <hr/> in this case) is written just the way the customizer wants it. When the stylesheet processes this parameter, it will replace each of the DocBook elements with the result of processing the matching element from the source document.

Unfortunately, it isn’t quite that simple. Some elements, like bibliography, have an optional title. If it isn’t specified in the source document, we need to provide a default. But if it isn’t in the source document, there isn’t going to be any element to match, so we’re going to have to be able to call named templates. A couple of attributes from another namespace will provide that functionality:

<xsl:param name="bibliography.titlepage.recto">
  <div class="titlepage">
    <db:title t:named-template="component.title"
              t:force="1"/>
    <db:subtitle/>
  </div>
  <hr/>
</xsl:param>

3: The t:named-template attribute tells the process to call a named template to handle the title, instead of just processing the db:title in a mode.
4: The t:force attribute tells the process to call the named template even if the db:title element isn’t present in the source document.

That’s not too bad.

Ideally, the only thing a customizer should have to do is redefine the appropriate title page template.
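As a rough sketch of what such a customization might look like, the layer below imports the base stylesheet and overrides a single parameter. The import path docbook.xsl is a placeholder, and the namespace URI bound to db is assumed to be the DocBook 5.0 namespace; neither is taken from the stylesheets themselves.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                exclude-result-prefixes="db"
                version="2.0">

  <!-- Placeholder path: point this at wherever the base stylesheet lives -->
  <xsl:import href="docbook.xsl"/>

  <!-- Override: a sparser preface title page with just title and author.
       The importing stylesheet has higher import precedence, so this
       definition replaces the default one. -->
  <xsl:param name="preface.titlepage.recto">
    <div class="titlepage">
      <db:title/>
      <db:author/>
    </div>
  </xsl:param>

</xsl:stylesheet>
```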

Now, before we get started, you might be wondering why we don’t just put all the markup in this template. Like this, perhaps:

<xsl:param name="preface.titlepage.recto">
  <div class="titlepage">
    <h1><db:title/></h1>
    <h2><db:subtitle/></h2>
    <h3>Published on: <db:pubdate/></h3>
    <h3>By <db:author/></h3>
    <h4><db:releaseinfo role="cvs"/></h4>
    <h4><db:copyright/></h4>
    <div class="abstract">
      <db:abstract/>
    </div>
  </div>
  <hr/>
</xsl:param>

The problem is that some elements can be repeated, and sometimes at different levels (e.g. authorgroup vs. author). Consider a document with multiple authors: you probably wouldn’t want to repeat the generated text “By ” in front of each of them. Any system rich enough to handle that much complexity would, I fear, become as complex as writing the XSLT by hand. So instead, we rely on XSLT templates for each of the individual elements to generate the markup for that element, and we hope that the supplied defaults work most of the time.

Ok, back to work. How are we going to process our template parameter?

First, we have to make sure something gets called to do the processing, so at the beginning of each XSLT template for an element that can have a title page (book, preface, …, bibliography, etc.) we call a titlepage template. Here’s what a template for preface might look like:

<xsl:template match="db:preface">
  <div class="preface">
    <xsl:call-template name="titlepage">
      <xsl:with-param name="info"
                      select="db:title|db:subtitle|db:titleabbrev|db:info/*"/>
      <xsl:with-param name="content" select="$preface.titlepage.recto"/>
    </xsl:call-template>

    <xsl:call-template name="titlepage">
      <xsl:with-param name="side" select="'verso'"/>
      <xsl:with-param name="info"
                      select="db:title|db:subtitle|db:titleabbrev|db:info/*"/>
      <xsl:with-param name="content" select="$preface.titlepage.verso"/>
    </xsl:call-template>

    <xsl:apply-templates/>
  </div>
</xsl:template>

3: The important part here is the pair of calls to the titlepage template, once for the recto title page and once for the verso (for XHTML output, the verso one is probably almost always empty, but it’s needed in the FO output). We pass in the list of “info elements” from the source document and the appropriate titlepage parameter.
16: After processing the title page, of course, we have to go on and process the rest of the elements in the preface.

The titlepage named-template looks like this:

<xsl:template name="titlepage">
  <xsl:param name="side" select="'recto'" as="xs:string"/>
  <xsl:param name="info" as="element()*"/>
  <xsl:param name="content"/>

  <xsl:if test="$content instance of document-node()">
    <xsl:apply-templates select="$content" mode="titlepage-templates">
      <xsl:with-param name="side" select="$side" tunnel="yes"/>
      <xsl:with-param name="node" select="." tunnel="yes"/>
      <xsl:with-param name="info" select="$info" tunnel="yes"/>
    </xsl:apply-templates>
  </xsl:if>
</xsl:template>

There are a few XSLT 2.0 features to consider here:

2: We specify the argument types where it will help catch errors. In this case the “side”, which defaults to “recto”, must be a string and “info”, the info elements from the source document, must be a sequence of elements.
6: We only do further processing if the “content” passed in is a document node (i.e., if the parameter is an empty sequence, we do nothing).
8: We’re taking advantage of tunnel parameters to make argument passing simpler.
9: From this point on, we’ll be processing the elements in the titlepage parameter, so we pass along the important context (the preface, for example) in node.
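For readers who haven’t met tunnel parameters yet, here is a minimal, standalone sketch of the mechanism. The mode names outer and inner are made up for the illustration. The point is that the intermediate template never declares or forwards side, yet the innermost template still receives the value.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:output method="text"/>

  <!-- Start the chain, setting the tunnel parameter once -->
  <xsl:template match="/">
    <xsl:apply-templates select="*" mode="outer">
      <xsl:with-param name="side" select="'verso'" tunnel="yes"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Intermediate step: knows nothing about "side" -->
  <xsl:template match="*" mode="outer">
    <xsl:apply-templates select="." mode="inner"/>
  </xsl:template>

  <!-- Final step: picks the tunnelled value up again; the default
       'recto' is used only if nothing was tunnelled -->
  <xsl:template match="*" mode="inner">
    <xsl:param name="side" select="'recto'" tunnel="yes"/>
    <xsl:value-of select="$side"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against any well-formed input, this should emit the tunnelled value verso rather than the default recto.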

XSLT 2.0 features aside, the practical upshot of titlepage is that it processes each of the elements of the title page parameter in “titlepage-templates” mode.

In titlepage-templates mode, anything not in the DocBook namespace is just passed straight through. But we need to do some extra work for DocBook elements:

<xsl:template match="db:*" mode="titlepage-templates" priority="1000">
  <xsl:param name="side" tunnel="yes"/>
  <xsl:param name="node" tunnel="yes"/>
  <xsl:param name="info" tunnel="yes"/>

  <xsl:variable name="this" select="."/>
  <xsl:variable name="content" select="$info[dbf:node-matches($this,.)]"/>

  <xsl:choose>
    <xsl:when test="@t:named-template">
      <xsl:if test="$content or (@t:force and @t:force != 0)">
        <xsl:call-template name="titlepage-templates">
          <xsl:with-param name="template" select="@t:named-template" tunnel="yes"/>
          <xsl:with-param name="content" select="$content" tunnel="yes"/>
          <xsl:with-param name="this" select="$this" tunnel="yes"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:when>
    <xsl:otherwise>
      <!-- $content may be empty -->
      <xsl:apply-templates select="$content" mode="titlepage.mode"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

7: The node-matches function is an XSLT 2.0 stylesheet function; we’ll get to that in a minute. For any given DocBook element in the titlepage parameter, we select the info elements from the source document that match it.
10: If the titlepage parameter element has a t:named-template attribute, then…
11: If some elements were selected, or it also has a non-zero t:force attribute, then…
12: We call the titlepage-templates template to process it, passing
15: the current template node as a parameter.
21: Otherwise, we just process the source document’s matching elements in titlepage.mode.
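The pass-through rule for elements outside the DocBook namespace isn’t shown above. A minimal sketch, assuming a plain recursive copy is all that’s wanted, might look like this:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Assumed pass-through rule: copy the element and its attributes,
       then keep walking the titlepage parameter in the same mode.
       The db:* template takes over for DocBook elements because of
       its higher priority. Tunnel parameters flow through untouched. -->
  <xsl:template match="*" mode="titlepage-templates">
    <xsl:copy copy-namespaces="no">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="titlepage-templates"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes need no rule of their own here, since the built-in template rule copies them in every mode.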

The node-matches function looks like this:

<xsl:function name="dbf:node-matches" as="xs:boolean">
  <xsl:param name="template-node" as="element()"/>
  <xsl:param name="document-node" as="element()"/>

  <xsl:choose>
    <xsl:when test="node-name($template-node) = node-name($document-node)">
      <xsl:variable name="attrMatch">
        <xsl:for-each select="$template-node/@*[namespace-uri(.) = '']">
          <xsl:variable name="aname" select="local-name(.)"/>
          <xsl:variable name="attr" select="$document-node/@*[local-name(.) = $aname]"/>
          <xsl:choose>
            <xsl:when test="$attr = .">1</xsl:when>
            <xsl:otherwise>0</xsl:otherwise>
          </xsl:choose>
        </xsl:for-each>
      </xsl:variable>

      <xsl:choose>
        <xsl:when test="not(contains($attrMatch, '0'))">
          <xsl:value-of select="dbf:user-node-matches($template-node, $document-node)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="false()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="false()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

6: Two nodes can only match if they have the same QName.
8: If they do, we have to loop through all the attributes on the template node (that aren’t namespace qualified).
12: If the document element has a corresponding attribute with the same value, it matches.
13: Otherwise, it doesn’t match.
19: If all the attributes that were specified did match, then we have a matching node. Otherwise we don’t.

If the node matches, we call user-node-matches, which gives the customizer an opportunity to add additional selection criteria. The default implementation of user-node-matches always returns true().
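A default along those lines could be as small as the following sketch. The namespace URI bound to dbf is a placeholder, since the real one isn’t given here; the signature simply mirrors node-matches.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:dbf="http://example.org/ns/docbook/functions"
                exclude-result-prefixes="xs dbf"
                version="2.0">

  <!-- Sketch of the default: accept every node that has already passed
       the name and attribute tests. A customizer would override this
       function to add further selection criteria. -->
  <xsl:function name="dbf:user-node-matches" as="xs:boolean">
    <xsl:param name="template-node" as="element()"/>
    <xsl:param name="document-node" as="element()"/>
    <xsl:sequence select="true()"/>
  </xsl:function>

</xsl:stylesheet>
```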

The titlepage-templates template looks like this:

<xsl:template name="titlepage-templates">
  <xsl:param name="side" tunnel="yes"/>
  <xsl:param name="node" tunnel="yes"/>
  <xsl:param name="info" tunnel="yes"/>
  <xsl:param name="template" tunnel="yes"/>
  <xsl:param name="content" tunnel="yes"/>
  <xsl:param name="this" tunnel="yes"/>

  <xsl:choose>
    <xsl:when test="$template = 'component.title'">
      <xsl:call-template name="component.title"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="user-titlepage-templates"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

XSLT doesn’t allow template names to be dynamically constructed, so our mechanism for calling the named template identified in t:named-template devolves into a big xsl:choose statement. The user-titlepage-templates template is an extension point for customizers who add a new name. The default implementation of user-titlepage-templates terminates the stylesheet.
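A terminating default might be sketched as follows. The wording of the message is invented for the illustration; the template parameter arrives through the tunnel, so the caller doesn’t have to pass it explicitly.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- Sketch of the default extension point: no customizer has claimed
       this name, so report it and stop the transformation. -->
  <xsl:template name="user-titlepage-templates">
    <xsl:param name="template" tunnel="yes"/>
    <xsl:message terminate="yes">
      <xsl:text>Unknown titlepage named template: </xsl:text>
      <xsl:value-of select="$template"/>
    </xsl:message>
  </xsl:template>

</xsl:stylesheet>
```

A customizer who invents a new name would override this template with an xsl:choose of their own, ideally keeping a terminating xsl:otherwise branch for anything still unrecognized.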

That’s the bulk of the work. Each template for a DocBook element in “titlepage.mode” is passed three parameters: the side, the node, and the set of info elements.
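To make that concrete, a titlepage.mode template for title might look roughly like the sketch below. The h1 markup and the class name are invented for the illustration, and the namespace URI bound to db is assumed to be the DocBook 5.0 one.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                exclude-result-prefixes="db"
                version="2.0">

  <!-- The three tunnel parameters are declared so the template can use
       them; none of the intermediate templates had to forward them. -->
  <xsl:template match="db:title" mode="titlepage.mode">
    <xsl:param name="side" tunnel="yes"/>
    <xsl:param name="node" tunnel="yes"/>
    <xsl:param name="info" tunnel="yes"/>

    <!-- $node is the element that owns the title page (the preface,
         for example), so its name can drive the generated markup -->
    <h1 class="{local-name($node)}-title">
      <xsl:apply-templates/>
    </h1>
  </xsl:template>

</xsl:stylesheet>
```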

To see all these bits in action, run this example through your favorite XSLT 2.0 processor. The input document is irrelevant; the stylesheet always uses an internal document as its test data.

Disclaimer: the stylesheets, like DocBook itself, are very much a group effort and these scribblings are just my initial thoughts; they are subject to change. Though if they do change, I’ll try to write about that too.

In terms of XSLT 2.0, I don’t know which is more interesting, the fact that I used so few type-related features, or the fact that I used any at all. The few that I did use could easily be removed.

Could it be done in XSLT 1.0? Yes, probably, with the node-set extension function and a fair bit of cleverness to replace the node-matches stylesheet function.

Comments

Thanks Norm for the meaty offering. I'm really fascinated by this sort of thing. I'm interested in custom XML template languages for how they can make software easier to read, as well as empower others that have different skills. I'm also interested in the role XSLT plays in implementing them. I respect your relentlessness in seeking the perfect solution.

I think the node-matches function can be simplified even further using the XPath 2.0 construct "every ... in ... satisfies ...".
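A sketch of how that suggestion might look, assuming the same function signature and the same user-node-matches hook as in the essay. The namespace URI bound to dbf is a placeholder, and the one-line stub for user-node-matches is there only to keep the fragment self-contained.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:dbf="http://example.org/ns/docbook/functions"
                exclude-result-prefixes="xs dbf"
                version="2.0">

  <!-- Same three tests as the original: equal QNames, every
       unqualified template attribute present with the same value on
       the document element, and the customizer's hook agreeing. -->
  <xsl:function name="dbf:node-matches" as="xs:boolean">
    <xsl:param name="template-node" as="element()"/>
    <xsl:param name="document-node" as="element()"/>
    <xsl:sequence select="
      node-name($template-node) eq node-name($document-node)
      and (every $a in $template-node/@*[namespace-uri(.) = '']
           satisfies $document-node/@*[local-name(.) = local-name($a)] = $a)
      and dbf:user-node-matches($template-node, $document-node)"/>
  </xsl:function>

  <!-- Stub for the extension hook -->
  <xsl:function name="dbf:user-node-matches" as="xs:boolean">
    <xsl:param name="template-node" as="element()"/>
    <xsl:param name="document-node" as="element()"/>
    <xsl:sequence select="true()"/>
  </xsl:function>

</xsl:stylesheet>
```

If the document element lacks one of the attributes, the comparison against an empty sequence is false, so the node is rejected just as in the loop-based version.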

It is very hard to do something more complicated in XSLT 1.0 once one becomes more familiar with 2.0. :-(