DocBook Title Pages with XSLT 2.0
XSLT 2.0 is around the corner and I’ve started to think about how the DocBook XSL Stylesheets might be improved in an XSLT 2.0 version. One of the things that I’d like to address is the clumsy way title pages are currently handled.
Note
This essay is a bit of a deep dive on XSLT 2.0 and handling a hard customization problem in the DocBook stylesheets. If your interests don’t run along those lines, well, you may get a little bored (more than usual, I mean :-)).
XSLT 2.0 is around the corner (it’s probably a little too early to describe it as “just around the corner”) and I’ve started to think about how the DocBook XSL Stylesheets might be improved in an XSLT 2.0 version. (No, the XSLT 1.0 version isn’t going away any time soon, have no fear.)
One of the things that I’d like to address is the clumsy way that title pages are currently handled. One goal of the stylesheets is to allow end users to customize them without becoming XSLT template-writing wizards. There’s no question that if you’re an XSLT hack, you can do almost anything: the trick is arranging things so that the average mortal can do stuff too.
Lots of customizations (double-sided printing, admonition graphics, section numbering, etc.) can be achieved with simple parameters, but title pages don’t fall into that category. There’s a lot of variation possible in title pages and no easy way to describe it in terms of scalar parameters.
The current solution is to write an XML document that contains a template that describes the title page you’d like, then process that template with a stylesheet to produce a stylesheet that you actually incorporate into your customization (yet another stylesheet) which you use to produce your output. That’s way too many steps, way too much complexity. The stylesheet produced from the template, by the way, is a horrifying thing.
I think we can do better now. Let’s begin by making three simplifying assumptions:
- We’re using XSLT 2.0.
- We’re processing DocBook NG (aka 5.0, we think, eventually). Specifically, we’re processing DocBook elements that can be identified by the fact that they’re in a DocBook namespace.
- The customizer is going to have to write some templates; we can’t do this with scalar parameters. The goal is to keep things as simple as possible.
Let’s start with a parameter to describe the title page. How does this look?
<xsl:param name="preface.titlepage.recto">
  <div class="titlepage">
    <db:title/>
    <db:subtitle/>
    <db:pubdate/>
    <db:author/>
    <db:releaseinfo role="cvs"/>
    <db:copyright/>
    <db:abstract/>
  </div>
  <hr/>
</xsl:param>
That says that the title page for a Preface should be wrapped in a <div class="titlepage"> tag, should include title, subtitle, publication date, author, release info identified as “cvs” with the role attribute, copyright, and abstract elements, and should be separated from the rest of the document with an <hr/>.
The idea is that the structure of the title page (the div and <hr/> in this case) is written just the way the customizer wants it. When the stylesheet processes this parameter, it will replace each of the DocBook elements with the result of processing the matching element from the source document.
Unfortunately, it isn’t quite that simple. Some elements, like bibliography, have an optional title. If it isn’t specified in the source document, we need to provide a default. Unfortunately, if it isn’t in the source document, there isn’t going to be any element to match, so we’re going to have to be able to call named templates. A couple of attributes from another namespace will provide that functionality:
<xsl:param name="bibliography.titlepage.recto">
  <div class="titlepage">
    <db:title t:named-template="component.title"
              t:force="1"/>
    <db:subtitle/>
  </div>
  <hr/>
</xsl:param>
- The t:named-template attribute tells the processor to call a named template to handle the title, instead of just processing the db:title in a mode.
- The t:force attribute tells the processor to call the named template even if the db:title element isn’t present in the source document.
That’s not too bad.
Ideally, the only thing a customizer should have to do is redefine the appropriate title page template.
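As a rough sketch of what that might look like in practice, a customization layer could import the base stylesheet and override a single parameter. The import path and the URI bound to the t prefix below are placeholders of mine, not names taken from the actual stylesheets, and the DocBook namespace URI is assumed to be the one used for DocBook 5.0.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns:t="http://example.com/ns/titlepage"
                exclude-result-prefixes="db t"
                version="2.0">

  <!-- Hypothetical location of the base XSLT 2.0 stylesheet. -->
  <xsl:import href="docbook-xslt2/html/docbook.xsl"/>

  <!-- Override only the recto title page for prefaces:
       just a title and the authors, followed by a rule. -->
  <xsl:param name="preface.titlepage.recto">
    <div class="titlepage">
      <db:title/>
      <db:author/>
    </div>
    <hr/>
  </xsl:param>

</xsl:stylesheet>
```

Because the importing stylesheet has higher import precedence, its definition of preface.titlepage.recto would replace the default one, and nothing else needs to change.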
Now, before we get started, you might be wondering why we don’t just put all the markup in this template. Like this, perhaps:
<xsl:param name="preface.titlepage.recto">
  <div class="titlepage">
    <h1><db:title/></h1>
    <h2><db:subtitle/></h2>
    <h3>Published on: <db:pubdate/></h3>
    <h3>By <db:author/></h3>
    <h4><db:releaseinfo role="cvs"/></h4>
    <h4><db:copyright/></h4>
    <div class="abstract">
      <db:abstract/>
    </div>
  </div>
  <hr/>
</xsl:param>
The problem is that some elements can be repeated, and sometimes at different levels (e.g. authorgroup vs. author). Consider a document with multiple authors: you probably wouldn’t want to repeat the generated text “By ” in front of each of them. Any system rich enough to handle that much complexity would, I fear, become as complex as writing the XSLT by hand. So instead, we rely on XSLT templates for each of the individual elements to generate the markup for that element, and we hope that the supplied defaults work most of the time.
Ok, back to work. How are we going to process our template parameter? First, we have to make sure something gets called to do the processing, so at the beginning of each XSLT template for an element that can have a title page (book, preface, …, bibliography, etc.) we call a titlepage template. Here’s what a template for preface might look like:
<xsl:template match="db:preface">
  <div class="preface">
    <xsl:call-template name="titlepage">
      <xsl:with-param name="info"
                      select="db:title|db:subtitle|db:titleabbrev|db:info/*"/>
      <xsl:with-param name="content" select="$preface.titlepage.recto"/>
    </xsl:call-template>
    <xsl:call-template name="titlepage">
      <xsl:with-param name="side" select="'verso'"/>
      <xsl:with-param name="info"
                      select="db:title|db:subtitle|db:titleabbrev|db:info/*"/>
      <xsl:with-param name="content" select="$preface.titlepage.verso"/>
    </xsl:call-template>
    <xsl:apply-templates/>
  </div>
</xsl:template>
- The important part here is the call to the titlepage template, once for the recto titlepage and once for the verso (for XHTML output, the verso one is probably almost always empty, but it’s needed in the FO output). We pass in the list of “info elements” from the source document and the appropriate titlepage parameter.
- After processing the title page, of course, we have to go on and process the rest of the elements in the preface.
The titlepage named template looks like this:
<xsl:template name="titlepage">
  <xsl:param name="side" select="'recto'" as="xs:string"/>
  <xsl:param name="info" as="element()*"/>
  <xsl:param name="content"/>
  <xsl:if test="$content instance of document-node()">
    <xsl:apply-templates select="$content" mode="titlepage-templates">
      <xsl:with-param name="side" select="$side" tunnel="yes"/>
      <xsl:with-param name="node" select="." tunnel="yes"/>
      <xsl:with-param name="info" select="$info" tunnel="yes"/>
    </xsl:apply-templates>
  </xsl:if>
</xsl:template>
There are a few XSLT 2.0 features to consider here:
- We specify the argument types where it will help catch errors. In this case the “side”, which defaults to “recto”, must be a string and “info”, the info elements from the source document, must be a sequence of elements.
- We only do further processing if the “content” passed in is a document node (i.e., if the parameter is an empty sequence, we do nothing).
- We’re taking advantage of tunnel parameters to make argument passing simpler.
- From this point on, we’ll be processing the elements in the titlepage parameter, so we pass along the important context (the preface, for example) in node.
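For readers who haven’t met tunnel parameters before, here is a small standalone sketch, unrelated to the DocBook stylesheets, of why they simplify argument passing. The template names outer and inner are made up for the illustration. The point is that the intermediate template never mentions side, yet the value still reaches the template that asks for it.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:call-template name="outer">
      <!-- Passed as a tunnel parameter, not an ordinary one. -->
      <xsl:with-param name="side" select="'verso'" tunnel="yes"/>
    </xsl:call-template>
  </xsl:template>

  <!-- The intermediate template neither declares nor forwards "side". -->
  <xsl:template name="outer">
    <xsl:call-template name="inner"/>
  </xsl:template>

  <!-- Declaring the parameter with tunnel="yes" picks the value up;
       the default only applies if nothing was tunnelled in. -->
  <xsl:template name="inner">
    <xsl:param name="side" select="'recto'" tunnel="yes"/>
    <xsl:value-of select="$side"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should emit verso rather than the default recto. The same mechanism is what lets side, node, and info reach the per-element templates without every template in between having to pass them along explicitly.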
XSLT 2.0 features aside, the practical upshot of titlepage is that it processes each of the elements of the title page parameter in “titlepage-templates” mode.
In titlepage-templates mode, anything not in the DocBook namespace is just passed straight through. But we need to do some extra work for DocBook elements:
<xsl:template match="db:*" mode="titlepage-templates" priority="1000">
  <xsl:param name="side" tunnel="yes"/>
  <xsl:param name="node" tunnel="yes"/>
  <xsl:param name="info" tunnel="yes"/>
  <xsl:variable name="this" select="."/>
  <xsl:variable name="content" select="$info[dbf:node-matches($this,.)]"/>
  <xsl:choose>
    <xsl:when test="@t:named-template">
      <xsl:if test="$content or (@t:force and @t:force != 0)">
        <xsl:call-template name="titlepage-templates">
          <xsl:with-param name="template" select="@t:named-template" tunnel="yes"/>
          <xsl:with-param name="content" select="$content" tunnel="yes"/>
          <xsl:with-param name="this" select="$this" tunnel="yes"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:when>
    <xsl:otherwise>
      <!-- $content may be empty -->
      <xsl:apply-templates select="$content" mode="titlepage.mode"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
- The node-matches function is an XSLT 2.0 stylesheet function; we’ll get to that in a minute. For any given DocBook element in the titlepage parameter, we select the info elements from the source document that match it.
- If the titlepage parameter element has a t:named-template attribute, then…
- If some elements were selected, or it also has a non-zero t:force attribute, then…
- We call the titlepage-templates template to process it, passing the current template node as a parameter.
- Otherwise, we just process the source document’s matching elements in titlepage.mode.
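The pass-through half of titlepage-templates mode isn’t shown in this essay. One plausible way to write it, offered only as a sketch and not necessarily how the real stylesheets do it, is a pair of identity-style templates restricted to that mode. The fragment assumes it sits inside a stylesheet where the xsl prefix is bound as usual.

```xml
<!-- Copy any element through, keep its attributes, and keep walking
     its children in the same mode. The db:* template has priority
     1000, so it still wins for DocBook elements. -->
<xsl:template match="*" mode="titlepage-templates">
  <xsl:copy copy-namespaces="no">
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates mode="titlepage-templates"/>
  </xsl:copy>
</xsl:template>

<!-- Text, comments, and processing instructions are copied as-is. -->
<xsl:template match="text()|comment()|processing-instruction()"
              mode="titlepage-templates">
  <xsl:copy/>
</xsl:template>
```

With something like this in place, a div in the title page parameter comes out as a div, and a db:title nested inside it is still intercepted and replaced by the processed title from the source document.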
The node-matches function looks like this:
<xsl:function name="dbf:node-matches" as="xs:boolean">
  <xsl:param name="template-node" as="element()"/>
  <xsl:param name="document-node" as="element()"/>
  <xsl:choose>
    <xsl:when test="node-name($template-node) = node-name($document-node)">
      <xsl:variable name="attrMatch">
        <xsl:for-each select="$template-node/@*[namespace-uri(.) = '']">
          <xsl:variable name="aname" select="local-name(.)"/>
          <xsl:variable name="attr" select="$document-node/@*[local-name(.) = $aname]"/>
          <xsl:choose>
            <xsl:when test="$attr = .">1</xsl:when>
            <xsl:otherwise>0</xsl:otherwise>
          </xsl:choose>
        </xsl:for-each>
      </xsl:variable>
      <xsl:choose>
        <xsl:when test="not(contains($attrMatch, '0'))">
          <xsl:value-of select="dbf:user-node-matches($template-node, $document-node)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="false()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="false()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
- Two nodes can only match if they have the same QName.
- If they do, we have to loop through all the attributes on the template node (that aren’t namespace qualified).
- If the document element has a corresponding attribute with the same value, it matches.
- Otherwise, it doesn’t match.
- If all the attributes that were specified did match, then we have a matching node. Otherwise we don’t.
If the node matches, we call user-node-matches, which gives the customizer an opportunity to add additional selection criteria. The default implementation of user-node-matches always returns true().
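The same matching rule can probably be stated more compactly with an XPath 2.0 quantified expression. The following is a sketch of that alternative together with a default user-node-matches that always returns true(). It is a fragment that assumes the xs and dbf prefixes are declared on the enclosing stylesheet, and it is meant as an equivalent formulation, not as the code the stylesheets actually use.

```xml
<!-- Alternative formulation: same name, every unqualified attribute on
     the template node present with the same value on the document node,
     and the customizer's extra test also passes. -->
<xsl:function name="dbf:node-matches" as="xs:boolean">
  <xsl:param name="template-node" as="element()"/>
  <xsl:param name="document-node" as="element()"/>
  <xsl:sequence select="
      node-name($template-node) = node-name($document-node)
      and (every $attr in $template-node/@*[namespace-uri(.) = '']
           satisfies $document-node/@*[local-name(.) = local-name($attr)] = $attr)
      and dbf:user-node-matches($template-node, $document-node)"/>
</xsl:function>

<!-- Default extension point: impose no additional criteria. -->
<xsl:function name="dbf:user-node-matches" as="xs:boolean">
  <xsl:param name="template-node" as="element()"/>
  <xsl:param name="document-node" as="element()"/>
  <xsl:sequence select="true()"/>
</xsl:function>
```

If the document node lacks one of the attributes, the comparison against an empty sequence is false, so the every clause fails just as the explicit loop does. A customizer would override dbf:user-node-matches in an importing stylesheet to narrow the selection further.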
The titlepage-templates template looks like this:
<xsl:template name="titlepage-templates">
  <xsl:param name="side" tunnel="yes"/>
  <xsl:param name="node" tunnel="yes"/>
  <xsl:param name="info" tunnel="yes"/>
  <xsl:param name="template" tunnel="yes"/>
  <xsl:param name="content" tunnel="yes"/>
  <xsl:param name="this" tunnel="yes"/>
  <xsl:choose>
    <xsl:when test="$template = 'component.title'">
      <xsl:call-template name="component.title"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="user-titlepage-templates"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
XSLT doesn’t allow template names to be dynamically constructed, so our mechanism for calling the named template identified in t:named-template devolves into a big xsl:choose statement. The user-titlepage-templates template is an extension point for customizers that add a new name. The default implementation of user-titlepage-templates terminates the stylesheet.
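A default along those lines could be as small as the sketch below. The wording of the message is my own invention; the only behaviour taken from the description above is that an unrecognized name stops the transformation. The fragment assumes it lives in the same stylesheet as titlepage-templates, so the template parameter arrives through the tunnel.

```xml
<!-- Default extension point: any name we don't recognize is an error. -->
<xsl:template name="user-titlepage-templates">
  <xsl:param name="template" tunnel="yes"/>
  <xsl:message terminate="yes">
    <xsl:text>Unknown t:named-template: </xsl:text>
    <xsl:value-of select="$template"/>
  </xsl:message>
</xsl:template>
```

A customizer who introduces a new name would override this template in an importing stylesheet, test $template against the new name, call the corresponding named template, and fall back to the terminating message for anything else.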
That’s the bulk of the work. Each template for a DocBook element in “titlepage.mode” is passed three parameters: the side, the node, and the set of info elements.
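To make that concrete, here is a hedged sketch of what a titlepage.mode template for author might do with those parameters. It uses info to emit the generated text “By ” only in front of the first author, which is the kind of decision the title page parameter itself can’t easily express. The h3 wrapper and the plain apply-templates call are placeholders of mine, not the stylesheets’ real markup, and the fragment assumes the db prefix is bound to the DocBook namespace.

```xml
<xsl:template match="db:author" mode="titlepage.mode">
  <xsl:param name="side" tunnel="yes"/>
  <xsl:param name="node" tunnel="yes"/>
  <xsl:param name="info" tunnel="yes" as="element()*"/>
  <h3>
    <!-- Only the first author among the info elements gets the label. -->
    <xsl:if test=". is $info[self::db:author][1]">
      <xsl:text>By </xsl:text>
    </xsl:if>
    <!-- Placeholder: real code would format the name properly. -->
    <xsl:apply-templates/>
  </h3>
</xsl:template>
```

The parameters arrive without any explicit xsl:with-param on the apply-templates call that gets us here, because they were tunnelled in from the titlepage template.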
To see all these bits in action, run this example through your favorite XSLT 2.0 processor. The input document is irrelevant; the stylesheet always uses an internal document as its test data.
Disclaimer: the stylesheets, like DocBook itself, are very much a group effort and these scribblings are just my initial thoughts; they are subject to change. Though if they do change, I’ll try to write about that too.
In terms of XSLT 2.0, I don’t know which is more interesting, the fact that I used so few type-related features, or the fact that I used any at all. The few that I did use could easily be removed.
Could it be done in XSLT 1.0? Yes, probably, with the node-set extension function and a fair bit of cleverness to replace the node-matches stylesheet function.
Comments
Thanks Norm for the meaty offering. I'm really fascinated by this sort of thing. I'm interested in custom XML template languages for how they can make software easier to read, as well as empower others that have different skills. I'm also interested in the role XSLT plays in implementing them. I respect your relentlessness in seeking the perfect solution.
I think that node-matches function can be simplified even further using the XPath 2.0 construct "every ... in ... satisfies ...".
It is very hard to do something more complicated in XSLT 1.0 once one becomes more familiar with 2.0. :-(