Useful non-conformance: on the intersection of text/plain documents, RFC 5147, XInclude, and XML Calabash.
I believe I implemented this back in October, but appear never to have written about it. It came in handy (yet again) while writing my Balisage paper, so here it is.
In technical documentation, you're often writing about plain text artifacts that have significant semantics: program listings, shell scripts, configuration files, etc.; in the XML world: stylesheets, schemas, etc. Cutting and pasting these files into your documentation is always a mistake: if you're documenting a real system, these files always change over time and keeping them up-to-date that way is a nightmare. Also, embedding them often makes it very difficult to check for errors.
Incorporating whole files as plain text is straightforward:
<xi:include href="examples/schema.rnc" parse="text"/>
If your authors can be persuaded to write some fairly awkward prose (“See lines 24-35 in Listing 3.12 on page 92.”), this is almost enough. But note that it's still a nightmare to maintain, because lines 24-35 might be different after some bug is fixed.
What you really want is the ability to incorporate small sections of
larger files in individual examples. (“The foobar function from
interesting.cpp is shown in Listing 4.9; its salient features ...”)
If your build environment is managed by developers willing to work a little, it might be possible to set up the system so that the real files can be automatically broken into chunks that authors can incorporate with XInclude. But it's a lot of work and there are lots of folks working without such generous developers.
What you really want is a way for authors to point to a file and
extract from it just the relevant lines, without involving already
overworked developers. In an XML context, you want a fragment
identifier for plain text.
Enter RFC 5147. RFC 5147 provides just that, a fragment identifier scheme for plain text documents. It allows you to refer to both ranges of characters within a plain text document and ranges of lines. For extra bonus points, it even supports integrity check mechanisms for identifying cases when the file may have changed, making the fragment identifiers unreliable.
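To make the position model concrete, here is a minimal Python sketch (mine, not anything taken from the RFC or from XML Calabash) of how character and line ranges select text. As I read RFC 5147, positions count the gaps between characters or lines, starting at 0 before the first one, so a range behaves like a slice: the RFC's own example says that line=10,20 identifies lines 11 to 20. The function name select_range and its signature are hypothetical, and the sketch leaves out fragment parsing and integrity checks entirely.

```python
def select_range(text: str, scheme: str, start: int, end: int) -> str:
    """Select the text between two RFC 5147-style positions.

    Hypothetical helper for illustration only. Positions count the gaps
    between items, with 0 before the first character or line, so the
    pair (start, end) maps directly onto a Python slice.
    """
    if scheme == "char":
        return text[start:end]
    if scheme == "line":
        # keepends=True keeps each line's terminator in the result.
        # Caveat: str.splitlines() also breaks on a few separators
        # beyond CR, LF, and CRLF.
        lines = text.splitlines(keepends=True)
        return "".join(lines[start:end])
    raise ValueError(f"unsupported scheme: {scheme!r}")


sample = "alpha\nbeta\ngamma\ndelta\n"
second_and_third = select_range(sample, "line", 1, 3)
first_word = select_range(sample, "char", 0, 5)
```

Here second_and_third holds the second and third lines of the sample, which is the off-by-one to watch for if you think of the numbers as 1-based line numbers.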
Great! So I can use it in XInclude today to...why are you shaking your head?
Here we run afoul of standardization. Folks like me who write standards often struggle with interoperability and the future. Interoperability, the point of standardization, encourages us to tighten every screw we can, to nail down every edge case we can think of so that different implementations will interpret the specification in the same, interoperable way. And yet, we never know the future. We can but gaze at cloudy crystal balls in an attempt to imagine how needs might change.
How does this relate to the problem at hand? Well, for (good,
valid) reasons I won't go into now, XInclude separated the fragment identifier
out of the URI into a separate attribute called xpointer.
For good, valid reasons, XPointers are explicitly about addressing into XML, and,
for good, valid reasons, XInclude says the xpointer
attribute is only allowed when parsing XML.
All of which means we have no way of using the RFC 5147 fragment identifier scheme in XInclude.
Except, [expletive deleted, -ed] that!
In XML Calabash, I decided to allow the xpointer attribute to use
RFC 5147 fragment identifiers when parse="text". That means that in
my Balisage paper, for example, I can write things like this:
<xi:include href="examples/schema.rnc" parse="text" xpointer="text(line=20,44;length=1032)"/>
This extracts portions of a text file. XML Calabash only supports the “length”
integrity constraint, but that works fine. If I edit the file,
then the length is likely to change and this XPointer will fail. (When this
XPointer fails, my pipeline will fail, and I'll know that there's something
I have to address: I won't silently get the wrong text on lines 20-44.)
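The fail-loudly behavior is easy to sketch. The Python below is an illustration under stated assumptions, not XML Calabash's actual code: the names include_lines and FragmentIntegrityError are made up, and it assumes the length is counted in characters of the decoded text (check RFC 5147 for the exact definition before relying on that).

```python
class FragmentIntegrityError(Exception):
    """Raised when the target text no longer matches the recorded length."""


def include_lines(text: str, start: int, end: int, expected_length: int) -> str:
    """Return the requested line range only if the length check passes.

    Hypothetical helper. Assumption: the length integrity value is the
    number of characters in the whole decoded document.
    """
    actual_length = len(text)
    if actual_length != expected_length:
        # Refuse to guess: a changed file means the line numbers are suspect.
        raise FragmentIntegrityError(
            f"expected length {expected_length}, found {actual_length}"
        )
    return "".join(text.splitlines(keepends=True)[start:end])


original = "alpha\nbeta\ngamma\ndelta\n"   # 23 characters
edited = "alpha\nNEW LINE\nbeta\ngamma\ndelta\n"

excerpt = include_lines(original, 1, 3, 23)

try:
    include_lines(edited, 1, 3, 23)
    edit_detected = False
except FragmentIntegrityError:
    edit_detected = True
```

The point of the design is that an edited file produces an error rather than a quietly shifted excerpt; the author then re-checks the range and records the new length.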
As a nod to standards compliance (I do write standards after
all), you must enable the
extension for this to work. This extension makes XML Calabash
non-conformant (in precisely the way you want it to be).
As a further nod to standards compliance (I do write standards after all!), I've persuaded the XML Core Working Group at the W3C to include this issue in the XInclude 1.1 Requirements and Use Cases. Your feedback is encouraged.
Share and enjoy!