Building a bigger pipeline

Volume 12, Issue 8; 26 Mar 2009; last modified 31 Aug 2011

Constructing a “real world” XProc pipeline: building the XProc specification with XProc.

Recurses! Called again!

This essay explores XProc in more detail by constructing a fairly complex “real world” example of an XProc pipeline. This task formed the heart of my presentation at XML Prague, a few days ago. This essay was composed from the notes I made before that presentation and from my recollection of what I said. As a result, it's a bit rambling.

To be clear from the start, this is an essay about what you can do with XProc and not an essay that attempts to motivate why you could or should want to do it. I've already written about why pipelines are a good thing. Vojtěch Toman also spoke about XProc at XML Prague (and is also on the working group); his presentation was more motivational than mine, if that's what you're after.

My goal here is to build something that does real work and to see how XProc can be used to address a number of common document-processing tasks. Along the way, I'll examine both the strengths and weaknesses of XProc and address at least one weakness with an extension.

XML's near ubiquity means that the range of applications that perform “XML processing” is exceptionally broad. These applications range from what many of us would recognize as traditional document processing on one end to almost binary, interprocess communication on the other.

XProc's design aims to be applicable wherever the tasks at hand can be described as the application of a series of transformations of XML documents. That's neither a particularly “document centric” view nor a particularly “data centric” view.

However, for many of us, the canonical examples of XProc involve traditional document processing steps: validation, transformation, and XInclude, for example.

It follows that XProc had better be applicable to real world document transformation tasks. To test XProc's capabilities in this area, we'll examine what is a very real-world example to me: construction of XProc: An XML Pipeline Language, the XProc specification itself, from its constituent parts. At the conference, after my talk, someone pointed out that it might have been better to choose some problem that involved a mashup of various web services. Maybe, but that's not what I did (at least this time).

The final document, langspec.html (called simply spec.html in the graphics due to space constraints), that is the formatted prose of the XProc specification, is produced by a process of moderate complexity:

First, some example sources are transformed into examples; this transformation allows us to have testable examples but include only the relevant snippets in the prose of the specification. Next, we build a glossary from the specification source, transforming all the inline definitions into a traditional “back of the book” glossary. These parts, plus other XML parts, are XIncluded together. From this XIncluded whole, a number of ancillary files are generated. Finally, this whole, plus some of those ancillary files, is transformed into the HTML version of the specification.

In the time before XProc, this kind of process was usually managed with Unix Makefiles or ant build scripts. In the particular case of the XProc specification, Unix Makefiles.

The most significant rules in the Makefile are these:

,langspec.xml: langspec.xml parallel.xml \
               schemas/xproc.rnc schemas/xproc.rng \
               standard-components.xml references.xml glossary.xml \
               error-vocabulary.xml conformance.xml namespace-fixup.xml \
               language-summary.xml error-codes.xml
	$(MAKE) -C examples
	$(XINCLUDE) $< > $@

langspec.html: ,langspec.xml typed-pipeline-library.xml \
               ../style/docbook.xsl ../style/dbspec.xsl \
               ../style/xprocns.xsl ../style/rngsyntax.xsl
	$(XJPARSE) $<
	$(SAXON) $< ../style/dbspec.xsl $@
	$(TIDY) --doctype loose --output-xhtml true -q -utf8 -mn $@

Ignoring the recursive make call for a moment, these two rules boil down to this: use XInclude to combine the sources, then validate the sources, style them, and tidy the result.

That's a straightforward pipeline:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xml:id="b1-xhtml" xml:base="examples/b1-xhtml.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:exec command="/Users/ndw/bin/tidy-stream"/>

  <p:store href="langspec.html" method="xhtml"/>
</p:declare-step>

This pipeline accepts parameters, which will be passed to the XSLT step automatically if they're provided, but it has no inputs or outputs. Instead, the initial XInclude step loads the document from disk, the XIncluded document flows through the validate, XSLT, and tidy steps, then gets serialized and stored on disk at the end by the store step.

Now, as it turns out, there's a problem with this setup. The purpose of the tidy step is to introduce hacks into the XHTML serialization to support various browser bugs and idiosyncrasies. Reparsing that and reserializing it defeats the purpose.

What it boils down to is that the output from the tidy step isn't really XML, it's just text. But that's ok, we can do that:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xml:id="b1-text" xml:base="examples/b1-text.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:exec result-is-xml="false" command="/Users/ndw/bin/tidy-stream"/>

  <p:store href="langspec.html" method="text"/>
</p:declare-step>

By specifying that the result from the exec step is not XML, we get the results back as text; subsequently serializing that result as text stores the right data to disk.

You could argue that it's inefficient and unnecessary to load the text only to immediately store it again. In fact, I wrote the “tidy-stream” wrapper around my normal tidy process just to simplify the pipelines. In practice, tidy doesn't stream anything; it modifies the documents directly on disk. That pipeline looks like this:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xml:id="b2-nstidy" xml:base="examples/b2-nstidy.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:store href="langspec.html"/>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" args="langspec.html">
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

Here we store the result of the transformation to disk and then run tidy to clean it up. The sink step at the very end explicitly discards the output from exec because we don't care about it anymore.

Unfortunately, this pipeline is not guaranteed to work. The only constraints on step order in XProc are those required by the connections between the steps. If step “B” consumes the output of step “A”, then step B can't finish before step A starts!

But if you look at the store and exec steps in the preceding pipeline, you'll see that there are no connections between them. Effectively, we have a pipeline with two independent sub-pipelines.

As a result, there's no dependency between the store and exec steps. The pipeline processor can execute them in any order, even in parallel. But the correct result requires that the p:store step be executed before p:exec.
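To make that ordering freedom concrete, here is a minimal Python sketch (not XProc, and not how any real processor schedules work). It assumes a simplified model in which steps run one at a time and a connection only means “the producer comes first”; the step names and the `valid_orders` helper are invented for illustration.

```python
from itertools import permutations


def valid_orders(steps, connections):
    """List every sequential order in which each producer precedes its consumer."""
    orders = []
    for order in permutations(steps):
        position = {step: index for index, step in enumerate(order)}
        if all(position[src] < position[dst] for src, dst in connections):
            orders.append(order)
    return orders


steps = ["xinclude", "validate", "xslt", "store", "exec"]

# Connections in the pipeline above: exec reads nothing from store.
connections = [("xinclude", "validate"), ("validate", "xslt"), ("xslt", "store")]
unconstrained = valid_orders(steps, connections)

# Adding a store -> exec connection leaves exactly one legal order.
constrained = valid_orders(steps, connections + [("store", "exec")])

print(len(unconstrained), "orders without the dependency")
print(len(constrained), "order with it:", constrained[0])
```

In this toy model five orders are legal without the extra connection, and four of them run exec before the file has been stored.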

There are a few obvious ways to fix this. The most obvious is to introduce a dependency:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xml:id="b2-depend" xml:base="examples/b2-depend.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store" href="langspec.html"/>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy">
    <p:input port="source"><p:empty/></p:input>
    <p:with-option name="args" select="string(/)">
      <p:pipe step="store" port="result"/>
    </p:with-option>
  </p:exec>

  <p:sink/>
</p:declare-step>

The output from the store step is a single result element that contains the absolute URI of the location to which store saved its input. In this example, we use that result as the value of the args option, guaranteeing both a connection between the two steps (and therefore a dependency) and that we tidy the right file.
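As a rough illustration of what `string(/)` produces from that result document, the following Python sketch parses a hypothetical store result and extracts its string value. The URI is invented for the example; `c:result` in the XProc step namespace is the element the store step emits.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical output of the store step; the URI is made up for this example.
store_result = (
    '<c:result xmlns:c="http://www.w3.org/ns/xproc-step">'
    "file:/projects/docs/langspec.html"
    "</c:result>"
)

root = ET.fromstring(store_result)

# Equivalent of XPath string(/): the concatenated text of the document.
args = "".join(root.itertext())

# The value is a URI, not a bare file name; its path component is separate.
path = urlparse(args).path

print(args)
print(path)
```

Whether a given command accepts a `file:` URI as an argument depends on the command, so a real pipeline might need to convert the value first.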

But for the sake of argument, suppose that we couldn't do that. Suppose that these two steps were independent but we wanted to force a particular order in order to preserve some side effect.

Version 1.0 of the XProc specification does not provide any mechanism for asserting these “out of band” dependencies. The Working Group didn't feel like it was in a position to accurately predict the sorts of dependencies that authors might need or want to assert, so we're leaving any such mechanisms as implementation extensions. If a clear consensus emerges about what is needed, it will likely become part of some future version of XProc.

What you can do, though it's clearly a bit of a hack, is introduce a choose that depends on the output of the store.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xml:id="b2-choose" xml:base="examples/b2-choose.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store" href="langspec.html"/>

  <p:choose>
    <p:xpath-context>
      <p:pipe step="store" port="result"/>
    </p:xpath-context>
    <p:when test="/this-will-always-be-empty">
      <p:error code="hcf">
        <p:input port="source">
          <p:inline><message>This can't happen.</message></p:inline>
        </p:input>
      </p:error>
    </p:when>
    <p:otherwise>
      <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" args="langspec.html">
        <p:input port="source"><p:empty/></p:input>
      </p:exec>
      <p:sink/>
    </p:otherwise>
  </p:choose>

</p:declare-step>

Or, being an implementor, I can just introduce an extension.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xml:id="b2-extn" xml:base="examples/b2-extn.xpl">
  <p:input port="parameters" kind="parameter"/>

  <p:xinclude>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store" href="langspec.html"/>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" args="langspec.html" cx:depends-on="store">
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

That very straightforward pipeline appears to get the job done, but it doesn't handle any of the underlying complexity. Before we begin to attack that complexity, let's modularize our pipeline a bit. Some things to keep in mind:

  • User-defined pipelines are first-class steps.

  • Think in terms of reuse.

  • Standard software engineering practices apply.

With that in mind, let's decompose our pipeline into a library of steps that we can call.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:id="b3-modular" xml:base="examples/b3-modular.xpl">

<p:declare-step name="main" type="pl:main">
  <p:input port="parameters" kind="parameter"/>

  <pl:format-spec>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </pl:format-spec>

  <pl:tidy href="langspec.html"/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xinclude/>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

</p:library>

This library has three steps. The pl:main step calls pl:format-spec and pl:tidy. The pl:format-spec step XIncludes, validates, and transforms its input, returning the result. Finally, the pl:tidy step stores its input in the specified location and runs tidy over that location.

All well and good, but we still have some work to do:

  • Make the glossary.

  • Make the ancillary files.

  • Make the namespace documents.

  • Make the examples.

  • Make the schemas.

Making the glossary is straightforward. In pl:format-spec we add the XSLT step to build the glossary, store the results, and make sure that XInclude depends on the glossary so that we'll get the right glossary in the result.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:id="b4-glossary" xml:base="examples/b4-glossary.xpl">

<p:declare-step name="main" type="pl:main">
  <p:input port="parameters" kind="parameter"/>

  <pl:format-spec>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </pl:format-spec>

  <pl:tidy href="langspec.html"/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/makeglossary.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store-glossary" href="glossary.xml"/>

  <p:xinclude cx:depends-on="store-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

</p:library>

Similarly, the ancillary files are easy to generate. We simply add a new step that takes the validated document and applies the appropriate transformations. This time we don't care about the order of execution so there aren't any dependency worries.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:id="b5-ancillary" xml:base="examples/b5-ancillary.xpl">

<p:declare-step name="main" type="pl:main">
  <p:input port="parameters" kind="parameter"/>

  <pl:format-spec>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </pl:format-spec>

  <pl:tidy href="langspec.html"/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/makeglossary.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store-glossary" href="glossary.xml"/>

  <p:xinclude cx:depends-on="store-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng name="validated">
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <pl:ancillary-files name="make-ancillary"/>

  <p:xslt name="style" cx:depends-on="make-ancillary">
    <p:input port="source">
      <p:pipe step="validated" port="result"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step name="main" type="pl:ancillary-files">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/typed-pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="typed-pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/error-list.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="error-list.xml"/>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

  <p:exec result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

</p:library>

And the namespace documents are just more of the same. I've added a couple of steps to do the proper formatting and I call them from pl:main.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:id="b6-ns" xml:base="examples/b6-ns.xpl">

<p:declare-step name="main" type="pl:main">
  <p:input port="parameters" kind="parameter"/>

  <pl:format-spec>
    <p:input port="source">
      <p:document href="langspec.xml"/>
    </p:input>
  </pl:format-spec>

  <pl:tidy href="langspec.html"/>

  <pl:make-ns/>
</p:declare-step>

<p:declare-step name="main" type="pl:make-ns">
  <p:input port="parameters" kind="parameter"/>

  <pl:format-ns>
    <p:input port="source">
      <p:document href="ns-xproc.xml"/>
    </p:input>
  </pl:format-ns>
  <pl:tidy href="ns-xproc.html"/>

  <pl:format-ns>
    <p:input port="source">
      <p:document href="ns-xproc-step.xml"/>
    </p:input>
  </pl:format-ns>
  <pl:tidy href="ns-xproc-step.html"/>

  <pl:format-ns>
    <p:input port="source">
      <p:document href="ns-xproc-error.xml"/>
    </p:input>
  </pl:format-ns>
  <pl:tidy href="ns-xproc-error.html"/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="../style/makeglossary.xsl"/>
    </p:input>
  </p:xslt>

  <p:store name="store-glossary" href="glossary.xml"/>

  <p:xinclude cx:depends-on="store-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
  </p:xinclude>

  <p:validate-with-relax-ng name="validated">
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <pl:ancillary-files name="make-ancillary"/>

  <p:xslt name="style" cx:depends-on="make-ancillary">
    <p:input port="source">
      <p:pipe step="validated" port="result"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step name="main" type="pl:ancillary-files">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/typed-pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="typed-pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/error-list.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="error-list.xml"/>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

  <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
    <p:input port="source"><p:empty/></p:input>
  </p:exec>

  <p:sink/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-ns">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>
  
  <p:xinclude/>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

</p:library>

We're making progress, but there's still work to be done:

  • Make the examples.

  • Make the schemas.

But first, a little diversion. We've got a pipeline that does the right thing, but unlike Make and Ant it always builds everything. It doesn't have any built-in mechanism for handling dependencies.

This was an explicit design decision by the working group. The additional complexity of managing dependencies was not considered to be within “the minimum needed to declare victory”. As a WG member and the chair, I stand by that decision.

But, you know, the only thing that Make actually does is compare the timestamps of the files. We've got XPath 2.0; we can compare dates and times. If we had a way to find out the last modified time of the files involved, could we build a dependency checker in XProc?

Enter cx:uri-info, an extension step for getting information about URIs:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" type="cx:uri-info" xml:base="examples/src/ex-uriinfo.xml">
  <p:output port="result"/>
  <p:option name="href" required="true" cx:type="xsd:anyURI"/>
  <p:option name="username"/>
  <p:option name="password"/>
  <p:option name="auth-method"/>
  <p:option name="send-authorization"/>
</p:declare-step>

Unlike the step types declared in previous pipelines, this one has no body. It's not declared in terms of existing XProc steps; it is a new, implementation-defined atomic step known (at the moment, anyway) only to XML Calabash.

If you point cx:uri-info at a file, you get information about that file:

<c:uri-info xmlns:c="http://www.w3.org/ns/xproc-step" href="langspec.xml" exists="true" readable="true" writable="true" size="213186" absolute="false" directory="false" hidden="false" file="true" last-modified="2009-03-11T18:46:10Z" absolute-path="/projects/presentations/2009/03-xmlprague/examples/spec/docs/langspec.xml" uri="file:/projects/presentations/2009/03-xmlprague/examples/spec/docs/langspec.xml" canonical-path="/projects/presentations/2009/03-xmlprague/examples/spec/docs/langspec.xml" xml:base="examples/src/ex-urifile.xml"/>
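For comparison, a few of those attributes can be gathered for a local file with Python's standard library. This is only a sketch of the idea, not the XML Calabash implementation; the `file_info` function and its dictionary keys are assumptions made for illustration.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_info(path):
    """Collect a few facts about a local file, loosely mirroring c:uri-info."""
    info = {"href": path, "exists": os.path.exists(path)}
    if not info["exists"]:
        return info
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    info.update(
        readable=os.access(path, os.R_OK),
        writable=os.access(path, os.W_OK),
        size=stat.st_size,
        directory=os.path.isdir(path),
        file=os.path.isfile(path),
        # Same shape as the last-modified attribute above, always in UTC.
        last_modified=modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
    )
    return info


# Demonstration on a throwaway file with a known size and timestamp.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "langspec.xml")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("<spec/>")
    stamp = datetime(2009, 3, 11, 18, 46, 10, tzinfo=timezone.utc).timestamp()
    os.utime(sample, (stamp, stamp))
    present = file_info(sample)
    missing = file_info(os.path.join(workdir, "no-such-file.xml"))

print(present["size"], present["last_modified"])
print(missing["exists"])
```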

If you point it at an HTTP URI, you get the result of a HEAD request:

<c:uri-info xmlns:c="http://www.w3.org/ns/xproc-step" href="http://localhost/" status="200" readable="true" exists="true" uri="http://localhost/" last-modified="2008-08-08T15:13:25Z" size="1226" xml:base="examples/src/ex-urihttp.xml">
  <c:header name="Date" value="Wed, 18 Mar 2009 08:04:32 GMT"/>
  <c:header name="Server" value="Apache/1.3.41 (Darwin)"/>
  <c:header name="Last-Modified" value="Fri, 08 Aug 2008 15:13:25 GMT"/>
  <c:header name="ETag" value="&quot;2cf6b2-4ca-489c6295&quot;"/>
  <c:header name="Accept-Ranges" value="bytes"/>
  <c:header name="Content-Length" value="1226"/>
  <c:header name="Content-Type" value="text/html"/>
</c:uri-info>

(After my presentation, Jeni Tennison suggested that I might instead have added extension attributes to the output of the directory-list step since you can already do HEAD with http-request. I'll probably do that too.)

The last thing we need is some way to describe the dependencies. That's straightforward in XML:

<cx:depends xmlns:cx="http://xmlcalabash.com/ns/extensions" xml:base="examples/src/ex-depends.xml">
  <cx:target>langspec.html</cx:target>
  <cx:source>langspec.xml</cx:source>
  <cx:source>parallel.xml</cx:source>
  <cx:source>schemas/xproc.rnc</cx:source>
  <cx:source>schemas/xproc.rng</cx:source>
  <cx:source>standard-components.xml</cx:source>
  <cx:source>references.xml</cx:source>
  <cx:source>error-vocabulary.xml</cx:source>
  <cx:source>conformance.xml</cx:source>
  <cx:source>namespace-fixup.xml</cx:source>
  <cx:source>language-summary.xml</cx:source>
  <cx:source>error-codes.xml</cx:source>
  <cx:source>../style/docbook.xsl</cx:source>
  <cx:source>../style/dbspec.xsl</cx:source>
  <cx:source>../style/xprocns.xsl</cx:source>
  <cx:source>../style/rngsyntax.xsl</cx:source>
</cx:depends>

(In retrospect, I'm not sure “cx” is the right namespace for that fragment, but never mind for now.)

With these building blocks, we can now write a pipeline that takes a dependency list and answers the question, “is the target out-of-date with respect to its sources?” This is the most complex and interesting pipeline in this essay so we'll consider it in a little more detail.
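Before walking through the XProc, it may help to see the same question answered in a few lines of Python. This is a sketch of the logic only, under some assumptions: the function name `out_of_date`, the tuple it returns, and the use of `FileNotFoundError` are all choices made for this illustration, and local file paths stand in for URIs.

```python
import os
import tempfile


def out_of_date(target, sources):
    """Return (True, culprit) if target needs rebuilding, else (False, None)."""
    # A missing source is an error, not merely a reason to rebuild.
    for source in sources:
        if not os.path.exists(source):
            raise FileNotFoundError(
                f'The target ("{target}") depends on a source file '
                f'("{source}") that does not exist.'
            )

    target_exists = os.path.exists(target)
    for source in sources:
        # Up to date only if the target exists and is strictly newer.
        if not (target_exists
                and os.path.getmtime(target) > os.path.getmtime(source)):
            return True, source
    return not target_exists, None


# Demonstration with throwaway files and fixed modification times.
with tempfile.TemporaryDirectory() as workdir:
    def touch(name, mtime):
        path = os.path.join(workdir, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(name)
        os.utime(path, (mtime, mtime))
        return path

    old_source = touch("langspec.xml", 1000)
    new_source = touch("glossary.xml", 3000)
    target = touch("langspec.html", 2000)

    fresh = out_of_date(target, [old_source])
    stale = out_of_date(target, [old_source, new_source])
    absent = out_of_date(os.path.join(workdir, "missing.html"), [old_source])

print(fresh, stale[0], absent[0])
```

Note that the comparison is strict: a target with exactly the same timestamp as a source is treated as out of date.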

We begin with the declaration of the pipeline and a little bit of documentation.

<p:library xmlns:p="http://www.w3.org/ns/xproc"
	   xmlns:c="http://www.w3.org/ns/xproc-step"
	   xmlns:cx="http://xmlcalabash.com/ns/extensions"
	   xmlns:xs="http://www.w3.org/2001/XMLSchema">

<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>

<p:declare-step name="main" type="cx:out-of-date"
		exclude-inline-prefixes="cx xs">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:documentation xmlns="http://docbook.org/ns/docbook">
    <para>The <tag>cx:out-of-date</tag> step takes a document of the form:</para>
    <programlisting>&lt;cx:depends xmlns:cx="http://xmlcalabash.com/ns/extensions">
  &lt;cx:target>langspec.html&lt;/cx:target>
  &lt;cx:source>langspec.xml&lt;/cx:source>
  &lt;cx:source>otherdoc.xml&lt;/cx:source>
&lt;/cx:depends></programlisting>
    <para>If the target document exists and is newer than all of the specified
    sources, then <tag>cx:out-of-date</tag> returns a <tag>c:result</tag>
    document containing the word “false”, otherwise it returns a 
    <tag>c:result</tag> document containing the word “true”. If the target
    is out of date, the <tag>c:result</tag> element will have an
    <tag class="attribute">href</tag> attribute that identifies
    <emphasis>one of</emphasis> the sources that is newer than the target.
    </para>
  </p:documentation>

Then, first things first, if the input document isn't a cx:depends document, fall over.

  <p:choose>
    <p:when test="not(/cx:depends)">
      <p:error code="cx:wrong-input-type">
	<p:input port="source">
	  <p:inline>
	    <message>The cx:out-of-date step expects a cx:depends document.
	    </message>
	  </p:inline>
	</p:input>
      </p:error>
      <p:identity>
	<p:input port="source">
	  <p:inline><doc>This can't happen.</doc></p:inline>
	</p:input>
      </p:identity>
    </p:when>

Otherwise, use cx:uri-info to get the timestamp of the target.

    <p:otherwise>
      <cx:uri-info name="target-info">
	<p:with-option name="href"
		       select="resolve-uri(/cx:depends/cx:target,
			                   base-uri(/cx:depends/cx:target))"/>
      </cx:uri-info>

Then iterate over each of the sources and get its timestamp with cx:uri-info.

      <p:for-each>
	<p:iteration-source select="/cx:depends/cx:source">
	  <p:pipe step="main" port="source"/>
	</p:iteration-source>
	<p:output port="result" sequence="true"/>

	<cx:uri-info name="source-info">
	  <p:with-option name="href"
			 select="resolve-uri(/, base-uri(/*))"/>
	</cx:uri-info>

For each of these sources, we have to see if it exists and if it's newer than the target. We start by saving a few tidbits in variables so we can conveniently refer to them later.

	<p:choose>
	  <p:variable name="target-exists" select="/c:uri-info/@exists">
	    <p:pipe step="target-info" port="result"/>
	  </p:variable>
	  <p:variable name="target-datetime" select="/c:uri-info/@last-modified">
	    <p:pipe step="target-info" port="result"/>
	  </p:variable>
	  <p:variable name="source-exists" select="/c:uri-info/@exists">
	    <p:pipe step="source-info" port="result"/>
	  </p:variable>
	  <p:variable name="source-datetime" select="/c:uri-info/@last-modified">
	    <p:pipe step="source-info" port="result"/>
	  </p:variable>

If the source doesn't exist, fall over. (We could probably extend this pipeline to handle nested dependencies, but that's not necessary today.)

	  <p:when test="$source-exists != 'true'">
	    <p:string-replace match="cx:target">
	      <p:input port="source">
		<p:inline>
		  <message>The target (<cx:target/>) depends on a source file (<cx:source/>) that does not exist.</message>
		</p:inline>
	      </p:input>
	      <p:with-option name="replace"
			     select="concat('&quot;',/c:uri-info/@href,'&quot;')">
		<p:pipe step="target-info" port="result"/>
	      </p:with-option>
	    </p:string-replace>

	    <p:string-replace match="cx:source">
	      <p:with-option name="replace"
			     select="concat('&quot;',/c:uri-info/@href,'&quot;')">
		<p:pipe step="source-info" port="result"/>
	      </p:with-option>
	    </p:string-replace>

	    <p:error code="cx:no-source"/>

	    <p:identity>
	      <p:input port="source">
		<p:inline><doc>This can't happen.</doc></p:inline>
	      </p:input>
	    </p:identity>
	  </p:when>

We want to generate a message, and constructing literal results is one of XProc's weaker areas. The idiom I often use is to write the message with placeholder elements and then use p:string-replace to patch over the placeholders. That's what I've done above.
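
For readers who think more easily in a general-purpose language, the placeholder idiom can be sketched in a few lines of Python using only the standard library. This is an illustration, not part of the pipeline: the function name `fill_placeholders` and the sample file names are made up, and unlike the XProc version (which keeps the result as a `message` element) this sketch flattens the result to a plain string.

```python
import xml.etree.ElementTree as ET

CX = "http://xmlcalabash.com/ns/extensions"

# A well-formed message with empty marker elements, as in the pipeline above.
TEMPLATE = (
    f'<message xmlns:cx="{CX}">The target (<cx:target/>) depends on a '
    'source file (<cx:source/>) that does not exist.</message>'
)

def fill_placeholders(template, values):
    """Replace each empty cx:* placeholder element with a quoted string.

    `values` maps the local name of a placeholder to its replacement.
    (Hypothetical helper, for illustration only.)
    """
    root = ET.fromstring(template)
    parts = [root.text or ""]
    for child in root:
        local_name = child.tag.split("}", 1)[1]
        parts.append('"' + values[local_name] + '"')
        parts.append(child.tail or "")
    return "".join(parts)

message = fill_placeholders(
    TEMPLATE, {"target": "langspec.html", "source": "langspec.xml"}
)
print(message)
```

The design point is the same in both languages: the template is always well-formed, so it can travel through the pipeline as a document, and only the marked spots are replaced.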

If the source exists, we check that the target exists and that it is newer than the source. If both are true, then the target is not out-of-date with respect to this source.

	  <p:when test="$target-exists = 'true'
			and xs:dateTime($target-datetime)
			    &gt; xs:dateTime($source-datetime)">
	    <p:identity>
	      <p:input port="source">
		<p:inline><c:result>false</c:result></p:inline>
	      </p:input>
	    </p:identity>
	  </p:when>

Otherwise, it is. We use the string-replace trick to put the URI of the more-recent source in the result.

	  <p:otherwise>
	    <p:string-replace match="/c:result/@href">
	      <p:input port="source">
		<p:inline><c:result href="">true</c:result></p:inline>
	      </p:input>
	      <p:with-option name="replace"
			     select="concat('&quot;',/c:uri-info/@href,'&quot;')">
		<p:pipe step="source-info" port="result"/>
	      </p:with-option>
	    </p:string-replace>
	  </p:otherwise>
	</p:choose>
      </p:for-each>
    </p:otherwise>
  </p:choose>

At this point we have a sequence of c:result documents. To produce the final, single true-or-false answer, we wrap the sequence (so that we can perform an XPath query; XPath expressions are evaluated against a single document, not a sequence of documents) and ask if there are any “true” results.

  <p:wrap-sequence wrapper="cx:wrapper"/>

  <p:choose>
    <p:when test="/cx:wrapper[c:result = 'true']">
      <p:identity>
	<p:input port="source" select="/cx:wrapper/c:result[. = 'true'][1]"/>
      </p:identity>
    </p:when>
    <p:otherwise>
      <p:identity>
	<p:input port="source">
	  <p:inline><c:result>false</c:result></p:inline>
	</p:input>
      </p:identity>
    </p:otherwise>
  </p:choose>
</p:declare-step>

</p:library>

If there are, we return the first one; otherwise, we return false.

Not too shabby.
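
To make the logic of cx:out-of-date easier to check at a glance, here is a minimal Python sketch of the same decision. It is an assumption-laden stand-in, not the step itself: it uses file modification times from `os.path.getmtime` where the pipeline uses cx:uri-info, the names `stale_sources` and `out_of_date` are hypothetical, and it returns a list of offending sources rather than a c:result document.

```python
import os
import tempfile

def stale_sources(target, sources):
    """Return the sources that make `target` out of date (empty if none)."""
    for source in sources:
        if not os.path.exists(source):
            # Plays the role of the cx:no-source error.
            raise FileNotFoundError(
                f'The target ("{target}") depends on a source file '
                f'("{source}") that does not exist.'
            )
    if not os.path.exists(target):
        # A missing target is out of date with respect to every source.
        return list(sources)
    target_time = os.path.getmtime(target)
    # The target is current only if it is strictly newer than the source.
    return [s for s in sources if not target_time > os.path.getmtime(s)]

def out_of_date(target, sources):
    """Collapse the per-source answers into one true-or-false result."""
    return bool(stale_sources(target, sources))

# Demonstration with throwaway files and explicit timestamps.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "langspec.xml")
    target = os.path.join(tmp, "langspec.html")
    for path in (source, target):
        open(path, "w").close()
    os.utime(source, (1000, 1000))
    os.utime(target, (2000, 2000))
    fresh = out_of_date(target, [source])   # target is newer
    os.utime(source, (3000, 3000))
    stale = out_of_date(target, [source])   # source is now newer
    missing = out_of_date(os.path.join(tmp, "absent.html"), [source])

print(fresh, stale, missing)
```

The final `bool(...)` corresponds to the wrap-and-query step in the pipeline: many per-source answers are reduced to a single one.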

Now we can make our spec-building pipeline sensitive to dependencies and not rebuild things that we don't need to rebuild.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:base="examples/b6.xpl">

<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>
<p:import href="/home/ndw/xmlcalabash.com/library/depends.xpl"/>

<p:pipeinfo xml:id="langspec">
  <cx:depends>
    <cx:target>langspec.html</cx:target>
    <cx:source>langspec.xml</cx:source>
    <cx:source>parallel.xml</cx:source>
    <cx:source>schemas/xproc.rnc</cx:source>
    <cx:source>schemas/xproc.rng</cx:source>
    <cx:source>standard-components.xml</cx:source>
    <cx:source>references.xml</cx:source>
    <cx:source>error-vocabulary.xml</cx:source>
    <cx:source>conformance.xml</cx:source>
    <cx:source>namespace-fixup.xml</cx:source>
    <cx:source>language-summary.xml</cx:source>
    <cx:source>error-codes.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/rngsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:declare-step name="main" type="pl:main-spec">
  <p:input port="parameters" kind="parameter"/>

  <cx:out-of-date>
    <p:input port="source" select="id('langspec')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-spec name="fmat">
	<p:input port="source">
	  <p:document href="langspec.xml"/>
	</p:input>
      </pl:format-spec>
      <pl:tidy href="langspec.html"/>
      <pl:ancillary-files>
	<p:input port="source">
	  <p:pipe step="fmat" port="xincluded"/>
	</p:input>
      </pl:ancillary-files>
    </p:when>
    <p:otherwise>
      <cx:message message="langspec.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <pl:main-ns/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:output port="result" primary="true"/>
  <p:output port="xincluded">
    <p:pipe step="xinclude" port="result"/>
  </p:output>

  <p:sink/>
  
  <p:xslt name="make-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/makeglossary.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="glossary.xml"/>

  <p:xinclude name="xinclude" cx:depends-on="make-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:log port="result" href="/tmp/out.xml"/>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step name="main" type="pl:main-req">
  <p:input port="parameters" kind="parameter"/>
  <p:option name="document" select="'langreq.xml'"/>
  <p:option name="href" select="'langreq.html'"/>

  <p:load dtd-validate="true">
    <p:with-option name="href" select="$document">
      <p:empty/>
    </p:with-option>
  </p:load>
  <pl:format-req/>
  <pl:tidy>
    <p:with-option name="href" select="$href"/>
  </pl:tidy>
</p:declare-step>

<p:declare-step name="main" type="pl:format-req">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/langreq.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:pipeinfo xml:id="ns-xproc">
  <cx:depends>
    <cx:target>ns-xproc.html</cx:target>
    <cx:source>ns-xproc.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="ns-xproc-step">
  <cx:depends>
    <cx:target>ns-xproc-step.html</cx:target>
    <cx:source>ns-xproc-step.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="ns-xproc-error">
  <cx:depends>
    <cx:target>ns-xproc-error.html</cx:target>
    <cx:source>ns-xproc-error.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:declare-step name="main" type="pl:main-ns">
  <p:input port="parameters" kind="parameter"/>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
	<p:input port="source">
	  <p:document href="ns-xproc.xml"/>
	</p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc-step')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
	<p:input port="source">
	  <p:document href="ns-xproc-step.xml"/>
	</p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc-step.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc-step.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc-error')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
	<p:input port="source">
	  <p:document href="ns-xproc-error.xml"/>
	</p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc-error.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc-error.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:format-ns">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xinclude/>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

  <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:input port="source"><p:empty/></p:input>
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
  </p:exec>

  <p:sink/>
</p:declare-step>

<p:declare-step name="main" type="pl:ancillary-files">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/typed-pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="typed-pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/error-list.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="error-list.xml"/>
</p:declare-step>

<p:declare-step name="main" type="pl:main-diff">
  <p:option name="previous" required="true"/>
  <p:option name="current" select="'langspec.html'"/>
  <p:option name="href" select="'diff.html'"/>

  <pl:diff>
    <p:with-option name="previous" select="$previous">
      <p:empty/>
    </p:with-option>
    <p:with-option name="current" select="$current">
      <p:empty/>
    </p:with-option>
  </pl:diff>

  <p:store>
    <p:with-option name="href" select="$href"/>
  </p:store>
</p:declare-step>

<p:declare-step name="main" type="pl:diff">
  <p:option name="previous" required="true"/>
  <p:option name="current" required="true"/>
  <p:output port="result"/>

  <p:load name="previous">
    <p:with-option name="href" select="$previous">
      <p:empty/>
    </p:with-option>
  </p:load>

  <p:load name="current">
    <p:with-option name="href" select="$current">
      <p:empty/>
    </p:with-option>
  </p:load>

  <cx:delta-xml>
    <p:input port="source">
      <p:pipe step="previous" port="result"/>
    </p:input>
    <p:input port="alternate">
      <p:pipe step="current" port="result"/>
    </p:input>
    <p:input port="dxp">
      <p:document href="/usr/local/DeltaXMLCore-5_1/samples/dxp/compare-xhtml.dxp"/>
    </p:input>
  </cx:delta-xml>
</p:declare-step>

</p:library>

Good. Now let's get back to work. We need to make the examples and the schemas. We can take the same general approach as make: create pipelines that build the examples and the schemas, respectively, and then call them from our main pipeline.

Here's the pipeline for examples. It's not too different from the main pipeline that we've been working on.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:base="examples/src/ex-make-examples.xml">

<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>
<p:import href="/home/ndw/xmlcalabash.com/library/depends.xpl"/>

<p:pipeinfo xml:id="xpl-sources">
  <cx:targets>
    <cx:target>file</cx:target>
    <cx:target>filebin</cx:target>
  </cx:targets>
</p:pipeinfo>

<p:pipeinfo xml:id="xml-sources">
  <cx:targets>
    <cx:target>choose</cx:target>
    <cx:target>for-each</cx:target>
    <cx:target>group</cx:target>
    <cx:target>identity</cx:target>
    <cx:target>input-doc</cx:target>
    <cx:target>input-port</cx:target>
    <cx:target>input-select</cx:target>
    <cx:target>pipeline-library</cx:target>
    <cx:target>pipeline</cx:target>
    <cx:target>simple-default</cx:target>
    <cx:target>simple-explicit</cx:target>
    <cx:target>trycatch</cx:target>
    <cx:target>viewport</cx:target>
    <cx:target>xinclude</cx:target>
    <cx:target>xpathcontext</cx:target>
    <cx:target>xquery</cx:target>
    <cx:target>xslt-empty</cx:target>
    <cx:target>xslt</cx:target>
  </cx:targets>
</p:pipeinfo>

<p:pipeinfo xml:id="copy-sources">
  <cx:targets>
    <cx:target>pipeline</cx:target>
    <cx:target>pipeline-library</cx:target>
    <cx:target>simple-default</cx:target>
    <cx:target>simple-explicit</cx:target>
    <cx:target>xquery</cx:target>
  </cx:targets>
</p:pipeinfo>

<p:declare-step name="main" type="pl:build-examples">
  <p:for-each>
    <p:iteration-source select="id('xml-sources')//cx:target">
      <p:document href="#"/>
    </p:iteration-source>
    
    <pl:strip>
      <p:with-option name="source" select="resolve-uri(concat(., '.xml'), base-uri(.))"/>
      <p:with-option name="target" select="resolve-uri(concat(., '.txt'), base-uri(.))"/>
    </pl:strip>
  </p:for-each>

  <p:for-each>
    <p:iteration-source select="id('xpl-sources')//cx:target">
      <p:document href="#"/>
    </p:iteration-source>
    
    <pl:strip>
      <p:with-option name="source" select="resolve-uri(concat(., '.xpl'), base-uri(.))"/>
      <p:with-option name="target" select="resolve-uri(concat(., '.txt'), base-uri(.))"/>
    </pl:strip>
  </p:for-each>

  <p:for-each>
    <p:iteration-source select="id('copy-sources')//cx:target">
      <p:document href="#"/>
    </p:iteration-source>
    
    <pl:copy>
      <p:with-option name="source" select="resolve-uri(concat(., '.xml'), base-uri(.))"/>
      <p:with-option name="target" select="resolve-uri(concat(., '.txt'), base-uri(.))"/>
    </pl:copy>
  </p:for-each>
</p:declare-step>

<p:declare-step name="main" type="pl:copy">
  <p:option name="source" required="true"/>
  <p:option name="target" required="true"/>

  <p:string-replace match="/cx:depends/cx:target/text()">
    <p:input port="source">
      <p:inline>
	<cx:depends>
	  <cx:target>target</cx:target>
	  <cx:source>source</cx:source>
	</cx:depends>
      </p:inline>
    </p:input>
    <p:with-option name="replace" select="concat('&quot;',$target,'&quot;')">
      <p:empty/>
    </p:with-option>
  </p:string-replace>

  <p:string-replace match="/cx:depends/cx:source/text()">
    <p:with-option name="replace" select="concat('&quot;',$source,'&quot;')">
      <p:empty/>
    </p:with-option>
  </p:string-replace>

  <cx:out-of-date/>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <cx:message>
	<p:with-option name="message" select="concat('Copying: ', $target)">
	  <p:empty/>
	</p:with-option>
      </cx:message>
  
      <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/bin/cp">
	<p:input port="source">
	  <p:empty/>
	</p:input>
	<p:with-option name="args" select="concat($source,' ',$target)"/>
      </p:exec>
      
      <p:sink/>
    </p:when>
    <p:otherwise>
      <cx:message>
	<p:input port="source"><p:empty/></p:input>
	<p:with-option name="message" select="concat('Up-to-date: ', $target)">
	  <p:empty/>
	</p:with-option>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:strip">
  <p:option name="source" required="true"/>
  <p:option name="target" required="true"/>

  <p:string-replace match="/cx:depends/cx:target/text()">
    <p:input port="source">
      <p:inline>
	<cx:depends>
	  <cx:target>target</cx:target>
	  <cx:source>source</cx:source>
	</cx:depends>
      </p:inline>
    </p:input>
    <p:with-option name="replace" select="concat('&quot;',$target,'&quot;')">
      <p:empty/>
    </p:with-option>
  </p:string-replace>

  <p:string-replace match="/cx:depends/cx:source/text()">
    <p:with-option name="replace" select="concat('&quot;',$source,'&quot;')">
      <p:empty/>
    </p:with-option>
  </p:string-replace>

  <cx:out-of-date/>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <p:string-replace match="/c:request/@href">
	<p:input port="source">
	  <p:inline exclude-inline-prefixes="cx xs pl">
	    <c:request method="get" override-content-type="text/plain" href=""/>
	  </p:inline>
	</p:input>
	<p:with-option name="replace" select="concat('&quot;',$source,'&quot;')">
	  <p:empty/>
	</p:with-option>
      </p:string-replace>

      <p:http-request/>

      <cx:message>
	<p:with-option name="message" select="concat('Updating: ', $target)">
	  <p:empty/>
	</p:with-option>
      </cx:message>
  
      <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/opt/local/bin/perl" args="strip.pl">
      </p:exec>
      
      <p:unescape-markup/>

      <p:store method="text" name="store">
	<p:with-option name="href" select="$target"/>
      </p:store>

      <p:sink>
	<p:input port="source">
	  <p:pipe step="store" port="result"/>
	</p:input>
      </p:sink>
    </p:when>
    <p:otherwise>
      <cx:message>
	<p:input port="source"><p:empty/></p:input>
	<p:with-option name="message" select="concat('Up-to-date: ', $target)">
	  <p:empty/>
	</p:with-option>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:strip-brute">
  <p:option name="source" required="true"/>
  <p:option name="target" required="true"/>

  <p:string-replace match="/c:request/@href">
    <p:input port="source">
      <p:inline exclude-inline-prefixes="cx xs pl">
	<c:request method="get" override-content-type="text/plain" href=""/>
      </p:inline>
    </p:input>
    <p:with-option name="replace" select="concat('&quot;',$source,'&quot;')">
      <p:empty/>
    </p:with-option>
  </p:string-replace>

  <p:http-request/>

  <cx:message>
    <p:with-option name="message" select="concat('Updating ', $target)">
      <p:empty/>
    </p:with-option>
  </cx:message>

  <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/opt/local/bin/perl" args="strip.pl">
  </p:exec>

  <p:unescape-markup/>

  <p:store method="text">
    <p:with-option name="href" select="$target"/>
  </p:store>
</p:declare-step>

</p:library>

The schemas are a more interesting case. If you look at the Makefile, you'll see that the XSD file is constructed by concatenating several parts together. Here are the relevant rules:

XSD_PARTS=xproc_xsd_1.frag xproc_xsd_2.frag xproc_xsd_3.frag

xproc_xsd_2.frag: ../typed-pipeline-library.xml ../../style/library-to-xsd.xsl
	$(SAXON) ../typed-pipeline-library.xml ../../style/library-to-xsd.xsl | sed '1d;/xmlns:/d;/<.wrapper/d' > $@

xproc.xsd: $(XSD_PARTS)
	cat $^ > $@

This approach works perfectly well in a Makefile, but it doesn't work at all in XProc. In XProc, you cannot pass around fragments of XML that are not well-formed. We can't construct an unbalanced fragment in one step and glue it to another unbalanced fragment in some other step.

But that's OK, because we have real XML tools at our disposal. An approach that will work is to use XInclude in a skeleton, template schema and generate the included part dynamically. So that's what we do.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:base="examples/src/ex-make-schemas.xml">

<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>
<p:import href="/home/ndw/xmlcalabash.com/library/depends.xpl"/>

<p:pipeinfo xml:id="dtd-parts">
  <cx:targets>
    <cx:target>xproc_dtd_1.ent</cx:target>
    <cx:target>xproc_dtd_2.ent</cx:target>
    <cx:target>xproc_dtd_3.ent</cx:target>
    <cx:target>xproc_dtd_4.ent</cx:target>
  </cx:targets>
</p:pipeinfo>

<p:pipeinfo xml:id="steps-depends">
  <cx:depends>
    <cx:target>steps.rnc</cx:target>
    <cx:source>../typed-pipeline-library.xml</cx:source>
    <cx:source>../../style/library-to-rnc.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="xproc-depends">
  <cx:depends>
    <cx:target>xproc.rng</cx:target>
    <cx:source>xproc.rnc</cx:source>
    <cx:source>steps.rnc</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="dtd-depends">
  <cx:depends>
    <cx:target>xproc.dtd</cx:target>
    <cx:source>xproc_dtd_1.ent</cx:source>
    <cx:source>xproc_dtd_3.ent</cx:source>
    <cx:source>../typed-pipeline-library.xml</cx:source>
    <cx:source>../../style/library-to-dtd.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="xsd-depends">
  <cx:depends>
    <cx:target>xproc.xsd</cx:target>
    <cx:source>xproc_xsd_skeleton.xml</cx:source>
    <cx:source>../typed-pipeline-library.xml</cx:source>
    <cx:source>../../style/library-to-xsd.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:declare-step name="main" type="pl:build-schemas">
  <p:input port="parameters" kind="parameter"/>
  <pl:make-steps/>
  <pl:make-xproc/>
  <pl:make-dtd/>
  <pl:make-xsd/>
</p:declare-step>

<p:declare-step name="main" type="pl:make-steps">
  <p:input port="parameters" kind="parameter"/>

  <cx:out-of-date>
    <p:input port="source" select="id('steps-depends')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <cx:message message="Updating: steps.rnc">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
      <p:xslt>
	<p:input port="source">
	  <p:document href="../typed-pipeline-library.xml"/>
	</p:input>
	<p:input port="stylesheet">
	  <p:document href="../../style/library-to-rnc.xsl"/>
	</p:input>
      </p:xslt>
      <p:store href="steps.rnc" method="text"/>
    </p:when>
    <p:otherwise>
      <cx:message message="Up-to-date: steps.rnc">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:make-xproc">
  <cx:out-of-date>
    <p:input port="source" select="id('xproc-depends')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <cx:message message="Updating: xproc.rng">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
      <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/sourceforge/docbook/buildtools/runtrang" args="xproc.rnc xproc.rng">
	<p:input port="source">
	  <p:empty/>
	</p:input>
      </p:exec>
      <p:sink/>
    </p:when>
    <p:otherwise>
      <cx:message message="Up-to-date: xproc.rng">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:make-dtd">
  <p:input port="parameters" kind="parameter"/>
  <cx:out-of-date>
    <p:input port="source" select="id('dtd-depends')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <cx:message message="Updating: xproc.dtd">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>

      <p:xslt>
	<p:input port="source">
	  <p:document href="../typed-pipeline-library.xml"/>
	</p:input>
	<p:input port="stylesheet">
	  <p:document href="../../style/library-to-dtd.xsl"/>
	</p:input>
	<p:with-param name="step-cm" select="'xproc_dtd_2.ent'">
	  <p:empty/>
	</p:with-param>
	<p:with-param name="step-decls" select="'xproc_dtd_4.ent'">
	  <p:empty/>
	</p:with-param>
      </p:xslt>
      <p:sink/>

      <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/bin/cat">
	<p:with-option name="args" select="string-join(id('dtd-parts')//cx:target, ' ')">
	  <p:document href="#"/>
	</p:with-option>
	<p:input port="source">
	  <p:empty/>
	</p:input>
      </p:exec>

      <p:store href="xproc.dtd" method="text"/>
    </p:when>
    <p:otherwise>
      <cx:message message="Up-to-date: xproc.dtd">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:make-xsd">
  <p:input port="parameters" kind="parameter"/>
  <cx:out-of-date>
    <p:input port="source" select="id('xsd-depends')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <cx:message message="Updating: xproc.xsd">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>

      <p:xslt>
	<p:input port="source">
	  <p:document href="../typed-pipeline-library.xml"/>
	</p:input>
	<p:input port="stylesheet">
	  <p:document href="../../style/library-to-xsd.xsl"/>
	</p:input>
      </p:xslt>

      <p:store name="store" href="xproc_xsd_generated.xml" method="xml" indent="true"/>

      <p:xinclude cx:depends-on="store">
	<p:input port="source">
	  <p:document href="xproc_xsd_skeleton.xml"/>
	</p:input>
      </p:xinclude>

      <p:store href="xproc.xsd"/>
    </p:when>
    <p:otherwise>
      <cx:message message="Up-to-date: xproc.xsd">
	<p:input port="source"><p:empty/></p:input>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

</p:library>
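
The difference between gluing fragments together and including a generated document in a skeleton can be seen in miniature with Python's standard library. This is only a sketch under stated assumptions: the skeleton and generated documents below are invented stand-ins for xproc_xsd_skeleton.xml and xproc_xsd_generated.xml, and the in-memory `loader` function is hypothetical (the real pipeline reads the stored file from disk).

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI = "http://www.w3.org/2001/XInclude"
XS = "http://www.w3.org/2001/XMLSchema"

# Invented skeleton: a complete, well-formed schema with a hole in it.
SKELETON = f"""<xs:schema xmlns:xs="{XS}" xmlns:xi="{XI}">
  <xs:element name="pipeline"/>
  <xi:include href="generated.xml"/>
</xs:schema>"""

# Invented generated part: also a complete, well-formed document.
GENERATED = f'<xs:element xmlns:xs="{XS}" name="identity"/>'

def loader(href, parse, encoding=None):
    """Serve the generated document from memory instead of from disk."""
    if href == "generated.xml" and parse == "xml":
        return ET.fromstring(GENERATED)
    raise OSError(f"unexpected include: {href}")

# An unbalanced fragment cannot even be parsed, let alone passed along.
try:
    ET.fromstring("<wrapper><unclosed>")
    fragment_ok = True
except ET.ParseError:
    fragment_ok = False

# The skeleton, by contrast, is a document; XInclude fills the hole.
schema = ET.fromstring(SKELETON)
ElementInclude.include(schema, loader=loader)
names = [e.get("name") for e in schema.findall(f"{{{XS}}}element")]
print(fragment_ok, names)
```

Both inputs are well-formed at every stage, which is what lets them flow between steps; the cat-and-sed approach only works because a Makefile treats the parts as bytes rather than as XML.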

We're almost done. The last thing is to pull all the pieces together. We'll import our example- and schema-building pipelines into our main pipeline and call them at the appropriate places.

<p:library xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:pl="http://www.w3.org/XML/XProc/docs/library" xml:id="b7-buildall" xml:base="examples/b7-buildall.xpl">

<p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>
<p:import href="/home/ndw/xmlcalabash.com/library/depends.xpl"/>
<p:import href="examples/build.xpl"/>
<p:import href="schemas/build.xpl"/>

<p:pipeinfo xml:id="langspec">
  <cx:depends>
    <cx:target>langspec.html</cx:target>
    <cx:source>langspec.xml</cx:source>
    <cx:source>parallel.xml</cx:source>
    <cx:source>schemas/xproc.rnc</cx:source>
    <cx:source>schemas/xproc.rng</cx:source>
    <cx:source>standard-components.xml</cx:source>
    <cx:source>references.xml</cx:source>
    <cx:source>error-vocabulary.xml</cx:source>
    <cx:source>conformance.xml</cx:source>
    <cx:source>namespace-fixup.xml</cx:source>
    <cx:source>language-summary.xml</cx:source>
    <cx:source>error-codes.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/rngsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:declare-step name="main" type="pl:main-spec">
  <p:input port="parameters" kind="parameter"/>

  <cx:out-of-date>
    <p:input port="source" select="id('langspec')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-spec name="fmat">
	<p:input port="source">
	  <p:document href="langspec.xml"/>
	</p:input>
      </pl:format-spec>
      <pl:tidy href="langspec.html"/>
      <pl:ancillary-files>
	<p:input port="source">
	  <p:pipe step="fmat" port="xincluded"/>
	</p:input>
      </pl:ancillary-files>
    </p:when>
    <p:otherwise>
      <cx:message message="langspec.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <pl:main-ns/>
</p:declare-step>

<p:declare-step name="main" type="pl:format-spec">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:output port="result" primary="true"/>
  <p:output port="xincluded">
    <p:pipe step="xinclude" port="result"/>
  </p:output>

  <pl:build-examples/>
  
  <p:xslt name="make-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/makeglossary.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="glossary.xml"/>

  <p:xinclude name="xinclude" cx:depends-on="make-glossary">
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:log port="result" href="/tmp/out.xml"/>
  </p:xinclude>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step name="main" type="pl:main-req">
  <p:input port="parameters" kind="parameter"/>
  <p:option name="document" select="'langreq.xml'"/>
  <p:option name="href" select="'langreq.html'"/>

  <p:load dtd-validate="true">
    <p:with-option name="href" select="$document">
      <p:empty/>
    </p:with-option>
  </p:load>
  <pl:format-req/>
  <pl:tidy>
    <p:with-option name="href" select="$href"/>
  </pl:tidy>
</p:declare-step>

<p:declare-step name="main" type="pl:format-req">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/langreq.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:pipeinfo xml:id="ns-xproc">
  <cx:depends>
    <cx:target>ns-xproc.html</cx:target>
    <cx:source>ns-xproc.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="ns-xproc-step">
  <cx:depends>
    <cx:target>ns-xproc-step.html</cx:target>
    <cx:source>ns-xproc-step.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:pipeinfo xml:id="ns-xproc-error">
  <cx:depends>
    <cx:target>ns-xproc-error.html</cx:target>
    <cx:source>ns-xproc-error.xml</cx:source>
    <cx:source>../style/docbook.xsl</cx:source>
    <cx:source>../style/dbspec.xsl</cx:source>
    <cx:source>../style/xprocns.xsl</cx:source>
    <cx:source>../style/elemsyntax.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>

<p:declare-step name="main" type="pl:main-ns">
  <p:input port="parameters" kind="parameter"/>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
        <p:input port="source">
          <p:document href="ns-xproc.xml"/>
        </p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc-step')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
        <p:input port="source">
          <p:document href="ns-xproc-step.xml"/>
        </p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc-step.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc-step.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>

  <cx:out-of-date>
    <p:input port="source" select="id('ns-xproc-error')/cx:depends">
      <p:document href="#"/>
    </p:input>
  </cx:out-of-date>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <pl:format-ns>
        <p:input port="source">
          <p:document href="ns-xproc-error.xml"/>
        </p:input>
      </pl:format-ns>
      <pl:tidy href="ns-xproc-error.html"/>
    </p:when>
    <p:otherwise>
      <cx:message message="ns-xproc-error.html is up-to-date."/>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>

<p:declare-step name="main" type="pl:format-ns">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result"/>

  <p:xinclude/>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="../schema/dbspec.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:xslt name="style">
    <p:input port="stylesheet">
      <p:document href="../style/dbspec.xsl"/>
    </p:input>
  </p:xslt>
</p:declare-step>

<p:declare-step type="pl:tidy">
  <p:input port="source"/>
  <p:option name="href" required="true"/>

  <p:store name="store">
    <p:with-option name="href" select="$href"/>
  </p:store>

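  <!-- Tidy the file that was just stored, rather than a hard-coded langspec.html. -->
  <p:exec name="exec" result-is-xml="false" source-is-xml="false" command="/Users/ndw/bin/tidy" cx:depends-on="store">
    <p:input port="source"><p:empty/></p:input>
    <p:with-option name="args" select="$href">
      <p:empty/>
    </p:with-option>
  </p:exec>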

  <p:sink/>
</p:declare-step>

<p:declare-step name="main" type="pl:ancillary-files">
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="pipeline-library.xml"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/typed-pipeline-library.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="typed-pipeline-library.xml"/>

  <pl:build-schemas/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="../style/error-list.xsl"/>
    </p:input>
  </p:xslt>
  <p:store href="error-list.xml"/>
</p:declare-step>

<p:declare-step name="main" type="pl:main-diff">
  <p:option name="previous" required="true"/>
  <p:option name="current" select="'langspec.html'"/>
  <p:option name="href" select="'diff.html'"/>

  <pl:diff>
    <p:with-option name="previous" select="$previous">
      <p:empty/>
    </p:with-option>
    <p:with-option name="current" select="$current">
      <p:empty/>
    </p:with-option>
  </pl:diff>

  <p:store>
    <p:with-option name="href" select="$href"/>
  </p:store>
</p:declare-step>

<p:declare-step name="main" type="pl:diff">
  <p:option name="previous" required="true"/>
  <p:option name="current" required="true"/>
  <p:output port="result"/>

  <p:load name="previous">
    <p:with-option name="href" select="$previous">
      <p:empty/>
    </p:with-option>
  </p:load>

  <p:load name="current">
    <p:with-option name="href" select="$current">
      <p:empty/>
    </p:with-option>
  </p:load>

  <cx:delta-xml>
    <p:input port="source">
      <p:pipe step="previous" port="result"/>
    </p:input>
    <p:input port="alternate">
      <p:pipe step="current" port="result"/>
    </p:input>
    <p:input port="dxp">
      <p:document href="/usr/local/DeltaXMLCore-5_1/samples/dxp/compare-xhtml.dxp"/>
    </p:input>
  </cx:delta-xml>
</p:declare-step>

</p:library>

Whew! It was a bit of work, and maybe my exposition leaves something to be desired, but we got where we wanted to go: we have an XProc pipeline that builds the XProc specification. As an added bonus, it doesn't rebuild the parts that don't need to be rebuilt.
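
The skip-if-current behaviour comes from one pattern that pl:main-ns repeats three times: hand a cx:depends block to cx:out-of-date, then p:choose on the c:result. As a rough sketch only, that pattern could be factored into a single helper step. The step type pl:build-ns is hypothetical (it is not part of the library above). The sketch assumes the p, c, cx, and pl prefixes are bound as they are on the enclosing p:library, and that cx:message accepts its message through p:with-option just as it accepts the shortcut attribute.

```xml
<!-- Hypothetical helper step; a sketch, not part of the original library. -->
<p:declare-step name="main" type="pl:build-ns">
  <!-- source: a cx:depends document listing one target and its sources -->
  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <!-- document: the XML file to format, e.g. 'ns-xproc.xml' -->
  <p:option name="document" required="true"/>
  <!-- href: the HTML file to write, e.g. 'ns-xproc.html' -->
  <p:option name="href" required="true"/>

  <!-- Reads the cx:depends document from the default readable port. -->
  <cx:out-of-date/>

  <p:choose>
    <p:when test="/c:result = 'true'">
      <p:load>
        <p:with-option name="href" select="$document">
          <p:empty/>
        </p:with-option>
      </p:load>
      <pl:format-ns/>
      <pl:tidy>
        <p:with-option name="href" select="$href">
          <p:empty/>
        </p:with-option>
      </pl:tidy>
    </p:when>
    <p:otherwise>
      <cx:message>
        <p:with-option name="message" select="concat($href, ' is up-to-date.')">
          <p:empty/>
        </p:with-option>
      </cx:message>
      <p:sink/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

Inside pl:main-ns, each of the three blocks would then shrink to a call along these lines:

```xml
<pl:build-ns document="ns-xproc.xml" href="ns-xproc.html">
  <p:input port="source" select="id('ns-xproc')/cx:depends">
    <p:document href="#"/>
  </p:input>
</pl:build-ns>
```

Passing the file names as options keeps the source and target together in one place, which makes a mismatch between them harder to introduce by copy and paste.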

There's one more part to this story. I wrote all three of the pipelines that we use in the final solution, so I could make them work together nicely. But what if you needed to call some other pipeline that you couldn't directly import into your library? For that you'd need an “eval” extension step that allows you to run arbitrary pipelines. But that's a story for another day.

Comments

Thanks for putting your XML Prague presentation in one blog entry. (Is this the longest entry on your blog so far? :) I have one question about your use of p:pipeinfo in the pipeline, and especially the use of xml:id:
<p:pipeinfo xml:id="steps-depends">
  <cx:depends>
    <cx:target>steps.rnc</cx:target>
    <cx:source>../typed-pipeline-library.xml</cx:source>
    <cx:source>../../style/library-to-rnc.xsl</cx:source>
  </cx:depends>
</p:pipeinfo>
You refer to the p:pipeinfo elements *of the pipeline itself* like this:
<cx:out-of-date>
  <p:input port="source" select="id('steps-depends')/cx:depends">
    <p:document href="#"/>
  </p:input>
</cx:out-of-date>
Is this really possible? I mean, can an XProc pipeline access its own source? I understand that href="#" resolves to the owner document, but this is the first time I have seen this in an XProc context.
—Posted by Vojtech Toman on 27 Mar 2009 @ 07:36 UTC #

Yeah, it is sort of absurdly long. Alternatively, I could have tried to cut the pipelines into snippets and highlight only the changed parts and lots of other things. In that alternate world, I fear the post wouldn't have been made for days, so I decided this was better. :-)

With regard to <p:document href=""> (the # is irrelevant, I think): yes, I think that should work. It's known to work in XSLT stylesheets, for example.

I was a little surprised, to be honest, when I tried it and it just worked. I think you might find the same thing is true in your implementation. If the href is relative, it has to be resolved against the base URI of the document element. In a pipeline, that's the base URI of the pipeline document, unless someone has used xml:base. If you resolve "" against the current base URI to get an absolute URI, the right thing just happens.

I believe it should work even if xml:base has been used, that is, same-document references are immutable with respect to the base URI, but I bet that doesn't (yet) work in my implementation. I haven't tried, though.
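
A minimal sketch of the self-reference described above, reduced to a single step. The ex namespace and the ex:note element are hypothetical, made up for illustration. The sketch assumes the processor resolves the empty href against the pipeline document's base URI and that its XML parser recognizes xml:id, so that id() can find the element. The version attribute follows the XProc 1.0 Recommendation; earlier drafts may not expect it.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns/hypothetical"
                name="main" version="1.0">
  <p:output port="result"/>

  <!-- The processor ignores p:pipeinfo, so arbitrary data can live here. -->
  <p:pipeinfo xml:id="about">
    <ex:note>This element lives in the pipeline document itself.</ex:note>
  </p:pipeinfo>

  <!-- href="" is a same-document reference: it loads this very file,
       and the select expression then picks out the ex:note element. -->
  <p:identity>
    <p:input port="source" select="id('about')/ex:note">
      <p:document href=""/>
    </p:input>
  </p:identity>
</p:declare-step>
```

If those assumptions hold, running the pipeline should produce the ex:note element on its result port, which is the same trick the cx:out-of-date calls rely on.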

—Posted by Norman Walsh on 27 Mar 2009 @ 11:48 UTC #