XProc parameters

Volume 10, Issue 58; 13 Jun 2007; last modified 08 Oct 2010

Dealing with command line options and parameters turns out to be trickier than it looks.

The working group has spent a gruelling few weeks (well, gruelling for the chair, at least) hashing out a story about parameters. Herein, an overview of the decisions we made. Fair warning: as of 13 June 2007, what's discussed in this essay is not yet in any public working draft. (The XProc WG works in public. You're welcome to follow along and read the editor's drafts. In other words, I'm not giving away anything not already in the public record, in case you were wondering.)

Let's start with a little context: what's the whole deal with options and parameters?

Options are strings that you can pass to a step, like the match expression on p:insert or the href URI on p:load. They can be static values supplied by the pipeline author or they can be computed dynamically (using XPath 1.0) in the pipeline. The names of all the options are known in advance; they're usually declared as part of p:declare-step. Options are lexically scoped.

Parameters, like options, are name/value pairs that you can pass to a step. Unlike options, the names of parameters aren't (necessarily) known in advance to the pipeline author.

Parameters exist to satisfy the use case of parameters passed to an XSLT step. (They're also used by the XQuery step and may be used by additional steps as well, including extension steps invented by others.)

Stylesheets can have literally hundreds of parameters. The DocBook XSL Stylesheets have more than six hundred. The working group is convinced that we need to support the use case where an XSLT step is wrapped in a pipeline and then parameters are passed to that pipeline (from the command line, for example) and forwarded to the XSLT step.

Consider the pipeline in Example 1, “Simple XInclude and style pipeline”.

Example 1. Simple XInclude and style pipeline

<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">
<p:input port="source"/>
<p:output port="result"/>

<p:xinclude/>

<p:xslt>
  <p:input port="stylesheet">
    <p:document href="docbook.xsl"/>
  </p:input>
</p:xslt>

</p:pipeline>

From the user's perspective, this simple pipeline represents a natural evolution from running XSLT directly to running a small pipeline that does XInclude and then runs XSLT. If we tell the user that, as a consequence of putting the XSLT step inside the pipeline, they can no longer pass any parameters to the XSLT step, I think it'll be very difficult to get users to migrate to XProc. If we tell them that they have to enumerate all six hundred or so parameters, I think that'll be a non-starter as well.

The story in the current working draft (dated 5 April 2007) is that parameters are lexically scoped, like options, and they are all passed to every step.

Although this is simple, there are two significant problems:

Always passing every in-scope parameter to every step might introduce unwanted behavior in some steps. There isn't enough granularity of control. It also makes it more complicated to pass the same parameter with two different values to two different steps.
It offers no mechanism for computing parameters dynamically. Given all the power we've provided to compute and manipulate inputs to steps, it seems a little odd that you can't compute the parameters you want to pass.

A lot of discussion took place and a range of proposals were considered. I'll leave the mail archives to tell the whole story and limit myself here to only what was decided.

First, we've added the notion of a parameter set, not unlike an XSLT attribute set:

<p:parameter-set name="dbparams">
  <p:parameter name="base.dir" value="/tmp/"/>
  <p:parameter name="html.ext" value=".htm"/>
</p:parameter-set>

Instead of using lexical scoping, we allow the step author to identify which parameter sets they want:

<p:xslt>
  <p:use-parameter-set name="dbparams"/>
  <p:input port="stylesheet">
    <p:document href="docbook.xsl"/>
  </p:input>
</p:xslt>
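
To see how this addresses the first problem above, here is a minimal sketch in which two XSLT steps get different values for the same parameter. The set names "web-params" and "archive-params" and the two values are invented for illustration; only the element names come from the examples above.

```xml
<!-- Hypothetical sketch: set names and values are invented for illustration. -->
<p:parameter-set name="web-params">
  <p:parameter name="html.ext" value=".html"/>
</p:parameter-set>

<p:parameter-set name="archive-params">
  <p:parameter name="html.ext" value=".htm"/>
</p:parameter-set>

<!-- Each step names the set it wants; neither sees the other's value. -->
<p:xslt>
  <p:use-parameter-set name="web-params"/>
  <p:input port="stylesheet">
    <p:document href="docbook.xsl"/>
  </p:input>
</p:xslt>

<p:xslt>
  <p:use-parameter-set name="archive-params"/>
  <p:input port="stylesheet">
    <p:document href="docbook.xsl"/>
  </p:input>
</p:xslt>
```

Under the lexically scoped story, both steps would have received the same in-scope html.ext value.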

Steps can use more than one parameter set and parameter sets can be combined by placing a p:use-parameter-set inside a p:parameter-set. What's more, you can place a p:document, p:pipe, or p:inline inside a p:parameter-set too. If you do that, you can load a set of parameters, possibly a set computed by some other step. (There's a small XML vocabulary for defining parameters that I won't attempt to describe here.)
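
A combined set might look something like the following sketch. The set name "common" and the file name "extra-params.xml" are hypothetical, and nothing here says how conflicting values from different sources would be resolved, so the sketch avoids overlapping names.

```xml
<!-- Hypothetical sketch: "common" and "extra-params.xml" are invented names. -->
<p:parameter-set name="common">
  <p:parameter name="base.dir" value="/tmp/"/>
</p:parameter-set>

<p:parameter-set name="dbparams">
  <!-- Pull in everything defined in the "common" set. -->
  <p:use-parameter-set name="common"/>
  <!-- Load further parameters from a document written in the
       parameter vocabulary (not described in this essay). -->
  <p:document href="extra-params.xml"/>
  <!-- Add a literal parameter alongside the others. -->
  <p:parameter name="html.ext" value=".htm"/>
</p:parameter-set>
```

A step that asks for "dbparams" would then see base.dir, html.ext, and whatever the document supplies.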

Now we have a way for pipeline authors to construct sets of parameters and use them on the steps where they're appropriate. So far so good. But what about parameters passed to the pipeline, for example, on the command line, through an API, or on a pipeline that's called from another pipeline?

The simplest answer to that question is to provide a name for those parameters. The special parameter set name “#pipeline-parameters” always refers to the set of parameters passed to the current p:pipeline.

In order to close the loop on allowing pipelines to perform computations on sets of parameters, we provide a p:parameters step which has no inputs and provides, on its single output, all of the parameters passed to it. This step uses the same XML vocabulary that p:parameter-set expects, so you can read them, filter them, and write them back out to a new set.
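
As a rough sketch of that round trip: read the pipeline's parameters, filter them with a stylesheet, and bind the result to a new set. Much of this is assumption rather than settled syntax: the step names, the "result" port name on p:parameters, the step and port attributes on p:pipe, the idea that p:parameters receives its parameters through p:use-parameter-set, and the stylesheet "filter-params.xsl" are all hypothetical.

```xml
<!-- Hypothetical sketch: step names, port names, and filter-params.xsl
     are assumptions, not syntax taken from any draft. -->

<!-- Emit the parameters passed to the pipeline as an XML document. -->
<p:parameters name="all-params">
  <p:use-parameter-set name="#pipeline-parameters"/>
</p:parameters>

<!-- Filter that document; the stylesheet decides what to keep. -->
<p:xslt name="filter">
  <p:input port="source">
    <p:pipe step="all-params" port="result"/>
  </p:input>
  <p:input port="stylesheet">
    <p:document href="filter-params.xsl"/>
  </p:input>
</p:xslt>

<!-- Read the filtered document back in as a named parameter set. -->
<p:parameter-set name="filtered-params">
  <p:pipe step="filter" port="result"/>
</p:parameter-set>
```

A later step could then ask for "filtered-params" with p:use-parameter-set instead of taking everything the caller passed.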

That's the whole story. It may not be the simplest possible story, but experience suggests it's the simplest thing we can think of that satisfies the two most significant requirements for arbitrary name/value pairs: control over the granularity of use and the ability to compute them dynamically.

So far unresolved is the question of what the default behavior should be. Is the pipeline in Example 2, “Pipeline with explicit parameters”, the same as the one in Example 1, “Simple XInclude and style pipeline”?

Example 2. Pipeline with explicit parameters

<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">
<p:input port="source"/>
<p:output port="result"/>

<p:xinclude/>

<p:xslt>
  <p:use-parameter-set name="#pipeline-parameters"/>
  <p:input port="stylesheet">
    <p:document href="docbook.xsl"/>
  </p:input>
</p:xslt>

</p:pipeline>

In other words, if no parameters are specified at all, should the pipeline parameters be inferred, or should no parameters be inferred?

The arguments in favor of inferring the pipeline parameters appeal to ease-of-use, syntactic simplicity, and user expectations. The arguments against appeal to safety, security, and user expectations.

I'm on the fence about this one, though I lean towards requiring the explicit p:use-parameter-set. Today, anyway.

Comments

Hi Norm

I'm not really convinced by the use case of the XSLT parameters. As I see the problem, XProc parameters and XSLT parameters are not the same thing. They resemble each other, but only by accident (well, they solve the same problem regarding their respective language).

I would rather say that the XSLT step can have a number of options: the input document, the stylesheet, the initial mode, the parameter set, etcetera. But I don't see why it would take the pipeline's parameters as its own. What if you have two XSLT steps that have incompatible parameters that share the same names?

In my humble opinion, I would rather see something like the following (I didn't follow the XProc story as closely as I would have liked, so please be lenient about possible stupidities):

<p:pipeline name="main" xmlns:p="http://www.w3.org/2007/03/xproc">

  <p:input port="source"/>
  <p:input port="own-params"/>
  <p:input port="db-params"/>
  <p:output port="result"/>

  <!--
    First, preprocess our own XML language instance to get
    Docbook instances.  User can set some params for the
    stylesheet (here, "params" is an XSLT concept).
  -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="own-to-db.xsl"/>
    </p:input>
    <p:input port="params" select="$own-params"/>
  </p:xslt>

  <!--
    Process XInclude.
  -->
  <p:xinclude/>

  <!--
    Then format the resulting Docbook document.  User can
    set some params for the stylesheet (here, "params" is an
    XSLT concept).
  -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="docbook.xsl"/>
    </p:input>
    <p:input port="params" select="$db-params"/>
  </p:xslt>

</p:pipeline>

I guess I misunderstood something, but I can't see what.

Anyway, I'm looking forward to listening to you at XML Prague in a few days. Regards,

--drkm

Hi Florent,

Good to see you in Prague! I think the parameter solution we arrived at is very much along the lines you propose. I'll write up a few more thoughts about XProc parameters when I get a chance.