<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" version="pto" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#">
<info>
<title>Reconfigurable RELAX NG Grammars</title>
<biblioid class="uri">http://norman.walsh.name/2003/09/15/reconfigrng</biblioid>
<volumenum>6</volumenum>
<issuenum>84</issuenum>
<pubdate>2003-09-15</pubdate>
<date>$Date: 2005-09-11 10:27:02 -0400 (Sun, 11 Sep 2005) $</date>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2003</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>RELAX NG is the future for DocBook. But getting a working RELAX NG
grammar is only a small part of the battle. We also need to satisfy the
requirements of a reasonable evolution path for DocBook.
It's going to be a challenge, but a fun one, I think!
</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#DocBook"/>
</info>

<para xml:id="p1">This essay is part of my ongoing exploration of what a
refactored DocBook might look like. As I've said before, these are my
thoughts as I hold them today. They're nobody else's and I reserve the
right to change my mind later.</para>

<para xml:id="p2">I'm convinced that RELAX NG is the future for DocBook. And
building a RELAX NG grammar for DocBook isn't hard. But the grammar
alone isn't going to satisfy the requirements as I see them. I want to
build a system that also meets the following requirements:</para>

<orderedlist>
<listitem>
<para xml:id="p3">It must be possible to generate DTDs and XML Schemas from the
RELAX NG grammar.
</para>
</listitem>
<listitem>
<para xml:id="p4">I want to take advantage of the additional expressive power of
RELAX NG to more accurately reflect the intended semantics of DocBook.
For example, there are several elements that have a pair of
attributes, like these:</para>

<screen>attribute class { "doi" | "isbn" | ... | "other" }
attribute otherclass { text }</screen>

<para xml:id="p5">The intended semantics are that if
<quote><literal>other</literal></quote>
is chosen for the <tag class="attribute">class</tag> attribute, the
actual value must be given in the
<tag class="attribute">otherclass</tag> attribute. With RELAX NG,
it's possible to express these co-constraints and I wish to do so.
There's an even more interesting case surrounding the use of titles either
inside or outside the appropriate info wrapper.</para>
</listitem>
</orderedlist>
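
<para>As a rough sketch, the co-constraint on those two attributes might be
written like this in the compact syntax. The pattern name
<literal>biblioid.class.attrib</literal> is made up for illustration, and the
enumeration is cut down to the two values named above:</para>

<screen># Hypothetical pattern name; not taken from any published DocBook grammar.
biblioid.class.attrib =
  # Either one of the enumerated classes (list abbreviated), with no otherclass...
  attribute class { "doi" | "isbn" }
  # ...or class="other", in which case otherclass becomes required.
  | (attribute class { "other" },
     attribute otherclass { text })</screen>

<para>Each branch of the choice carries its own definition of the
<tag class="attribute">class</tag> attribute, so
<tag class="attribute">otherclass</tag> is only ever allowed alongside
<quote><literal>other</literal></quote>.</para>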

<para xml:id="p6">It's pretty obvious that these two goals are in direct conflict
with each other. There's no way to express co-constraints like these
in a DTD or in W3C XML Schema. <application>Trang</application> does a nice
job of generalizing to make DTDs and XML Schemas out of RELAX NG grammars,
but I'm not sure it can practically be expected to unwind my intentions
in cases like these.</para>

<para xml:id="p7">That means building an automated transformation that
takes some generalization of DocBook and produces a RELAX NG grammar,
deterministic at least for the most part, that we can hand to
<application>Trang</application>.
</para>

<para xml:id="p8">Another requirement concerns the ease of use of
DocBook as a system, rather than the grammar in particular.</para>

<orderedlist continuation="continues">
<listitem>
<para xml:id="p9">I want users to be able to mix and match subsets and supersets with
relative ease. In the current DTD, there are all sorts of parameter entities
that allow one to make subsets or extensions, but that configurability isn't
exposed to the casual user.</para>

<para xml:id="p10">Quick! Write a subset of DocBook that doesn't have any trace of
<tag>msgset</tag> or <tag>refentry</tag>. Or, quick! Add a
new kind of admonition to DocBook called <tag>alert</tag> so
that it has the right content model and can appear everywhere that the
existing admonitions are allowed.</para>
</listitem>
</orderedlist>

<para xml:id="p11">RELAX NG already has facilities for grammar extension and
grammar redefinition, but I'm looking for something even easier.
(Maybe that's a mistake, maybe I shouldn't be going here.)</para>
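
<para>For comparison, here is roughly what those two tasks look like with
RELAX NG's own facilities. This is only a sketch: the file name
<filename>docbook.rnc</filename> and the pattern names
(<literal>msgset</literal>, <literal>refentry</literal>,
<literal>admonition.class</literal>, and
<literal>admonition.content</literal>) are assumptions about how the base
grammar might be organized, not a description of any real one.</para>

<screen># Hypothetical customization layer; the file and pattern names are assumed.
include "docbook.rnc" {
  # Redefinition: override the patterns so these elements can never match.
  msgset = notAllowed
  refentry = notAllowed
}

# Extension: add another alternative to an existing choice pattern...
admonition.class |= alert
# ...and define the new element with the shared admonition content model.
alert = element alert { admonition.content }</screen>

<para>That only works if you already know which patterns to override and
which to extend.</para>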

<para xml:id="p12">Frankly, I want something more like the
<link xlink:href="http://www.tei-c.org/pizza.html">TEI Pizza Chef</link>. Users
should be able to say, <quote>I want DocBook with HTML Tables but
without CALS Tables, <tag>msgset</tag> and its ilk, or
callouts.</quote> For some set of predefined modules, they should just
be able to push a button and get a subset like that.</para>

<para xml:id="p13">Naturally, whatever system we use for this has to be robust enough
to allow a more experienced user to configure additional modules this way.
</para>

<para xml:id="p14">I think it would be possible to take this idea too far. I don't
see any value in building a system that allows you to select each
individual element. There are just too many variations that won't be
useful. Do you really <emphasis>need</emphasis> to be able to trivially
select <tag>programlistingco</tag> without also getting
<tag>screenco</tag>? I don't think so.</para>

<para xml:id="p15">A proto-version of my initial exploration of these ideas is starting
to take shape. There's no distribution for it yet (and please don't ask; it's
just too early), but you can get it
<link xlink:href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/docbook/docbook/relaxng/">from CVS</link>. If you already have the
<link xlink:href="http://docbook.sourceforge.net/">DocBook</link> repository checked out, it's in the
<filename>docbook/relaxng</filename> directory.
</para>

<para xml:id="p16">I'll write up some notes about how it works tomorrow.</para>

<para xml:id="p17">Fair warning: you'll need <application>Make</application>,
<application>Saxon</application> (or your EXSLT-aware XSLT processor of choice),
<application>Trang</application>, and (optionally)
<application>Perl</application> installed and ready to go.</para>

<para xml:id="p18">In a nutshell: author in the compact syntax, leaving out the
DocBook idioms that can be machine-generated (like the role and common
attributes for every element); add annotations to describe complex
structures that we know will need to be rendered very differently for
deterministic languages; add annotations for easy modularity; convert
our modules to the XML syntax; combine the requested modules;
and massage lightly to build the final grammar.</para>

</essay>

