The DocBook Encoding Initiative or “TextBook”?
In a lot of ways, DocBook and the TEI are very similar. I spent most of today looking over the TEI Meta language and the constructs in DocBook and the TEI. Maybe it’s possible to design our schemas so that they can easily interoperate. In any event, a few touristy snaps of Oxford as well.
I think there’s always been some good-humored competition between DocBook and the TEI. Or maybe it’s just between Sebastian and myself; his invitation to join the TEI Meta working group came in a message with the subject “Working with the enemy?” It was an invitation that I accepted gladly.
In a lot of ways, DocBook and the TEI are very similar: they are both large, rich schemas designed for marking up textual documents in ways that convey semantic information relevant to a human reader. And while each has particular strengths, to a certain extent, both can stretch to fill the other’s shoes: many of the Oxford University Computing Services web pages are created from a TEI extension, and I’ve seen works on Islamic architecture authored in DocBook.
The working groups responsible for DocBook and the TEI are both considering how to migrate to their next major versions (coincidentally, both of which will be version 5). The groups are also, less coincidentally I think, both actively exploring RELAX NG as the natural schema language in which to express their designs.
Since I was already on this side of the pond for other business and a few days’ vacation, I jumped at the chance to come down to Oxford for a day and meet face-to-face.


So Sebastian Rahtz, Lou Burnard, and I spent most of today looking over the TEI Meta language and the markup constructs in DocBook and the TEI. As we pored over the schemas, we talked about the best way to express one constraint or another. The TEI is built from a true literate programming system, so there were some interesting issues to discuss.

One topic that we came back to many times was this: would it be possible to design our schemas so that they could easily import modules from each other? For example, could DocBook be structured so that TEI could easily import the GUI inlines if someone wanted to write a book about computer software in TEI? Or could TEI be structured so that DocBook could easily import markup for dictionary entries if someone wanted to write a dictionary of computer terms?
In a DTD world, I think this would have been practically impossible. But RELAX NG deals a lot more intelligently with extension; constructs like interleave can be used to extend patterns without completely redefining them. (In DTDs, only the first declaration of a parameter entity takes effect, so you really have to jump through hoops.)
With a little coordination, I think we’ll be able to define “link points,” where DocBook and the TEI share common pattern names. I believe that will make it possible to insert modules across schema boundaries at those points.
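To make the idea of a “link point” a little more concrete, here is a minimal sketch in RELAX NG XML syntax. The pattern names (inline.linkpoint, para.attlist) and the tiny content models are invented for illustration; they are not taken from either the DocBook or the TEI schema. A base grammar defines the shared names, and a second definition with a combine attribute extends them without redefining the original.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="para"/>
  </start>

  <!-- Base schema: a paragraph that refers to two shared pattern names -->
  <define name="para">
    <element name="para">
      <ref name="para.attlist"/>
      <zeroOrMore>
        <choice>
          <text/>
          <ref name="inline.linkpoint"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>

  <!-- Hypothetical link point: matches nothing until a module adds to it -->
  <define name="inline.linkpoint">
    <notAllowed/>
  </define>

  <!-- Base attribute list: no attributes yet -->
  <define name="para.attlist">
    <empty/>
  </define>

  <!-- Module contribution: add an inline as an extra alternative.
       In practice this definition would live in a separate module. -->
  <define name="inline.linkpoint" combine="choice">
    <element name="guimenuitem">
      <text/>
    </element>
  </define>

  <!-- Module contribution: interleave an optional attribute into the list -->
  <define name="para.attlist" combine="interleave">
    <optional>
      <attribute name="role"/>
    </optional>
  </define>
</grammar>
```

With both contributions combined, a para may carry a role attribute and contain guimenuitem, yet the original definition of para is untouched. If the two schemas agreed on names like these, a module from one could presumably be dropped into the other at the same points.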
Whose Namespace is This?
For inline elements and elements with very specific content models, it seems reasonable to use namespaces. So a DocBook guimenuitem might be a db:guimenuitem in the TEI (assuming we put DocBook in a namespace).
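As a rough sketch of what that could look like in an instance document: the fragment below puts a TEI p in one namespace and the DocBook GUI inlines in another. The namespace URIs are an assumption made for the sake of the example (this post does not commit to any), and the sentence itself is made up.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:db="http://docbook.org/ns/docbook">
  <!-- TEI paragraph; the db: prefix marks the imported DocBook inlines -->
  Choose <db:guimenuitem>Save As</db:guimenuitem> from the
  <db:guimenu>File</db:guimenu> menu.
</p>
```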
Now consider the following scenario. Suppose it’s possible to configure DocBook and the TEI so that DocBook can import the TEI module that supports markup for plays. It’s a stretch, but maybe you’re writing a book that uses dialog between two programmers to make your points.
At some point inside the dialog markup, you’re going to get to “paragraphs” and you’re going to want DocBook para elements, not TEI p elements. If DocBook and the TEI coordinate well, customizations like this will work just fine. The module for plays will interleave itself into the right patterns so that it can be used in DocBook and it will refer to some pattern for “paragraph content” that will be appropriately defined in each schema.
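Leaving namespaces aside for the moment, the kind of mixed markup this scenario implies might look something like the following. The sp and speaker elements are TEI’s speech markup, and para and filename are DocBook’s; the dialog and the file name are invented, and whether any real customization would validate exactly this is an assumption.

```xml
<sp>
  <speaker>First programmer</speaker>
  <!-- DocBook paragraph content inside a TEI speech -->
  <para>Did you remember to update <filename>catalog.xml</filename>?</para>
</sp>
<sp>
  <speaker>Second programmer</speaker>
  <para>I thought you were going to do that.</para>
</sp>
```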
What namespace should the play elements be in? Are they TEI elements with DocBook content, or are they DocBook elements isomorphic to the TEI elements? Should DocBook and TEI be in the same namespace?
(Maybe you disagree and think that after you switch to TEI, it’s TEI all the way to the leaves. That’s a defensible position, but I don’t think it’s as interesting or useful.)
Comments
The link to TEI at the top goes to "Tax Executives Institute". I assume you meant http://www.tei-c.org/
Indeed. A natural hazard of off-line composition, I suppose.
I think you spelled Sebastian's name wrong. Yes, write a book on TEI / Docbook interleaving...(grin) There are hard decisions to know what markup to use...I think it all boils down to tools... but what do I know. I'm using LaTeX currently due to the toolset and the finer granularity in this.
Les
Ack. Thanks, Les, and sorry, Sebastian. Fixed.
You spelled my name wrong too. You might also like to fix the URL for the OUCS website: it should be http://www.oucs.ox.ac.uk Nice pictures though.
Happy new year!
Lou
Errr. No. 1. Different purposes? 2. V.little overlap[docbook vs tei] 3. SR steals from charities.