As I've said, to the extent that there is a “which is better, DocBook or DITA?” debate, I'm trying to stay out of it. I'm not an impartial observer and I don't relish the role of “defender of DocBook”. Nor do I find any prospect of pleasure in “attacking DITA”. So mostly, I ignore the issue.
Edd Dumbill’s piece, “Lovely DITA, DocBook fades?”, has been floating on a tab in my browser all week. It's a fair and balanced essay, generally pretty favorable towards DocBook, title notwithstanding, so I've got nothing to complain about…except for the part where he says:
“DocBook is a fixed element and attribute set, DITA is extensible, allowing the definition of custom information types”
That's just flatly wrong. The part about DocBook, anyway. DocBook has always been one of the most extensible schemas around. We didn't put thousands of parameter entities in the DTD version just so we could watch early SGML parsers crash. Nor have we carefully and deliberately established extensible patterns throughout the RELAX NG version solely for our own amusement.
What Edd is appealing to here is DITA's notion of “specialization”. The marketing folks behind DITA products and consultancies have latched onto specialization as if it were some sort of silver bullet to slay the document interchange monster. A little web searching for “DITA specialization” will turn up plenty of hype.
I'm not saying it isn't useful, but let's consider what it really means. The idea behind specialization is that when I invent a new element, I declare what existing element it “specializes”. In theory, this declaration allows a tool processing my document to fall back to some default processing if it doesn't understand my specialization. The thing to remember is that the extent to which this fallback processing is useful or correct depends largely on the importance of the semantics of your new element.
Let's consider a case where it works reasonably well. Suppose I make lots of references to Wikipedia in my writing. Every time I make a Wikipedia reference, I type markup like this:
After a while, I get tired of this and decide that I'm going to craft an extension to simplify my life. Instead of writing it all out, I create a wikipedia element that specializes link. (Whether this particular specialization is possible in DITA, I don't know, but the principle stands regardless.) Now I can just write:
I'll customize my system to handle the wikipedia element, but if I send the document to you without my customizations, your system will fall back to link processing. That means you won't get exactly the right output, but it'll be pretty close. You'll get the right text, but not the link.
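To make that fallback concrete, here is a rough sketch in Python. It is only an illustration of the principle: the `SPECIALIZES` table, the handler functions, and the `href` attribute are all names I've made up for the example, and none of it reflects how DITA or DocBook tooling actually declares or resolves specializations.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration table: new element -> the element it specializes.
SPECIALIZES = {"wikipedia": "link"}


def render_link(elem):
    """Base handler: emit a hyperlink if there is an href, else just the text."""
    text = elem.text or ""
    href = elem.get("href")
    return f'<a href="{href}">{text}</a>' if href else text


def render_wikipedia(elem):
    """Custom handler: derive the article URL from the element's text."""
    text = elem.text or ""
    url = "https://en.wikipedia.org/wiki/" + text.replace(" ", "_")
    return f'<a href="{url}">{text}</a>'


def render(elem, handlers):
    """Render elem, walking up the specialization chain until a handler is found."""
    name = elem.tag
    while name is not None:
        if name in handlers:
            return handlers[name](elem)
        name = SPECIALIZES.get(name)
    return elem.text or ""  # no handler anywhere in the chain: bare text


doc = ET.fromstring("<wikipedia>DocBook</wikipedia>")

my_system = {"link": render_link, "wikipedia": render_wikipedia}
your_system = {"link": render_link}  # no wikipedia customization installed

print(render(doc, my_system))    # full link to the Wikipedia article
print(render(doc, your_system))  # falls back to link processing: text only
```

The second call is the “pretty close” case: the generic link handler has no address to work with, so the reader gets the right text but not the link.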
On the other end of the spectrum, suppose you want to add something with quite specific semantics. EBNF diagrams, for example. Here you're going to have to have elements like non-terminal. There's unlikely to be anything in the base schema with semantics that are even close to that, so you'll probably end up having to specialize paragraphs and phrases. If you take a quick look at what a typical EBNF diagram is supposed to look like, you'll have no difficulty imagining how completely unusable the fallback presentation of that markup is going to be.
I've already shown how the notion of specialization could easily be added to DocBook, and there might be some value in doing so, but it would be a foolish mistake to believe that such a feature was going to fix all your interchange problems. Interchange is a complex issue that needs to be approached thoughtfully.
And there's nothing about it that makes DITA more extensible than DocBook.
It's all mostly irrelevant anyway. If experience with DocBook is indicative, and I think it is, very few users are ever going to make any customizations to the markup at all. Sure, some big companies will hire consultants to craft customizations for them, but they're in the minority. Most users just grab the schema and start using it.