It's been a few years since I first considered DITA specialization. I wonder if I missed the point? I think that might depend on the assumptions that I brought to the table.

One of the most interesting hallway conversations that I had at Balisage was with Eliot Kimber. He and I spent about two hours exploring the differences and similarities between DocBook and DITA.

Here's my synthesis of Eliot's position:

The only difference between DITA and DocBook is specialization, and specialization is why DITA is better.

I'll accept the first part of that observation without argument. To ordinary mortals, DITA and DocBook might look very different, but Eliot is as skilled a markup wrangler as you're likely to encounter. Assuming you worked out a mapping for whatever semantic ambiguity there might be in your corpus, given time and inclination, I'll grant that Eliot could convert anything into anything else. So in that broad sense, they're the same.

That just leaves specialization. I've historically not been impressed by specialization. But I've been wrong before.

In talking to Eliot and thinking about it afterwards, I've come to realize that I'm burdened by a particular set of assumptions, formed long ago, that may no longer usefully guide me through the real world.

One of the critical, motivating goals for adopting an XML (or SGML) publishing system in techpubs was the ability, when all was said and done, to demonstrate a lights-out, high-quality print publication system. You poured valid documents in at the top, and aesthetically pleasing, professionally typeset pages that adhered strictly to the organization's design and style guidelines came out the other end.

The way, the only way, that this was achieved was to start with quality tools (editors, formatters, typesetters) and customize each of them, perhaps significantly, until all the stakeholders signed off that the results were up to the required standards of information content, layout, design, and typography.

That was then. This is now?

  1. Lights-out publishing is still a requirement for some organizations, especially in core techpubs, but I think there's also evidence that tools like Adobe InDesign have relaxed this requirement for many more organizations. If you're going to pour the markup I send you through a visual tool and make even a light manual pass over the document, you can afford to be a lot more forgiving about the markup I send you. And a system that is “a little bit forgiving” is vastly, spectacularly easier to implement than one that is “not forgiving at all”.

  2. If the web has taught us anything, it's that quality hardly matters at all. Sturgeon's Law applies. We now routinely accept layouts that no self-respecting publisher would have had the temerity to propose years ago.

  3. And finally, this printing ink on dead trees at a thousand-plus DPI, who does that? It's a rare piece of software or hardware that comes with more than a pamphlet these days. Environmentally, that's probably a good thing, but it has done nothing to improve the quality (see previous point) of what's produced.

How is this related to specialization? It's related to specialization because specialization is about interchange. DocBook has always been about interchange: precise, carefully managed interchange.

Specialization is about blind interchange: I send you my documents, documents that contain markup you've never seen before, you run them through your normal toolchain, and the results are “good enough”.
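
To make that concrete, here's a minimal sketch of how a receiving toolchain might cope with an element it has never seen. It leans on the DITA convention that every element carries a class attribute listing its ancestry, most general type first. Everything else is an assumption made up for illustration: the `acme-d/checklist` element, the handler table, and the function names are hypothetical, and a real DITA processor does far more than this.

```python
# Sketch of fallback processing driven by a DITA-style class attribute.
# The handler table and the "acme-d/checklist" element are hypothetical.

def ancestry(class_attr):
    """Return the module/type tokens, most specialized first."""
    tokens = class_attr.split()
    # The first token is "-" (structural) or "+" (domain); skip it.
    return list(reversed(tokens[1:]))

def pick_handler(class_attr, handlers):
    """Return the first handler whose type appears in the ancestry."""
    for token in ancestry(class_attr):
        if token in handlers:
            return handlers[token]
    raise LookupError("no handler for " + class_attr)

# A receiver that has only ever seen base topic markup.
receiver_handlers = {
    "topic/ul": "render as bulleted list",
    "topic/p": "render as paragraph",
}

# The sender's specialized element, never seen by the receiver.
unknown = "+ topic/ul acme-d/checklist "
print(pick_handler(unknown, receiver_handlers))
```

The receiver never learns what a checklist is; it just walks back up the ancestry until it finds something it already knows how to process, which is where the “good enough” comes from.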

If you're carrying around the assumptions I outline above, blind interchange is a manifestly absurd notion. But if you relax your assumptions to perhaps more accurately reflect the twenty-first century, then maybe blind interchange becomes not just possible, but practical.

And maybe, just maybe, that makes specialization interesting.

Don't pay any attention to those creaking, scraping noises that you hear. That's just me rearranging my mental furniture. More to follow.


[1] No, really!

Comments:

Hmm. I still don't believe that DITA-style specialization is worth the overhead it adds. Also, what can you do with DITA-style specialization that you can't do with an attribute (e.g. class="checklist") and help from your authoring tool (i.e. the tool presents "checklist" as if it were an element you can insert even though it really inserts )?

Posted by David on 31 Aug 2010 @ 01:51pm UTC #

Yes, the overhead concerns me too. That's one of the essays to come.

Posted by Norman Walsh on 01 Sep 2010 @ 12:38pm UTC #

The overhead that specialization adds amounts to only slightly more than the overhead that a class adds to a document. True, there's the definition of the specialized object (whether it be a domain, element, or attribute) and the required processing to render the specialized object.

However, keeping in mind blind interchange, those costs are not borne by the receiver of the content because the default processing for the more generalized object comes into play. The bottom line is that the overhead is paid by the organization that requires the more specific content model and no other organization (unless they require the same specificity).

Posted by Julio Vazquez on 07 Sep 2010 @ 03:27pm UTC #

It seems to me that DITA specialization is about more than just interchange, since it provides a somewhat more formal structure for customization. Where DocBook makes it easy to add new elements to the syntax, DITA goes one further by allowing that element to declare the basis of its semantics, as well. Your DITA-aware toolchain (editor, publishing system, possibly your CMS) already knows the basic behaviors to apply to the element, from which you can then customize.

I think this structure also does more for interchange with DITA than you give it credit for. If I customize DocBook, I might not be able to send my documents to anyone else to work with without my customizations, since they won't be "standard DocBook" anymore, unless my customizations are strictly limited to just reduced content models. If I specialize DITA, I can always very easily produce "standard DITA", at the worst, or send just my DTD specialization module, and any other DITA-aware system will be able to do basic processing.

So, you say, this is just the "blind" interchange scenario. However, DITA specialization encourages creation and sharing of different types of specializations, such as industry-specific tag sets, which can be further specialized by individual companies. So, there's a good chance that the other party with whom I want to share my content will be able to process my content at a level higher than just standard DITA, since they're likely to be using, or at least have access to, that same industry-specific specialization.

This is certainly more than "blind" interchange, and while it may not be "20/20", it is still a step in the right direction. In short, DITA provides an architecture that encourages carefully-planned, downward-compatible customization. The architecture also provides excellent support for multiple layers of customization, each building on the last, so that different groups can interoperate at the highest layer shared by both parties.

Posted by Brandon Ibach on 16 Sep 2010 @ 12:40am UTC #
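
A rough sketch of the layered interchange described in the comment above, assuming the DITA convention of a class attribute that lists an element's ancestry from most general to most specific. The `industry-d/tasklist` and `acme-d/checklist` types and the `generalize` function are hypothetical names chosen for illustration; this is not how any particular DITA tool implements generalization.

```python
import xml.etree.ElementTree as ET

def generalize(root, known_types):
    """Rename each element to the most specialized type in known_types."""
    for elem in root.iter():
        # Drop the leading "-" or "+" marker; keep the module/type tokens.
        tokens = elem.get("class", "").split()[1:]
        for token in reversed(tokens):  # most specialized first
            if token in known_types:
                elem.tag = token.split("/")[1]
                break
    return root

doc = ET.fromstring(
    '<checklist class="+ topic/ul industry-d/tasklist acme-d/checklist ">'
    '<li class="- topic/li ">Check the seals</li>'
    '</checklist>'
)

# A partner who shares the (hypothetical) industry layer
# but not the company-specific layer.
partner = {"topic/ul", "topic/li", "industry-d/tasklist"}
print(generalize(doc, partner).tag)
```

With the partner's set of known types, the company-specific element is rewritten to the industry-level name; a receiver that knew only the base topic types would get a plain `ul` instead.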

I don't think you were wrong the first time around. At the end of the specialization piece you wrote:

"If experience with DocBook is indicative, and I think it is, very few users are ever going to make any customizations to the markup at all."

That has been my experience as well. Customizations are few and far between, pretty much regardless of the DTD or schema used. Of course, individual users might do them but organizations for the most part do not. As for specializations in DITA, I've regarded them as a fallback mechanism anyway, not something that's done regularly.

Then again, I could be wrong, too.

Posted by Ari Nordström on 22 Sep 2010 @ 11:03am UTC #

Having just worked with a fairly extensive specialization of the DITA learning specialization (in other words, customized learning content), I can say that the DITA architectural spec for fallback to topic elements is a real benefit. To be able to configure an XML editor to use this degree of customization in minutes using the customer's supplied plugin and catalog files along with DITA OT was fantastic. Since the XML applications are now bundling DITA OT and support for integrating specializations, it is easier than ever to immediately see the results of a topic map or build an output to show that the specialization is working properly.

So, you who have oodles of experience working with DocBook, how does this compare?

On the other hand, I pity the poor developer who created the additional DTDs, ents and mods, and made the plugin and catalog files that use them. This aspect of working with DITA seems hair-pulling to me and I look forward to the rapid development of tools that make it easy to generate specializations for DITA.

For both DITA and DocBook, I'd also like apps to create a very simple subset menu of the DTD (NOT a subset of the elements, just an easily configurable way to exclude sets of elements from an authoring environment) to reduce the cognitive load of finding the right elements in the DTD stack.

Posted by Dorothy Hoskins on 23 Sep 2010 @ 02:24pm UTC #

I question the statement "The only difference between DITA and DocBook is specialization, and specialization is why DITA is better." What about conref and keyref? Yes, DocBook has XInclude, but conref validates and keyref enables indirect addressing. How can I accomplish those things in DocBook?

Posted by Paul Eisenberg on 23 Oct 2010 @ 01:48am UTC #