notAllowed?

Volume 9, Issue 105; 27 Oct 2006; last modified 08 Oct 2010

On customizing DocBook and the importance of sometimes being optionally not allowed.

Huge numbers of people use DocBook straight out of the box, without ever customizing it or changing anything. That's hardly surprising; if you're writing technical documentation, chances are good DocBook covers everything you need. Even if there are a couple of places where some small changes would make it fit your particular needs even better, the cost of designing the schema customization, augmenting the processing tools, and training your authors, who probably already know stock DocBook, is often not worth the effort.

But suppose it is worth it, and you do want to make some changes. One thing you can do that's essentially free is remove optional stuff. As long as that's all you do, you're making a subset, and you don't have to change anything except the schema. (Your authoring tool is presumably driven off the schema, hopefully the RELAX NG one, and the stylesheets and processing tools don't have to be changed for a subset.)

Subsetting makes the schema smaller and often simplifies authoring UIs. This is particularly true of attributes because they're often presented in some sort of drop-down list. The first attributes to fall under the knife when I'm doing this sort of thing are the effectivity attributes. The writing I do doesn't tend to need the profiling that they support, and removing them knocks 10 attributes off every element in one fell swoop.

In a DocBook V5.0 customization layer, removing things is easy. Just make them notAllowed:

db.effectivity.attributes = notAllowed

Except, after you do this, validation produces some odd messages, like this one:

$ msv db.rng test.xml 
start parsing a grammar.
validating test.xml
Error at line:1, column:48 of file:///home/ndw/scratch/test.xml
  element "article" was found where no element may occur

the document is NOT valid.

Of course, the first time you encounter this, it's likely to be in a customization layer that does a bit more than lop off the effectivity attributes, so it can take a while to figure out what's wrong.

But the effectivity attributes are what's wrong. Can you see why? Maybe this will help:

db.common.attributes =
   db.xml.id.attribute?
 & db.version.attribute?
 & db.xml.lang.attribute?
 & db.xml.base.attribute?
 & db.remap.attribute?
 & db.xreflabel.attribute?
 & db.revisionflag.attribute?
 & db.dir.attribute?
 & db.effectivity.attributes

See it now?

The problem is that the effectivity attributes are a required part of the common attributes (common attributes are ones that appear on every DocBook element). So when you make the effectivity attributes “notAllowed”, you've created an unsatisfiable pattern: every element must include something which is not allowed.
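The same failure can be reproduced in a few lines outside DocBook. This is only an illustrative sketch: the names doc, common.attributes, and extra.attributes are made up for the example and do not appear in the DocBook schema.

```rnc
# Toy grammar (hypothetical names) that reproduces the problem.
start = element doc { common.attributes, text }

common.attributes =
   attribute id { text }?
 & extra.attributes          # required reference: note there is no "?"

# Nothing can ever match notAllowed, and an interleave with a
# notAllowed member can never match either, so common.attributes
# is unsatisfiable and no <doc> element can be valid.
extra.attributes = notAllowed
```

With a grammar like this, even an empty doc element is rejected, because the element's content requires a match for a pattern that nothing can match.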

The consequence is that no element pattern in the schema can possibly match any DocBook element. To fix this, make the effectivity attributes optionally not allowed. (You may have noticed that this could also be fixed by making the db.effectivity.attributes pattern optional in the common attributes. But if you did that, it would be much harder to create a customization layer that required one of the effectivity attributes. Not that I see a lot of point in doing that, as it happens.) Here is the fix:

db.effectivity.attributes = notAllowed?
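For context, here is a minimal sketch of how that one-line override might sit in a complete customization layer. The file name docbook.rnc is an assumption; point the include at wherever your copy of the stock DocBook V5.0 RELAX NG compact schema lives.

```rnc
# Customization layer sketch. "docbook.rnc" is an assumed path to the
# stock DocBook V5.0 schema in RELAX NG compact syntax.
include "docbook.rnc" {
   # Overrides the definition in the included schema.
   # "notAllowed?" means (notAllowed | empty), so the required
   # reference in db.common.attributes can still be satisfied.
   db.effectivity.attributes = notAllowed?
}
```

RELAX NG defines p? as a choice between p and empty, and a choice with one notAllowed branch simplifies to its other branch, so the overridden pattern ends up matching exactly "no attributes here".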

There:

$ msv db.rng test.xml 
start parsing a grammar.
warnings are found. use -warning switch to see all warnings.
validating test.xml
the document is valid.

That's better. (Thanks to Scott Hudson for providing the real-life example that actually brought this to my attention.)

Comments

Thanks, Norm! As always your expertise is extremely valuable. DocBook v5.0 is so easy to customize. I can't wait until we make this baby official!

—Posted by Scott Hudson on 27 Oct 2006 @ 09:09 UTC #

Hi Norm, I personally used the empty pattern for "disabling" existing attributes; see http://xmlguru.cz/2006/03/notallowed-or-empty. But I'm not sure which approach (notAllowed? or empty) is better.

—Posted by Jirka Kosek on 29 Oct 2006 @ 01:45 UTC #