Let me say right up front that I'm an object oriented kinda guy. I have faith in the paradigm, I've drunk the kool-aid. I'm absolutely not suggesting that XML applications shouldn't be written using an object oriented style. To the contrary, most of the XML applications that I've written are object oriented from top to bottom.
What I am saying is that the constituent elements and attributes of an XML vocabulary are not generally related to each other by inheritance, nor do they naturally correspond to objects with any kind of precision.
I'm well aware that there are many applications where there is a natural correspondence between an object graph and an XML serialization of that graph. And there are really good tools like JAXB for effectively and efficiently marshaling and unmarshaling data in those cases. What I'm saying is that it's not generally true.
In particular, vocabularies like DocBook, which are predominantly mixed content, are designed for semantic markup of human-readable text, and need to provide considerable flexibility for customization by end users, should not be modeled as if there were some inheritance relationship between the elements, or as if one element were derived from another by some sort of extension or restriction.
Elements Aren't Derived
A class in an object oriented language can be thought of as defining a chunk of data that consists of a list of fields (property/value pairs) and a list of methods for accessing and manipulating those fields. (At this level of abstraction, we can ignore the details of encapsulation that have to do with the accessibility of fields and methods.)
When one class extends another it can add new fields and new methods (and it may be able to redefine the function body of existing methods), but it doesn't remove existing fields or radically alter the internal structure of the object.
The point being that a piece of code that knows how to handle an object of class X will automatically be able to handle objects of class X' (derived from X). There may be additional fields and methods provided by X' that are unknown, but that won't have any impact on the code that's only using the fields and methods defined in X.
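To make that concrete, here is a minimal sketch in Python. The names `X`, `XPrime`, `describe`, and `handle` are made up for illustration; nothing here comes from a real library.

```python
class X:
    """Base class: one field and one method."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"X({self.name})"


class XPrime(X):
    """Derived class: adds a field and a method, removes nothing."""

    def __init__(self, name, extra):
        super().__init__(name)
        self.extra = extra

    def describe_extra(self):
        return f"{self.describe()} + {self.extra}"


def handle(obj):
    """Written against X only; it never touches anything XPrime adds."""
    return obj.describe().upper()


base_result = handle(X("a"))
derived_result = handle(XPrime("a", "b"))  # works unchanged on the subclass
```

The `handle` function never has to know that `XPrime` exists. That's the property XML "derivations" tend not to have.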
In XML, things that sound like derivation often aren't. Consider paragraphs and formal paragraphs:
<para xml:id='p10'>This is a paragraph</para>
<formalpara>
  <title>Paragraph Title</title>
  <para xml:id='p11'>This is a formal paragraph.</para>
</formalpara>
If I told you that a formal paragraph was a paragraph with a title, you might imagine that a formal paragraph was an extension of a “normal” paragraph. But closer inspection reveals that it doesn't work that way: a formal paragraph contains a paragraph and a title (it's an aggregate); it doesn't extend the original paragraph.
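You can see the difference by pointing paragraph-handling code at both elements. This is only a sketch using Python's standard `xml.etree.ElementTree`; the helper `paragraph_info` is hypothetical, and the markup is the paragraph example from this section with the whitespace between tags removed.

```python
import xml.etree.ElementTree as ET

PARA = "<para xml:id='p10'>This is a paragraph</para>"
FORMALPARA = (
    "<formalpara>"
    "<title>Paragraph Title</title>"
    "<para xml:id='p11'>This is a formal paragraph.</para>"
    "</formalpara>"
)

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def paragraph_info(elem):
    """Code written for <para>: read its own text and its own xml:id."""
    return elem.text, elem.get(XML_ID)


para = ET.fromstring(PARA)
formal = ET.fromstring(FORMALPARA)

para_info = paragraph_info(para)      # finds the text and the id
formal_info = paragraph_info(formal)  # finds neither: it's a wrapper
inner_info = paragraph_info(formal.find("para"))  # must reach inside
child_names = [child.tag for child in formal]
```

Code that understands `para` learns nothing useful from a `formalpara` until it's taught to look inside it. That's containment, not substitutability.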
Similarly, you might imagine that the various sorts of lists are all derived from some common type, but that doesn't work either. In DocBook, for example, itemized and ordered lists are similar, but variable lists and segmented lists bear only a vague structural similarity to them.
What's more, even in cases where the structural similarities would make derivation sensible to the original designers, customizers often want to make changes that break the pattern. You might, for example, have chosen to derive itemized and ordered lists from some common supertype, but if a customizer wants to remove an attribute from ordered lists, they'll be breaking the derivation.
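Here's a rough sketch of what that breakage looks like if you did model the lists as classes. Everything in it is hypothetical: the class names, the choice of `spacing` as the shared attribute, and the idea that a customization is expressed as a subclass are all assumptions made for illustration.

```python
class ListBase:
    """Hypothetical common supertype for itemized and ordered lists."""

    def __init__(self, items, spacing="normal"):
        self.items = items
        self.spacing = spacing


class OrderedList(ListBase):
    """Ordered list: adds a numeration style to the shared fields."""

    def __init__(self, items, spacing="normal", numeration="arabic"):
        super().__init__(items, spacing)
        self.numeration = numeration


class CustomOrderedList(OrderedList):
    """A customizer's ordered list with the shared attribute removed."""

    def __init__(self, items, numeration="arabic"):
        super().__init__(items, numeration=numeration)
        del self.spacing  # the attribute no longer exists on this element


def summarize(lst):
    """Written against ListBase; assumes every list has `spacing`."""
    return f"{len(lst.items)} items, spacing={lst.spacing}"


ok_summary = summarize(OrderedList(["a", "b"]))

try:
    summarize(CustomOrderedList(["a", "b"]))
    broke = False
except AttributeError:
    broke = True  # the "derived" type no longer honors the base contract
```

Removing a field is exactly what class extension isn't supposed to do, and it's exactly what customizers routinely need.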
It just doesn't work.
Content Models Aren't Inherited
In general, most content models in vocabularies like DocBook are “bags of stuff”. There just isn't any sensible inheritance model for them. There are almost no elements about which you can say, “oh it's just like this other element except that it allows a couple of new elements”. And when you can say that, it's probably not the point.
And what I said before about customizers wanting to change things in ways that don't suit an inheritance model you might have concocted applies in spades to content models. There's almost no change that someone won't need to make. Sometimes customizers want to attack your content models with a machete, and sometimes with a glue gun.
You might argue that if it was all done right, the inheritance model would make the customizer's job easy. That, in fact, it's failure to do so that brings out the machetes and the glue guns. But I don't think that's true.
Customizers change content models to suit real business needs. And those needs aren't likely to fit neatly into your design. Especially when the customizers are adapting your vocabulary to uses you never imagined.
And they will. If you let them.