DocBook XSD!

Volume 14, Issue 54; 22 Dec 2011

I believe that I've resolved the ambiguity problems evident in the previous release.

Armed with a recent Xerces, I was able to diagnose and resolve the ambiguity problems present in the previous release. At least, I was able to satisfy Xerces, which has a reputation for being quite picky. The ambiguity about ambiguity still exists in the spec, so this could be wrong as well.

Remarkably, the changes necessary to remove ambiguity were not that onerous.

  • In DocBook, the db:link group has been removed from db:general.inlines. Because db:link is also a member of db:ubiq.inlines, it was causing a UPA violation. This change shouldn't have any impact on the set of documents that are considered valid.

  • In the XSD 1.0 version of Publishers, the Dublin Core elements have been removed from the info element. They cause a UPA conflict with the xs:any that is supposed to allow metadata from other namespaces. The processContents setting on that xs:any has been changed from “skip” to “lax” so that Dublin Core elements (if present) will still be validated against their schema types. I think.

    I didn't change the XSD 1.1 schema because (again, I think) the rules for UPA have been softened in this case in XSD 1.1.
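
To make the first change concrete, here is a minimal sketch of the same kind of conflict. It is not taken from the DocBook XSD: the toy schema, the UpaSketch class, and its schema and compileError helpers are all made up for illustration, and the group names only echo the ones mentioned above. It assumes the validator that ships with the JDK (which derives from Xerces-J) and compiles the toy schema twice, once with link in both groups and once with link removed from general.inlines.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class UpaSketch {
    // Hypothetical toy schema, not the real DocBook XSD. When linkInGeneral is
    // true, "link" is reachable through both groups of the same choice, so a
    // <link> element cannot be attributed to a unique particle.
    static String schema(boolean linkInGeneral) {
        String generalLink = linkInGeneral ? "<xs:element ref='link'/>" : "";
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='link' type='xs:string'/>"
             + "<xs:element name='emphasis' type='xs:string'/>"
             + "<xs:group name='general.inlines'><xs:choice>"
             + generalLink
             + "<xs:element ref='emphasis'/>"
             + "</xs:choice></xs:group>"
             + "<xs:group name='ubiq.inlines'><xs:choice>"
             + "<xs:element ref='link'/>"
             + "</xs:choice></xs:group>"
             + "<xs:element name='para'><xs:complexType mixed='true'>"
             + "<xs:choice minOccurs='0' maxOccurs='unbounded'>"
             + "<xs:group ref='general.inlines'/>"
             + "<xs:group ref='ubiq.inlines'/>"
             + "</xs:choice></xs:complexType></xs:element>"
             + "</xs:schema>";
    }

    // Compiles the schema with the JDK's built-in XSD 1.0 processor.
    // Returns null if the schema compiles, otherwise the processor's message.
    static String compileError(String xsd) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("link in both groups: " + compileError(schema(true)));
        System.out.println("link in ubiq only:   " + compileError(schema(false)));
    }
}
```

With a Xerces-derived processor, the first schema should be rejected with a message that names the cos-nonambig constraint, while the second should compile. Other processors may draw the line differently, which is rather the point of the ambiguity about ambiguity.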

As a result, I think I now have XML Schemas for DocBook and Publishers. If your favorite tool balks on one of these, please let me know. Likewise, if you encounter documents that you think are valid DocBook V5.0 documents that these schemas reject, please let me know.

For the sake of completeness, I'd also be interested in documents that are not valid DocBook V5.0 documents but that these schemas accept. Note, however, that these schemas are not normative and they will not detect some validity problems, so such documents don't necessarily indicate a bug.

Comments

Be aware that Xerces-C has many bugs, and that many people cannot handle XSDs that X-C cannot process. I've run into this perhaps 3-4 times over the last year, where Trang generates XSDs that are perfectly valid but can't be handled by X-C. (XML Spy used to have this reputation, but I don't know if it's still true or not.)

—Posted by John Cowan on 22 Dec 2011 @ 08:58 UTC #

I meant Xerces-J. I've never used Xerces-C.

—Posted by Norman Walsh on 22 Dec 2011 @ 10:13 UTC #