Everything in the universe goes by indirection. There are no straight lines.
On Tuesday afternoon, moments before I darted out of town for a couple of days, we voted to take the current XML Catalogs draft to Committee Specification. This is almost the penultimate step to becoming an OASIS Standard (voting membership willing, of course). If you've been using catalogs, that means it's time to get your deployment strategies in order. If you haven't, that means it's high time you did!
The “elevator speech” for
catalogs goes like this: all sorts of critical resources are identified by URI these
days (schemas, stylesheets, DTDs, RDF grammars, etc.). (That's a good thing! I wish we
had public identifiers for them as well, but that's a topic for another essay.)
As long as you're connected
to the net, everything works perfectly. But what happens when you're disconnected,
either because part of the net is down or because you've unplugged your laptop
and taken it to 30,000 feet for some transcontinental journey? Suddenly, the fact
that you need
http://www.example.com/some/uri to get your work
done is a frustrating complication. The problem often isn't that you don't have the
relevant document locally; the problem is that your system identifiers, schema location
attributes, and what have you all point to the web. Sure, you could change them all
to point to the local copies, but then it's a pain to share documents with colleagues
and you have to do this hacking over and over again. What you need is a transparent
way of remapping those identifiers to the appropriate local copies. And that's
exactly what XML Catalogs give you:
a flexible, transparent mapping of resources.
And the really good news: XML Catalogs are already widely supported. For example, almost all Java tools can use the resolver classes from the Apache “XML Commons” project and all Gnome “libxml”-based tools support them.
Let's look at two common scenarios:
Resolving Standard Resources
Suppose you have a few hundred DocBook documents lying around. You'd never dream of working with documents that weren't validated (you wouldn't, would you!?), so they all begin with the relevant document type declaration:
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
Every time you parse one of these documents, the parser goes out to the web and drags the whole DocBook DTD down from the OASIS site. Even with a fast connection, that probably takes a few seconds. And like I said, it only works at all if you can actually get to the OASIS web site.
All that downloading is probably entirely unnecessary. If you're editing
that many DocBook documents, it's likely that
you've got a copy of the DTD on your machine. For the sake of argument, let's
say it's stored in the location named by the uri attribute in the catalog below.
That means if you set up the following catalog (in practice, you probably want to map the system identifier as well, and maybe a few other things, like the entity sets, but for the sake of simplicity, and to keep the code listing narrow, I'm omitting those for the time being):
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public">
  <public publicId="-//OASIS//DTD DocBook XML V4.2//EN"
          uri="/docbook/4.2/docbookx.dtd"/>
</catalog>
and point your applications at it, all those net accesses for the DocBook DTD will transparently vanish. Everything will run faster, and it'll run just as well at 30,000 feet as it will plugged into the net on your desk.
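To make the mapping concrete, here is a minimal sketch in Python of the lookup a resolver performs for a public identifier. It is an illustration only, not how any particular resolver is implemented: the names `resolve_public` and `CATALOG_XML` are hypothetical, and the sketch ignores everything a real resolver also handles (system entries, the prefer attribute, identifier normalization, delegation, chained catalogs).

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Catalogs vocabulary, as used in the catalog above.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# The catalog from the text, inlined so the sketch is self-contained.
CATALOG_XML = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <public publicId="-//OASIS//DTD DocBook XML V4.2//EN"
          uri="/docbook/4.2/docbookx.dtd"/>
</catalog>
"""


def resolve_public(catalog_xml, public_id, system_id):
    """Return the catalog's uri for public_id, or system_id if nothing matches.

    Hypothetical helper: only exact matches on <public> entries are handled.
    """
    root = ET.fromstring(catalog_xml)
    for entry in root.iter("{%s}public" % CATALOG_NS):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    # No entry matched, so the original system identifier is used unchanged.
    return system_id


resolved = resolve_public(
    CATALOG_XML,
    "-//OASIS//DTD DocBook XML V4.2//EN",
    "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd",
)
unmapped = resolve_public(
    CATALOG_XML,
    "-//EXAMPLE//DTD Unknown//EN",  # made-up identifier, not in the catalog
    "http://www.example.com/some/uri",
)
```

With this catalog, the DocBook public identifier resolves to the local path, while an identifier the catalog doesn't mention falls through to its system identifier untouched. That fall-through is what makes the remapping transparent: documents that aren't covered behave exactly as they did before.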
Resolving A Development Resource
Catalogs can be really convenient not only for stable resources, but for development resources as well. Here's an example that I use every day.
I work on a large set of stylesheets for DocBook. I extend and customize the base stylesheets with fair regularity. The XML Catalogs specification, for example, is generated with a customization on top of the base DocBook stylesheets.
That means I have a lot of stylesheets that begin like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:import href="http://docbook.sourceforge.net/.../docbook.xsl"/>
  ...
They start by importing the base DocBook stylesheets from their public location (though I've abbreviated the URI in the example). The problem is, when I'm fixing bugs or doing development work, I don't really want to get the public version, I want to get my local development version.
I could change all the imports
to point to the local copies, but that'd be a maintenance nightmare.
If they pointed to local copies, they wouldn't work for anyone but me, so
I'd have to remember to make them all point to the public location before
sharing them with anyone else.
The solution is a “development catalog”:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public">
  <uri name="http://docbook.sourceforge.net/.../docbook.xsl"
       uri="/xsl/docbook/html/docbook.xsl"/>
</catalog>
This catalog maps the public URI to my local development copy. By using that catalog, I get to pretend that I've published my local version. (Every access to the public version is transparently mapped to my local development copy.)
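The "pretend I've published it" effect can be sketched the same way. The Python below is a rough illustration under stated assumptions, not the behavior of any specific tool: `load_uri_map` and `resolve_uri` are hypothetical helpers, only exact matches on uri entries are handled, and the "..." in the URI is the abbreviation from the listings, kept as-is rather than expanded.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Abbreviated public URI, exactly as it appears in the stylesheet listing.
PUBLIC_XSL = "http://docbook.sourceforge.net/.../docbook.xsl"

# A development catalog with one uri entry, inlined for the sketch.
DEV_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <uri name="http://docbook.sourceforge.net/.../docbook.xsl"
       uri="/xsl/docbook/html/docbook.xsl"/>
</catalog>
"""


def load_uri_map(catalog_xml):
    """Collect the catalog's <uri> entries into a {name: uri} dictionary."""
    root = ET.fromstring(catalog_xml)
    return {
        entry.get("name"): entry.get("uri")
        for entry in root.iter(f"{{{CATALOG_NS}}}uri")
    }


def resolve_uri(uri_map, href):
    """Return the mapped location for href, or href itself if unmapped."""
    return uri_map.get(href, href)


dev_map = load_uri_map(DEV_CATALOG)
with_dev_catalog = resolve_uri(dev_map, PUBLIC_XSL)  # local development copy
without_catalog = resolve_uri({}, PUBLIC_XSL)        # public location
```

The stylesheet itself never changes. Whether the import lands on the development copy or the public one depends only on which catalog is in effect, which is why the same files keep working for everyone else.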
Catalogs solve problems for me every day. They can solve problems for you too.