<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet href="/style/browser.xsl" type="text/xsl"?>
<essay xmlns="http://docbook.org/ns/docbook"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
       xmlns:dc='http://purl.org/dc/elements/1.1/'
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:gal='http://norman.walsh.name/rdf/gallery#'
       version="pto">
<info>
<title>Caching in with Resolvers</title>
<volumenum>6</volumenum>
<issuenum>31</issuenum>
<pubdate>2003-06-05</pubdate>
<date>$Date: 2007-02-05 14:06:50 -0500 (Mon, 05 Feb 2007) $</date>
<author><personname>
<firstname>Norman</firstname><surname>Walsh</surname>
</personname></author>
<copyright><year>2003</year><holder>Norman Walsh</holder></copyright>
<abstract>
<para>XML Catalogs is now a Committee Specification. We're well on our way
to OASIS Standard, I think, and that means it's time to get your deployment
strategies in order.</para>
</abstract>
</info>
<epigraph>
<attribution>Emerson</attribution>
<para xml:id='p1'>Everything in the universe goes by indirection.
There are no straight lines.</para>
</epigraph>

<para xml:id='p2'>On Tuesday afternoon, moments before I darted out of town for a
couple of days, we voted to take the current <citetitle>XML
Catalogs</citetitle> draft to
<link xlink:href="http://www.oasis-open.org/committees/process.php#approval_spec">Committee
Specification</link>. This is one of the last steps on the way to
becoming an OASIS Standard (voting membership willing, of course). If
you've been using catalogs, that means it's time to get your
deployment strategies in order. If you haven't, that means it's high
time you did!</para>

<para xml:id='p3'><link xlink:href="http://www.oasis-open.org/committees/download.php/2384/cs-entity-xml-catalogs-1.0.html">XML Catalogs</link> is the principal work product of the
<link xlink:href="http://www.oasis-open.org/">OASIS</link>
<link xlink:href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=entity">Entity
Resolution Technical Committee</link>, chaired by the inimitable
<personname><firstname>Lauren</firstname><surname>Wood</surname></personname>,
and for which I am the humble editor.</para>

<para xml:id='p4'>The <quote>elevator speech</quote> for
catalogs goes like this: all sorts of critical resources are identified by URI these
days<footnote><para xml:id='p5'>That's a good thing! I wish we had public identifiers
for them as well; but that's a topic for another essay.</para></footnote>
(schemas, stylesheets, DTDs, RDF grammars, etc.). As long as you're connected
to the net, everything works perfectly. But what happens when you're disconnected,
either because part of the net is down or because you've unplugged your laptop
and taken it to 30,000 feet for some transcontinental journey? Suddenly, the fact
that you need <filename>http://www.example.com/some/uri</filename> to get your work
done is a frustrating complication. The problem often isn't that you don't have the
relevant document locally; the problem is that your system identifiers, schema location
attributes, and what have you all point to the web. Sure, you could change them all
to point to the local copies, but then it's a pain to share documents with colleagues
and you have to do this hacking over and over again. What you need is a transparent
way of remapping those identifiers to the appropriate local copies. And that's
exactly what XML Catalogs<indexterm><primary>XML Catalogs</primary>
</indexterm><indexterm><primary>Catalogs</primary><secondary>XML</secondary>
<see>XML Catalogs</see></indexterm> give you:
a flexible, transparent mapping of resources.
</para>

<para xml:id='p6'>And the really good news: XML Catalogs are already
<link xlink:href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=entity#tools">widely
supported</link>. For example, almost all Java tools can use the
resolver classes from the
Apache<indexterm><primary>Apache</primary></indexterm> <quote>XML
Commons</quote><indexterm><primary>Apache</primary><secondary>XML
Commons</secondary></indexterm> project and all
Gnome<indexterm><primary>Gnome</primary></indexterm>
<quote>libxml</quote>-based<indexterm><primary>Gnome</primary>
<secondary>libxml</secondary></indexterm> tools support them.</para>

<para xml:id='p7'>Let's look at two common scenarios:</para>

<section xml:id='s1'>
<title>Resolving Standard Resources</title>

<para xml:id='p8'>Suppose you have a few hundred
<application>DocBook</application> documents lying around. You'd never
dream of working with documents that weren't validated (you wouldn't,
would you!?), so they all begin with the relevant document type
declaration:</para>

<programlisting><![CDATA[<!DOCTYPE book
  PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">]]></programlisting>

<para xml:id='p9'>Every time you parse one of these documents, the parser goes out to
the web and drags the <emphasis>whole</emphasis> DocBook DTD down from
the OASIS site. Even with a fast connection, that probably takes a few seconds.
And like I said, it only works at all if you can
actually <emphasis>get</emphasis> to the OASIS web site.</para>

<para xml:id='p10'>All that downloading is probably entirely unnecessary. If you're editing
that many DocBook documents, it's likely that
you've got a copy of the DTD on your machine. For the sake of argument, let's
say it's stored in <filename>/docbook/4.2/</filename>.</para>

<para xml:id='p11'>That means if you set up the following catalog<footnote><para xml:id='p12'>In practice,
you probably want to map the system identifier as well, and maybe a few other
things, like the entity sets, but for the sake of simplicity (and to keep
the code listing narrow) I'm omitting those for the time being.</para></footnote>:</para>

<programlisting><![CDATA[<catalog
  xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
  prefer="public">

  <public publicId="-//OASIS//DTD DocBook XML V4.2//EN"
          uri="/docbook/4.2/docbookx.dtd"/>
</catalog>]]></programlisting>

<para xml:id='p13'>and point your applications at it, all those net accesses for the DocBook
DTD will transparently vanish. Everything will run faster, and it'll run just
as well at 30,000 feet as it will plugged into the net on your desk.</para>
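
<para>The catalog above maps only the public identifier. As a minimal
sketch of a fuller version, the following adds a <tag>system</tag> entry for
the system identifier and a <tag>rewriteSystem</tag> entry for anything else
requested from the same remote directory. Both entries are illustrative
additions, and the local paths assume the same
<filename>/docbook/4.2/</filename> directory used above:</para>

<programlisting><![CDATA[<catalog
  xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
  prefer="public">

  <!-- Same mapping as before: match on the public identifier -->
  <public publicId="-//OASIS//DTD DocBook XML V4.2//EN"
          uri="/docbook/4.2/docbookx.dtd"/>

  <!-- Also match when only the system identifier is available -->
  <system
    systemId="http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
    uri="/docbook/4.2/docbookx.dtd"/>

  <!-- Assumption: the local directory mirrors the remote one,
       so any other system identifier under it can be remapped
       by swapping the prefix -->
  <rewriteSystem
    systemIdStartString="http://www.oasis-open.org/docbook/xml/4.2/"
    rewritePrefix="/docbook/4.2/"/>
</catalog>]]></programlisting>

<para>The <tag>system</tag> entry covers tools that pass along only a system
identifier; the <tag>rewriteSystem</tag> entry is a catch-all and only helps
if the local copy really does have the same layout as the remote
one.</para>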
</section>

<section xml:id='s2'>
<title>Resolving a Development Resource</title>

<para xml:id='p14'>Catalogs can be really convenient not only for stable resources,
but for development resources as well. Here's an example that I use
every day.</para>

<para xml:id='p15'>I work on a large set of stylesheets for DocBook. I extend and
customize the base stylesheets with fair regularity. The XML Catalogs
specification, for example, is generated with a customization on top
of the base DocBook stylesheets.</para>

<para xml:id='p16'>That means I have a lot of stylesheets that begin like this:</para>

<programlisting><![CDATA[<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:import
  href="http://docbook.sourceforge.net/.../docbook.xsl"/>

  ...]]></programlisting>

<para xml:id='p17'>They start by importing the base DocBook stylesheets from their public
location (though I've abbreviated the URI in the example). The problem is,
when I'm fixing bugs or doing development work, I don't really want to get
the public version, I want to get my local development version.</para>

<para xml:id='p18'>I could change all the <tag class="attribute">href</tag> attributes
to point to the local copies, but that'd be a maintenance nightmare.
If they pointed to local copies, they wouldn't work for anyone but me, so
I'd have to remember to make them all point to the public location before
each release.</para>

<para xml:id='p19'>The solution is a <quote>development catalog</quote>:</para>

<programlisting><![CDATA[<catalog
  xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
  prefer="public">

  <uri name="http://docbook.sourceforge.net/.../docbook.xsl"
       uri="/xsl/docbook/html/docbook.xsl"/>
</catalog>]]></programlisting>

<para xml:id='p20'>This catalog maps the public URI to my local development copy.
By using that catalog, I get to pretend that I've published my local
version. (Every access to the public version is transparently mapped to my
local development copy.)</para>
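
<para>A single <tag>uri</tag> entry covers only the one file it names. If
your customizations import other stylesheets from the same public
directory, a <tag>rewriteURI</tag> entry can remap all of them by prefix.
This is a hedged sketch rather than my actual catalog: the start string
keeps the same abbreviated URI as the <tag>xsl:import</tag> shown earlier,
and it assumes the local <filename>/xsl/docbook/html/</filename> directory
mirrors the public one:</para>

<programlisting><![CDATA[<catalog
  xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
  prefer="public">

  <!-- The "..." is the same abbreviation used in the import,
       not a literal path. Every URI that begins with the start
       string has that prefix swapped for the local directory. -->
  <rewriteURI
    uriStartString="http://docbook.sourceforge.net/.../"
    rewritePrefix="/xsl/docbook/html/"/>
</catalog>]]></programlisting>

<para>With this entry, a request for
<filename>http://docbook.sourceforge.net/.../docbook.xsl</filename> still
resolves to <filename>/xsl/docbook/html/docbook.xsl</filename>, and so does
any sibling file, without listing each one.</para>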

<para xml:id='p21'>Catalogs solve problems for me every day. They can solve problems
for you too.
</para>

</section>
</essay>
