<?xml version="1.0" encoding="UTF-8"?>
<essay xml:lang="en" version="5.0" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:gal="http://norman.walsh.name/rdf/gallery#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
<info>
<title>Implementing AtomPub</title><biblioid class="uri">http://norman.walsh.name/2009/01/23/atompub</biblioid>
<volumenum>12</volumenum>
<issuenum>3</issuenum>
<pubdate>2009-01-23T08:32:49-05:00</pubdate>
<author>
      <personname>
<firstname>Norman</firstname>
	<surname>Walsh</surname>
</personname>
    </author>
<copyright>
      <year>2009</year>
      <holder>Norman Walsh</holder>
    </copyright>
<abstract>
<para>A few weeks ago, I decided to build a conformant AtomPub server
implementation on MarkLogic Server. Mostly for fun, but partly with an eye
towards using it for some future reimplementation of this weblog. In any
event, it's up and running on my test server.</para>
</abstract>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#Atom"/>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#Programming"/>
<dc:subject rdf:resource="http://norman.walsh.name/knows/taxonomy#TheWeb"/>
</info>

<para xml:id="p1">Implementing
<link xlink:href="http://tools.ietf.org/html/rfc5023">AtomPub</link>
in
<link xlink:href="http://www.marklogic.com/product/marklogic-server.html">MarkLogic
Server</link> was a fun little project. The first 90% of the exercise took about
two days; the remaining 10% took about a week and a half. Such is the way of
fun little projects.
</para>

<para xml:id="p2">The executive summary: dead easy to implement in MarkLogic Server.
I built a flexible, conformant AtomPub server in less than a thousand lines of
<wikipedia>XQuery</wikipedia>. When I get a chance, I'll write up some documentation
for it and put it on the <link xlink:href="http://xqzone.marklogic.com/">Mark
Logic Developer Network</link>.</para>

<para xml:id="p3">The only tricky part, really, was getting the security right.
But when isn't it tricky to get security right?</para>

<para xml:id="p4">It's very convenient in a lot of applications to rely on
“application level” security<footnote>
      <para xml:id="p5">Note that I said
“convenient”. I didn't say “wise” or “best”.</para>
    </footnote>.
You give all your XQuery code full privileges to the whole system
and rely on your coding skills to manage access. This is very
flexible and convenient, but it doesn't work for AtomPub.</para>

<para xml:id="p6">AtomPub clients expect to use
<wikipedia page="Hypertext_Transfer_Protocol">HTTP</wikipedia>
authentication to gain access
to the server, so that's what you have to provide. For a human user
on a web browser, you might implement a floating, “web 2.0” style
login box (or its <wikipedia page="Web_accessibility">accessible</wikipedia>
equivalent). For a machine operating
over a wire protocol, you have to issue the proper HTTP authentication
challenges and accept the credentials sent in response.</para>
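
<para>As a rough illustration of that challenge and response (in Python,
not the XQuery of this implementation), the sketch below answers a request
that lacks valid credentials with a <literal>401</literal> and a
<literal>WWW-Authenticate</literal> header. It assumes HTTP Basic
authentication only because that is the simplest scheme to show; the
<literal>USERS</literal> table, the realm name, and the
<function>authenticate</function> function are all hypothetical.</para>

<programlisting language="python"><![CDATA[import base64

# Hypothetical credentials and realm, for illustration only.
USERS = {"alice": "s3cret"}
REALM = "AtomPub"


def authenticate(headers):
    """Return (status, response_headers) for a dict of request headers."""
    challenge = {"WWW-Authenticate": 'Basic realm="%s"' % REALM}
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic" or not token:
        # No usable credentials: reply with a challenge, not a login page.
        return 401, challenge
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 as well as bytes that are not UTF-8.
        return 401, challenge
    user, sep, password = decoded.partition(":")
    if sep and USERS.get(user) == password:
        return 200, {}
    return 401, challenge
]]></programlisting>

<para>A client that receives the <literal>401</literal> is expected to repeat
the request with an <literal>Authorization</literal> header. A production
server would also compare passwords in constant time and run over TLS.</para>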

<para xml:id="p7">Generally speaking, what this means is that you have to provide two
URIs for each resource on the server: one URI provides read-only, public
access; the other provides authenticated read-write access.</para>

<para xml:id="p8">If you're developing on an <wikipedia page="Apache_HTTP_Server">Apache</wikipedia> server (and I assume the same
is true for a lot of other servers), it's often convenient to do this by
hacking the path component and using <filename>.htaccess</filename>
files. So, for example, <uri>http://example.com/path/to/entry</uri> is
available to anyone, and <uri>http://example.com/edit/path/to/entry</uri> is
the same entry protected by authentication.</para>

<para xml:id="p9">In the context of MarkLogic Server, the most straightforward way to
do this is with two application servers running against the same database.
You can see this in my test environment.
The server at
<link xlink:href="http://microwave.homedns.org:8600/"/> requires no authentication
but also has no privileges to edit any files on the server. The server
at
<link xlink:href="http://microwave.homedns.org:8601/"/> requires authentication
and users who successfully authenticate have privileges to edit their
documents.</para>
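
<para>The two-server arrangement can be pictured as a small policy table.
The Python sketch below is only a model: the port numbers come from the
description above, but the <literal>SERVERS</literal> table, the
<function>status_for</function> function, and the choice of
<literal>403</literal> for a refused write are assumptions, not the
behavior of the actual servers.</para>

<programlisting language="python"><![CDATA[# Ports are taken from the text; everything else here is an assumption.
READ_METHODS = {"GET", "HEAD"}
SERVERS = {
    8600: {"auth_required": False, "writable": False},  # public, read-only
    8601: {"auth_required": True, "writable": True},    # authenticated
}


def status_for(port, method, authenticated=False):
    """Return the HTTP status this toy policy gives a request."""
    server = SERVERS[port]
    if server["auth_required"] and not authenticated:
        return 401
    if method not in READ_METHODS and not server["writable"]:
        return 403
    return 200
]]></programlisting>

<para>Both entries describe the same database; only the access policy
differs by port.</para>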

<section xml:id="how">
<title>How does it work?</title>

<para xml:id="p10">There are basically six modules plus a little ancillary code. An
incoming request is caught by the <filename>error-handler.xqy</filename> module
and dispatched appropriately by calling functions in the
<filename>atompub.xqy</filename> module. HTTP PUT and POST are handled
by separate modules (with the uninspired names <filename>put.xqy</filename> and
<filename>post.xqy</filename>). This allows the actual code run for PUT and
POST to be configured on a per-feed basis, because flexibility is a good thing.
Each of these modules calls <filename>validate.xqy</filename> to determine
if the incoming content is acceptable. Again, this is a separate module for
flexibility. A <filename>format.xqy</filename> module is invoked when a GET
is made against the “alternate” link of an entry.</para>
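
<para>The shape of that dispatch can be sketched in Python. This is a
simplified model under stated assumptions, not a translation of the XQuery
modules: the function names, the <literal>FEED_HANDLERS</literal> registry,
and the status codes are hypothetical, and only GET, PUT, and POST are
modeled.</para>

<programlisting language="python"><![CDATA[def validate(body):
    """Stand-in for validate.xqy: accept any non-blank body."""
    return bool(body.strip())


def default_post(feed, body):
    """Stand-in for post.xqy: validate, then create an entry."""
    if not validate(body):
        return 400, "invalid content"
    return 201, "created entry in " + feed


def default_put(feed, body):
    """Stand-in for put.xqy: validate, then update an entry."""
    if not validate(body):
        return 400, "invalid content"
    return 200, "updated entry in " + feed


DEFAULT_HANDLERS = {"POST": default_post, "PUT": default_put}
FEED_HANDLERS = {}  # hypothetical per-feed overrides: feed -> {method: handler}


def dispatch(method, feed, body=""):
    """Route a request to the handler configured for its feed."""
    if method == "GET":
        return 200, "contents of " + feed
    handlers = dict(DEFAULT_HANDLERS)
    handlers.update(FEED_HANDLERS.get(feed, {}))
    handler = handlers.get(method)
    if handler is None:
        return 405, "method not allowed"
    return handler(feed, body)
]]></programlisting>

<para>Registering a different function under
<literal>FEED_HANDLERS["photos"]["POST"]</literal>, for example, changes how
that one feed accepts new entries without touching the others.</para>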

<para xml:id="p11">Out of the box, validation and formatting are designed to work with
plain text or (X)HTML entries. One of my longer-term goals is to reimplement
this weblog on top of MarkLogic Server. When I do that, I'll customize my
weblog to validate and format the DocBook extension that I use for authoring.
</para>

<para xml:id="p12">Security-wise, there's a <literal>joepublic</literal> user with the
<literal>weblog-reader</literal> role. That's the default user on the server
on port 8600. The <literal>weblog-reader</literal> role grants just enough
privileges to run the AtomPub code.</para>

<para xml:id="p13">Each user that's created has three roles:
<literal>weblog-reader</literal>, <literal>weblog-editor</literal>,
and
<literal>weblog-editor-<replaceable>username</replaceable></literal>.
The <literal>weblog-editor</literal> role identifies the user as an
editor while the
<literal>weblog-editor-<replaceable>username</replaceable></literal> role
gives them the URI privilege necessary to write to their part of the database.
(This is what prevents you from logging in as an editor and then writing
entries in my part of the database.)</para>
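
<para>A small Python model may make the role scheme concrete. The role
names are the ones listed above; the assumption that each user's part of the
database lives under <literal>/<replaceable>username</replaceable>/</literal>
is purely illustrative, as are the function names.</para>

<programlisting language="python"><![CDATA[EDITOR_PREFIX = "weblog-editor-"


def roles_for(username):
    """The three roles given to each newly created user."""
    return ["weblog-reader", "weblog-editor", EDITOR_PREFIX + username]


def can_write(roles, uri):
    """True if some per-user role covers the URI being written."""
    # Assumption: weblog-editor-<username> grants writes under /<username>/.
    for role in roles:
        if role.startswith(EDITOR_PREFIX):
            owner = role[len(EDITOR_PREFIX):]
            if owner and uri.startswith("/" + owner + "/"):
                return True
    return False
]]></programlisting>

<para>Under this model, <literal>weblog-editor</literal> alone never grants a
write; only the per-user role does.</para>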

<section xml:id="useradmin">
<title>User Administration</title>

<para xml:id="p14">The final detail, and honestly the last 10% that took
the other 90% of the time, is the “admin” interface, such as it is.
You can create an account on the server by filling out the form on the
homepage and following the link that will be emailed to you.</para>

<para xml:id="p15">If you've really been paying attention, you'll note that these
admin tasks run on port 8600, which I earlier said had only read-only
access to the server. So how can it create new accounts?</para>

<para xml:id="p16">The answer is that the server's security API is sufficiently
powerful that I can amplify the privileges of an individual function.
These “amped” functions allow the application author to provide additional
privileges in a very localized fashion. So there are two functions
(<function>request-user</function> and <function>create-user</function>)
that can edit the configuration file even when run by
ordinary mortals (specifically <literal>joepublic</literal>).
(And a special thanks to <personname>
	  <firstname>Danny</firstname>
<surname>Sokolsky</surname>
	</personname>, our Technical Documentation Manager,
for guiding me through an embarrassingly long series of bone-headed attempts
on my part to get this working correctly.)</para>
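
<para>The idea of amplifying a single function can be illustrated with a toy
Python model. This is an analogy only: in MarkLogic Server, amps are set up in
the security configuration rather than written as decorators, and the
<literal>account-admin</literal> role, the <function>amped</function>
decorator, and <function>delete_user</function> are hypothetical names
invented for the sketch.</para>

<programlisting language="python"><![CDATA[import functools

# Roles of the current user; a module-level list stands in for the request.
current_roles = ["weblog-reader"]
ACCOUNTS = []


def amped(*extra_roles):
    """Run the decorated function with extra roles, then restore the old set."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            saved = list(current_roles)
            current_roles.extend(r for r in extra_roles if r not in saved)
            try:
                return func(*args, **kwargs)
            finally:
                current_roles[:] = saved  # the extra roles never leak out
        return wrapper
    return decorator


def require(role):
    if role not in current_roles:
        raise PermissionError("missing role: " + role)


@amped("account-admin")  # hypothetical role name
def create_user(name):
    require("account-admin")
    ACCOUNTS.append(name)
    return name


def delete_user(name):
    require("account-admin")  # not amped, so it fails for an ordinary user
    ACCOUNTS.remove(name)
]]></programlisting>

<para>The extra privilege exists only for the duration of the amped call,
which is what keeps the rest of the code on the read-only server
read-only.</para>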

<para xml:id="p17">If a user doesn't have a <filename>service.xml</filename>
document (and none of the users do since I haven't provided an admin
API for creating or editing one), they get a default one. The default
one has two collections, one ordinary collection and one media
collection.</para>
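
<para>For readers unfamiliar with RFC 5023 service documents, the Python
sketch below builds one with the same shape: a workspace holding an entry
collection and a media collection. The element names and namespaces are those
of RFC 5023 and Atom; the URIs, titles, and the <literal>*/*</literal> accept
value are assumptions and do not necessarily match what this server
returns.</para>

<programlisting language="python"><![CDATA[import xml.etree.ElementTree as ET

APP = "http://www.w3.org/2007/app"
ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("app", APP)
ET.register_namespace("atom", ATOM)


def default_service(base, username):
    """Build a service document with one entry and one media collection."""
    service = ET.Element("{%s}service" % APP)
    workspace = ET.SubElement(service, "{%s}workspace" % APP)
    ET.SubElement(workspace, "{%s}title" % ATOM).text = username + "'s weblog"
    # Hypothetical paths, titles, and accepted media ranges.
    collections = [
        ("entries", "Entries", "application/atom+xml;type=entry"),
        ("media", "Media", "*/*"),
    ]
    for path, title, accept in collections:
        coll = ET.SubElement(
            workspace,
            "{%s}collection" % APP,
            href="%s/%s/%s" % (base, username, path),
        )
        ET.SubElement(coll, "{%s}title" % ATOM).text = title
        ET.SubElement(coll, "{%s}accept" % APP).text = accept
    return ET.tostring(service, encoding="unicode")
]]></programlisting>

<para>A client reads the <literal>href</literal> of each collection from this
document to learn where to POST new entries or media.</para>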
</section>
</section>

<section xml:id="tryitout">
<title>Give it a whirl!</title>

<para xml:id="p18">My implementation passes <personname>
	<firstname>Joe</firstname>
<surname>Gregorio</surname>
      </personname>’s
<link xlink:href="http://bitworking.org/projects/apptestclient/">APP Test
Client</link> and
<personname>
	<firstname>Tim</firstname>
	<surname>Bray</surname>
      </personname>’s
<link xlink:href="http://www.tbray.org/ape/">Atom Protocol Exerciser</link>
so I think it's ready for real-world use.</para>

<para xml:id="p19">Feel free to give it a try
<link xlink:href="http://microwave.homedns.org:8600/">on Microwave</link>.
Report any problems that you encounter, naturally. It's quite possible
that I've misinterpreted parts of
<link xlink:href="http://tools.ietf.org/html/rfc5023">RFC 5023</link>.
</para>

<para xml:id="p20">Fair warning: Microwave is a spare box in my house. It's bloody noisy,
so I keep it in the hallway and I turn it off at night so that it doesn't
keep me awake. It's usually online between about 7:00a and 10:00p EST, but
I'm not making any long-term promises about it.</para>
</section>

</essay>

