<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet href="/style/browser.xsl" type="text/xsl"?>
<essay xmlns="http://docbook.org/ns/docbook"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
       xmlns:dc='http://purl.org/dc/elements/1.1/'
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:gal='http://norman.walsh.name/rdf/gallery#'
       xml:lang="en"
       version='ipa'>
<info>
<title>WITW: NSDL</title>
<volumenum>8</volumenum>
<issuenum>40</issuenum>
<pubdate>2005-03-12T09:02:24-05:00</pubdate>
<date>$Date: 2005-06-22 06:14:24 -0400 (Wed, 22 Jun 2005) $</date>
<author><personname>
<firstname>Norman</firstname><surname>Walsh</surname>
</personname></author>
<copyright><year>2005</year><holder>Norman Walsh</holder></copyright>
<abstract>
<para>Norm's Service Description Language (staggeringly original name,
I know) is my experiment with a simpler web services description
language.
</para>
</abstract>
</info>

<epigraph>
<attribution><personname><firstname>Thomas</firstname>
<surname>Paine</surname></personname></attribution>
<para xml:id='p2'>Time makes more converts than reason.</para>
</epigraph>

<para xml:id='p1'>Back when <link xlink:href="/2005/02/24/wsdl">WSDL
defeated me</link>, I realized even in my defeat that some sort of
description language was necessary. It must be possible to describe
services so that compilers can build interfaces to them; that's the
only way to make them accessible to “ordinary programmers” who don't
care about web services for web services' sake.
</para>

<para xml:id='p3'>There seem to be two main requirements:</para>

<orderedlist>
<listitem>
<para xml:id='p4'>Make it possible for ordinary programmers to use web
services as transparently as they use other code libraries.</para>
</listitem>
<listitem>
<para xml:id='p5'>Make it possible for ordinary service providers to
describe their interfaces in a standard way so that some level of
interoperability can be achieved.</para>
</listitem>
</orderedlist>

<para xml:id='p6'>A little web searching will reveal that I'm not the
first to have this idea. And there may be existing “off the shelf”
solutions that already satisfy those requirements. But <emphasis>Where
in the World</emphasis> isn't about the getting done, it's about the
doing. To that end, I decided to see if I could tackle the problem, if
I could not only describe a solution, but <emphasis>build</emphasis>
it too.</para>

<para xml:id='p7'>Going back to
<link xlink:href="/2005/02/16/witw-part-1">my roots</link>, I decided
that I'd attempt to describe services that are directly accessible via
GET or POST over HTTP. That means no fancy binding specifications,
abstract port descriptions, arbitrary intermediaries, or who knows
what else. I've got no hope of getting all that stuff right before
someone can explain why it's actually needed anyway. (I won't attempt
to dispute with any authority that it <emphasis>is</emphasis> needed,
but <emphasis>I</emphasis> don't need it and I don't understand
it.)</para>

<section xml:id="sketch">
<title>Sketching a service description</title>

<para xml:id='p8'>Although, in the modern style, Perl and Python
functions often take named parameters, I think positional parameters
are still the most natural to most programmers. For the HTTP GET case,
then, I think this reduces the problem to one of mapping positional
parameters on a function invocation to named parameters on an HTTP
URI.</para>

<para xml:id='p9'>The programmer's use of <code>user('ndw')</code> has
to be translated to an HTTP GET of
<literal>http://norman.walsh.name/2005/02/witw/is?userid=ndw</literal>
and then some part of the result has to be returned as a scalar value.
</para>

<para xml:id='p10'>Here's how I describe that in NSDL. First, the
service:</para>

<programlisting><![CDATA[
<service name="user"
	 action="get"
	 uri="http://norman.walsh.name/2005/02/witw/is?">]]></programlisting>

<para xml:id='p11'>The service defines a method named
<function>user</function>, is invoked with an HTTP GET, and has the
URI specified. Next, the parameters that this
service can have must be identified:</para>

<programlisting><![CDATA[  <request>
    <parameter name="userid" type="xsd:string"/>
    <parameter name="nearby" type="xsd:integer" default='0' optional="yes"/>
  </request>]]></programlisting>

<para xml:id='p12'>The positional parameters in the method invocation
get mapped to the list of parameters in the request block. In this
case, the first parameter is the value of
<parameter>userid</parameter>. The second, optional, parameter
is the value
of <parameter>nearby</parameter>. If it isn't specified, it will
default to 0.</para>
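
<para xml:id='p12a'>As a rough illustration of that mapping, here is a
minimal Perl sketch. It is not the code in
<package>NSDL::Request</package>: the
<function>build_query</function> and
<function>uri_escape_simple</function> helpers, and the list of hashes
standing in for the request block, are made up for this example. It
also assumes that an omitted optional parameter is sent with its
default value.</para>

<programlisting><![CDATA[use strict;
use warnings;

# Hypothetical stand-in for the <request> block: parameter definitions
# in declaration order, as they might be read from the description.
my @params = (
    { name => 'userid' },
    { name => 'nearby', optional => 1, default => 0 },
);

# Percent-encode everything outside the RFC 3986 unreserved set.
# (Good enough for ASCII values; a real client would encode UTF-8 bytes.)
sub uri_escape_simple {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Pair each positional argument with the parameter declared at the
# same position, falling back to the declared default when omitted.
sub build_query {
    my ($base, $defs, @args) = @_;
    die "too many arguments\n" if @args > @{$defs};

    my @pairs;
    for my $i (0 .. $#{$defs}) {
        my $def   = $defs->[$i];
        my $value = $i <= $#args ? $args[$i] : $def->{default};
        if (!defined $value) {
            next if $def->{optional};
            die "missing required parameter '$def->{name}'\n";
        }
        push @pairs, $def->{name} . '=' . uri_escape_simple($value);
    }
    return $base . join('&', @pairs);
}

print build_query('http://norman.walsh.name/2005/02/witw/is?',
                  \@params, 'ndw'), "\n";]]></programlisting>

<para xml:id='p12b'>Under those assumptions, the call at the end prints
<literal>http://norman.walsh.name/2005/02/witw/is?userid=ndw&amp;nearby=0</literal>.</para>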

<para xml:id='p13'>Finally, something has to be returned. That's
identified in the response:</para>

<programlisting><![CDATA[  <response>
    <result select="/is:is/is:user/is:name"/>
  </response>]]></programlisting>

<para xml:id='p14'>If all goes well, the value returned by this method
will be the value of that XPath expression given in the <tag
class='attribute'>select</tag> attribute as applied to the document
returned by the service.</para>

<para xml:id='p15'>But what if something goes wrong? What if the
service doesn't return the expected value? The <tag>response</tag> can
be augmented to look for errors:</para>

<programlisting><![CDATA[  <response>
    <fault name="baduserid" select="//is:unknown-user"/>
    <fault name="invalid" select="//is:invalid-request"/>
    <result select="/is:is/is:user/is:name"/>
  </response>]]></programlisting>

<para xml:id='p16'>Now the service will “fault” with a “baduserid” or
“invalid” code if either of those XPath expressions matches the
result. (Fault handling isn't the strongest suit of my implementation,
I admit.)</para>
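
<para xml:id='p16a'>To make the fault check concrete, here is a small
sketch using <package>XML::LibXML</package> and
<package>XML::LibXML::XPathContext</package>. The namespace URI bound
to the <code>is</code> prefix, the
<function>evaluate_response</function> function, the sample documents,
and the decision to report a fault with <function>die</function> are
all assumptions for illustration; the real
<package>NSDL::Response</package> may well do this differently.</para>

<programlisting><![CDATA[use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical namespace URI; the real binding of "is" is not shown here.
my $IS_NS = 'http://example.com/ns/is';

# The <fault> and <result> elements from the response block, as data.
my @faults = (
    { name => 'baduserid', select => '//is:unknown-user' },
    { name => 'invalid',   select => '//is:invalid-request' },
);
my $result = '/is:is/is:user/is:name';

sub evaluate_response {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs('is', $IS_NS);

    # Assumption: faults are tested first, in document order, and the
    # first expression that selects anything wins.
    for my $fault (@faults) {
        my @hits = $xpc->findnodes($fault->{select});
        die "fault: $fault->{name}\n" if @hits;
    }

    # No fault matched, so return the string value of the result.
    return $xpc->findvalue($result);
}

# Made-up sample responses, one good and one faulty.
my $ok  = qq{<is xmlns="$IS_NS"><user userid="ndw"><name>Norman Walsh</name></user></is>};
my $bad = qq{<is xmlns="$IS_NS"><unknown-user/></is>};

print evaluate_response($ok), "\n";

my $value = eval { evaluate_response($bad) };
print "caught $@" unless defined $value;]]></programlisting>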

</section>
<section xml:id="typing">
<title>Parameter typing</title>

<para xml:id='p17'>If you're observant and have a good memory, you may
have noticed two things about parameter types: first, that they're
defined using W3C XML Schema data types and second, that the type of
<parameter>nearby</parameter> is wrong. The lexical space of
<parameter>nearby</parameter> should be limited to exactly “0” or
“1”.</para>

<para xml:id='p18'>With respect to the first observation, you're
absolutely right. But I'm actually checking them with RELAX NG, which
can use the W3C XML Schema datatypes as a datatype library.
Partly, I admit, out of a desire to prove that RELAX NG is as
reasonable a validation technology for web services as any other. But
also partly because <systemitem class="library">libxml</systemitem>
provides a RELAX NG validator.
</para>

<para xml:id='p19'>You're absolutely right about the second
observation, too, but that can be fixed now. First, add a new section
to the service description file that defines the additional
type<footnote><para xml:id='p20'>Yes, “type” is a misnomer. It'd more
properly be called a “pattern” in RELAX NG parlance. Humor me,
ok?</para></footnote>:</para>

<programlisting><![CDATA[<types xmlns:rng="http://relaxng.org/ns/structure/1.0">
  <rng:define name="DigitBoolean">
    <rng:choice>
      <rng:value>0</rng:value>
      <rng:value>1</rng:value>
    </rng:choice>
  </rng:define>
</types>]]></programlisting>

<para xml:id='p21'>Then change the type of the request parameter:</para>

<programlisting><![CDATA[  <request>
    <parameter name="userid" type="xsd:string"/>
    <parameter name="nearby" type="DigitBoolean" default='0' optional="yes"/>
  </request>]]></programlisting>

<para xml:id='p22'>Now the values are properly constrained. This is
probably a good place to note that I could have added type checking to
the results as well. It'd be pretty straightforward to add a type
attribute and check the results using the same technique I'm using to
check the parameters, but I didn't bother. I wouldn't learn anything
new from the exercise.</para>
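
<para xml:id='p22a'>For the curious, checking a parameter value against
a pattern like <literal>DigitBoolean</literal> might look something
like the sketch below. It uses the RELAX NG validator in
<package>XML::LibXML</package>. Wrapping the value in a throwaway
<tag>value</tag> element so that there is a document to validate, and
the <function>is_valid</function> helper itself, are assumptions made
for this example rather than a description of the actual
implementation.</para>

<programlisting><![CDATA[use strict;
use warnings;
use XML::LibXML;

# A tiny grammar: one wrapper element whose content must match the
# DigitBoolean pattern from the <types> section.
my $grammar = <<'RNG';
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="value"><ref name="DigitBoolean"/></element>
  </start>
  <define name="DigitBoolean">
    <choice>
      <value>0</value>
      <value>1</value>
    </choice>
  </define>
</grammar>
RNG

my $rng = XML::LibXML::RelaxNG->new(string => $grammar);

# Build <value>...</value> around the candidate and validate it.
# validate() dies on failure, so the eval turns that into true/false.
sub is_valid {
    my ($candidate) = @_;
    my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
    my $root = $doc->createElement('value');
    $root->appendTextNode($candidate);
    $doc->setDocumentElement($root);
    return eval { $rng->validate($doc); 1 } ? 1 : 0;
}

for my $candidate (qw(0 1 2 yes)) {
    printf "%-3s %s\n", $candidate, is_valid($candidate) ? 'ok' : 'rejected';
}]]></programlisting>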
</section>

<section xml:id="multresult">
<title>Multiple results</title>

<para xml:id='p23'>Sometimes it's convenient for a single web service invocation to
return multiple results. The same GET that will return the user
name from WITW also returns the latitude, longitude, date, mailbox,
and a host of other information. Rather than requiring that the
service provider decompose the service into individual methods,
a service can return multiple results:</para>

<programlisting><![CDATA[  <response>
    <fault name="baduserid" select="//is:unknown-user"/>
    <fault name="invalid" select="//is:invalid-request"/>

    <result name="name" select="/is:is/is:user/is:name"/>
    <result name="userid" select="/is:is/is:user/@userid"/>
    <result name="uri" select="/is:is/is:user/is:uri"/>
    <result xmlns:foaf="http://xmlns.com/foaf/0.1/"
	    name="mailbox" select="/is:is/is:user/foaf:mbox_sha1sum"/>
    <result name="lat" select="/is:is/is:locations/is:location/@lat"/>
    <result name="long" select="/is:is/is:locations/is:location/@long"/>
    <result name="date" select="/is:is/is:locations/is:location/@date"/>
  </response>]]></programlisting>

<para xml:id='p24'>That doesn't actually tell the implementation how
to provide access to those results, but that's going to have to vary
on a per-implementation-language basis anyway. For my implementation,
I'm going to return a “response object” that will have access methods
for those named results.</para>
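
<para xml:id='p24a'>One way to get that kind of “response object” in
Perl is with <function>AUTOLOAD</function>, so that every named result
becomes a method without any per-service code. The
<package>Sketch::Response</package> class below is hypothetical and
only illustrates the idea; it is not
<package>NSDL::Response</package>, and the values passed to it are
simply copied from the sample output shown elsewhere in this
essay.</para>

<programlisting><![CDATA[package Sketch::Response;
use strict;
use warnings;

our $AUTOLOAD;

# Store the named results (already extracted from the XML) in a hash.
sub new {
    my ($class, %results) = @_;
    return bless { results => {%results} }, $class;
}

# Any unknown method call is treated as a request for the result of
# that name; asking for a result that was not declared is an error.
sub AUTOLOAD {
    my ($self) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    die "no result named '$name'\n"
        unless exists $self->{results}{$name};
    return $self->{results}{$name};
}

# Defined explicitly so object destruction never reaches AUTOLOAD.
sub DESTROY { }

package main;

my $res = Sketch::Response->new(
    name => 'Norman Walsh',
    lat  => '42.3382',
    long => '-72.4500',
);

print $res->name(), " at (", $res->lat(), ", ", $res->long(), ")\n";]]></programlisting>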

<para xml:id='p25'>Speaking of multiple results, what should we do
about XPath expressions that select multiple nodes? Suppose, for
example, that we wanted to return all the landmarks?</para>

<para xml:id='p26'>I thought about this and decided to punt a bit.
First, it seems to me that even though we're hiding the web services
aspect of this library, we don't need to make it impossible to access.
So if you need to get the XML, to extract complex results, that should
be possible. Then for multiple nodes, I decided that the easiest thing
to do was return an array of results, with each result being the
string value of the selected node. It's not perfect, but it'll do for
now, for dynamic languages like Perl anyway. For statically typed
languages, I think a different approach would be required.</para>
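
<para xml:id='p26a'>In Perl terms, that convention might be sketched as
follows: a single selected node comes back as a plain string, and
several come back as a reference to an array of strings, which the
caller can tell apart with <function>ref</function>. The
<function>shape_result</function> helper is hypothetical, and the
single-node behavior is inferred from the way the Amazon example in
this essay tests <code>ref $titles</code>.</para>

<programlisting><![CDATA[use strict;
use warnings;

# Takes the string values of the selected nodes and decides what the
# caller sees: nothing, one string, or a reference to an array.
sub shape_result {
    my @values = @_;
    return undef      if !@values;       # always hand back one value
    return $values[0] if @values == 1;
    return [@values];
}

for my $titles (shape_result('Making TeX Work'),
                shape_result('DocBook: The Definitive Guide',
                             'Making TeX Work')) {
    if (ref $titles) {
        print "several: ", join('; ', @{$titles}), "\n";
    } else {
        print "single: $titles\n";
    }
}]]></programlisting>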

</section>
<section xml:id="post">
<title>What about POST?</title>

<para xml:id='p27'>So far, all the examples use GET, which just uses
URL-encoded parameters. What about supporting POST, where there will
need to be some sort of message body? To do that, I added a
<tag>body</tag> element to the request. Here's the request block for
the “<link xlink:href="/2005/02/16/witw-part-1#p6">where am I now</link>”
service that uses POST to update my position:</para>

<programlisting><![CDATA[  <request>
    <parameter name="lat" type="Latitude"/>
    <parameter name="long" type="Longitude"/>

    <body>
      <location xmlns="http://nwalsh.com/xmlns/witw-post#">
	<latlong>
	  <lat>{$lat}</lat>
	  <long>{$long}</long>
	</latlong>
      </location>
    </body>
  </request>]]></programlisting>

<para xml:id='p28'>As you can probably guess, the content of the
<tag>body</tag> is sent in the POST, subject to an XSLT- or Ant-style
“value template” expansion.</para>
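
<para xml:id='p28a'>A minimal sketch of that kind of expansion, working
on the serialized body as a string, could look like this. The
<function>expand_template</function>, <function>lookup</function>, and
<function>xml_escape</function> helpers are invented for the example,
as is the choice to escape substituted values and to treat a missing
parameter as an error. The template is a trimmed copy of the body
shown above.</para>

<programlisting><![CDATA[use strict;
use warnings;

# Escape the characters that are significant in XML element content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Fetch the value for one template variable, or complain.
sub lookup {
    my ($values, $name) = @_;
    die "no value for parameter '$name'\n"
        unless defined $values->{$name};
    return xml_escape($values->{$name});
}

# Replace each {$name} in the template with the escaped value of the
# request parameter of that name.
sub expand_template {
    my ($template, %values) = @_;
    $template =~ s/\{\$(\w+)\}/lookup(\%values, $1)/ge;
    return $template;
}

my $body = <<'XML';
<location xmlns="http://nwalsh.com/xmlns/witw-post#">
  <latlong>
    <lat>{$lat}</lat>
    <long>{$long}</long>
  </latlong>
</location>
XML

print expand_template($body, lat => '42.3382', long => '-72.4500');]]></programlisting>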

<para xml:id='p42'>A complete RELAX NG grammar
<link xlink:href="examples/nsdl.rnc">for NSDL</link> is available.</para>
</section>

<section xml:id="code">
<title>Show me the code</title>

<para xml:id='p29'>Service description, parameters, results, XML,
blah, blah, blah. Show me the code! Fair enough. My implementation is
in Perl and consists of three modules,
<link xlink:href="examples/Request.pm">NSDL::Request</link>,
<link xlink:href="examples/Response.pm">NSDL::Response</link>, and
<link xlink:href="examples/UA.pm">NSDL::UA</link> (for authentication).
</para>

<para xml:id='p30'>Here's <link xlink:href="examples/witw-user.pl">a
program</link> that uses <link xlink:href="examples/witw.nsd">the
service description</link> outlined above to print the name of any
user from WITW:</para>

<programlisting><![CDATA[#!/usr/bin/perl -w -- # -*- Perl -*-

use NSDL::Request;

my $userid = shift @ARGV
    || die "Usage: $0 userid\n";

my $req = new NSDL::Request();
$req->load('witw.nsd');

my $res = $req->user($userid);
print "$userid is $res\n";]]></programlisting>

<para xml:id='p31'>I think that satisfies the first requirement. With a little
code generation, I could simplify it further, removing the call to
“<function>load</function>” and making a class specifically for the
WITW services, but I'm not going to bother.</para>

<para xml:id='p32'>Taking advantage of the service description that returns
multiple results, it can be written
<link xlink:href="examples/witw-user2.pl">this way</link>:</para>

<programlisting><![CDATA[#!/usr/bin/perl -w -- # -*- Perl -*-

use NSDL::Request;

my $userid = shift @ARGV
    || die "Usage: $0 userid\n";

my $req = new NSDL::Request();
$req->load('witw.nsd');

my $res = $req->user($userid);
print "$userid is ", $res->name();
print " (", $res->mailbox(), ").\n";
print "Last seen on ", $res->date(), "\n";
print "at (";
print $res->lat(), ", ", $res->long();
print ")\n";]]></programlisting>

<para xml:id='p33'>Which produces results like this:</para>

<screen>ndw is Norman Walsh (9f5c771a25733700b2f96af4f8e6f35c9b0ad327).
Last seen on 2005-03-09T14:23:41Z
at (42.3382, -72.4500)</screen>

<para xml:id='p34'>Updating my location is just
<link xlink:href="examples/witw-user3.pl">as easy</link>:</para>

<programlisting><![CDATA[#!/usr/bin/perl -w -- # -*- Perl -*-

use NSDL::Request;

my $userid = shift @ARGV;
my $passwd = shift @ARGV;
my $lat = shift @ARGV;
my $long = shift @ARGV;

my $req = new NSDL::Request();
$req->load('witw.nsd');

$req->auth($userid, $passwd);
my $res = $req->ami($lat, $long);]]></programlisting>

<para xml:id='p35'>Though in this case I have to provide authentication information
so that the POST will succeed (and I haven't bothered with any error
checking).</para>
</section>

<section xml:id="implementation">
<title>Implementation Details</title>

<para xml:id='p36'>In the course of building the implementation, I've
tried to make it as self-contained and portable as possible. I found
that the Perl interfaces to <systemitem
class="library">libxml</systemitem>, specifically
<package>XML::LibXML</package> and
<package>XML::LibXML::XPathContext</package> provided almost
everything I needed. The only other external dependencies are on
<package>LWP::UserAgent</package> for HTTP support and
<package>IO::Scalar</package> for some lazy string construction with
print statements.</para>

<para xml:id='p37'>As an aside, I'm particularly impressed with the
<package>XML::LibXML</package> family of packages. They're likely to become
my new standards for working with XML in Perl. You get DOM, RELAX NG
validation, and XPath support all in one. Nice work!</para>
</section>

<section xml:id="amazon">
<title>One More Example</title>

<para xml:id='p38'>Yeah, yeah, all well and good, you can write simple programs to
access a toy web service. What about the real world? Ok, how about
using NSDL to access
<link xlink:href="http://www.amazon.com/">Amazon</link>?</para>

<para xml:id='p39'>With
<link xlink:href="examples/amazon.nsd">an appropriate description</link>,
we can write
<link xlink:href="examples/amazon.pl">a short program</link> to
access Amazon books by author:</para>

<programlisting><![CDATA[#!/usr/bin/perl -w -- # -*- Perl -*-

use NSDL::Request;

my $usage = "$0 amazonid author\n";

my $amazonid = shift @ARGV || die $usage;
my $author = shift @ARGV || die $usage;

my $req = new NSDL::Request();
$req->load('amazon.nsd');

my $res = $req->booksbyauthor($amazonid, $author);

printf "Amazon query returned %d results in %1.2fs:\n",
    $res->count(), $res->time();

my $titles = $res->titles();
if (ref $titles) {
    my $count = 1;
    foreach my $title (@{$titles}) {
	print "\t$count. $title\n";
	$count++;
    }
} else {
    print "\t$titles\n";
}]]></programlisting>

<para xml:id='p40'>If you ask for books by Norman Walsh today, you get:</para>

<screen>Amazon query returned 5 results in 0.07s:
        1. DocBook: The Definitive Guide (O'Reilly XML)
        2. Forensic Nursing and Mental Disorder in Clinical Practice
        3. Agent-Mediated Electronic Commerce IV. Designing Mechanisms and Systems : AAMAS 2002 Workshop on Agent Mediated Electronic Commerce, Bologna, Italy, J ... e / Lecture Notes in Artificial Intelligence)
        4. Docbook la reference
        5. Making TeX Work (A Nutshell Handbook)</screen>

<para xml:id='p41'>There. (And three out of five ain't bad, I don't think.)
I'm not going to think too hard about the fact that this search turns up
DocBook, electronic commerce, and mental disorder.</para>

<para xml:id='p43'>I've satisfied my own curiosity about a simpler web
services description language. And the implementation, though
definitely no more robust than a “proof of concept”, wasn't that hard
to cook up. Pointers to where I've gone totally off the rails are most
welcome.</para>
</section>

</essay>
