Is RDF/XML Good for Anything?

Volume 7, Issue 136; 30 Jul 2004; last modified 08 Oct 2010

Having a standard transfer syntax for RDF is great. XML is an ideal format for this sort of “core dump”: it’s amenable to machine processing and it’s possible for a human being (with sufficient skill, experience, and dedication) to look at it in a text editor and “figure it out”. So RDF/XML is good for RDF core dumps. But is it something users should be writing by hand? I’m not sure.

The title, I admit, is inflammatory, but I’m serious about the question. I’ve been thinking about it ever since I read Dorothea Salo’s double-barrelled response to Stefano Mazzocchi’s guide to semantic web specs.

Dorothea’s rant is both fun to read and absolutely right on.

Chances are, you already know this, but I’ll say it again anyway: XML is designed to describe trees. Nested markup does this with reasonable efficiency in exactly one way. (“Exactly” is an overstatement; there’s some variation because attributes are unordered and there’s a small amount of variability in the syntax, but there’s essentially one XML document that represents any given tree.) RDF is a collection of (subject, predicate, value) triples that generally speaking form a graph. RDF/XML is a transfer syntax for graphs. There are lots of ways to “flatten” a graph into a tree. There will always be significant variation in the possible RDF/XML serializations of an RDF graph.
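To make the flattening point concrete, here is a minimal sketch in Python. It is not an RDF/XML serializer; the toy graph, the ex: names, and the helper functions flat, nested, and to_triples are all made up for illustration. It only shows that one set of triples can be laid out as two differently shaped trees that both flatten back to the same graph.

```python
# Toy graph as a set of (subject, predicate, value) triples.
# All names here (ex:project, ex:norm, ...) are invented for illustration.
TRIPLES = {
    ("ex:project", "doap:name", '"Example"'),
    ("ex:project", "doap:maintainer", "ex:norm"),
    ("ex:norm", "foaf:name", '"Norm"'),
}


def _by_subject(triples):
    """Group (predicate, value) pairs under their subject."""
    grouped = {}
    for s, p, o in sorted(triples):
        grouped.setdefault(s, []).append((p, o))
    return grouped


def flat(triples):
    """One top-level tree per subject; values are plain references."""
    return [(s, props) for s, props in _by_subject(triples).items()]


def nested(triples, root):
    """One tree rooted at `root`; described values are nested in place."""
    grouped = _by_subject(triples)

    def build(node, seen):
        props = []
        for p, o in grouped.get(node, []):
            if o in grouped and o not in seen:
                # Nest the description of `o` inside the property.
                props.append((p, build(o, seen | {o})))
            else:
                # Literal, undescribed resource, or a cycle: keep a reference.
                props.append((p, o))
        return (node, props)

    return [build(root, {root})]


def to_triples(trees):
    """Walk any of the tree shapes above and recover the triples."""
    out = set()

    def walk(tree):
        s, props = tree
        for p, o in props:
            if isinstance(o, tuple):
                out.add((s, p, o[0]))
                walk(o)
            else:
                out.add((s, p, o))

    for tree in trees:
        walk(tree)
    return out


a = flat(TRIPLES)
b = nested(TRIPLES, "ex:project")
print(a != b)                                   # the trees differ in shape
print(to_triples(a) == to_triples(b) == TRIPLES)  # the graph is the same
```

Real RDF/XML has many more degrees of freedom than these two shapes (typed node elements, property attributes, and so on), which only makes the variation larger.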

Having a standard transfer syntax is great. The fact that the PSVI doesn’t have one, for example, is a common source of irritation. XML is a great transfer syntax for RDF. XML is an ideal format for this sort of “core dump”: it’s amenable to machine processing and it’s possible for a human being (with sufficient skill, experience, and dedication) to look at it in a text editor and “figure it out”.

So RDF/XML is good for RDF core dumps.

But is it something users should be writing by hand? I’m not sure. And with impeccable timing, Edd Dumbill enters the scene at this point and announces DOAP.

DOAP is an RDF vocabulary for describing metadata about projects. I have lots of projects, and maintaining the metadata about them (web pages, syndication feeds, freshmeat announcements, CVS tags, email announcements, etc.) is tedious and error-prone. Having a standard way to represent this data is a fabulous idea.

So how should we store this metadata?

In RDF/XML

One way would be to store the data directly in RDF/XML or some other RDF transfer syntax. That’s (too) flexible and hard to validate. Besides, I’m already suspicious that RDF/XML is for core dumps.

In a DOAP XML Format

Edd has taken a stab at making DOAP more palatable to the XML crowd by providing a RELAX NG grammar for DOAP files. That’s cool. Now he’s got an XML format that just happens to be isomorphic to one possible RDF/XML serialization. Does that really count? Yes, I think it does. How can I argue that it doesn’t? It’s an XML format that I can edit with my normal RELAX NG-aware editors.
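As a rough illustration of what “an XML format that happens to be RDF/XML” buys you, the sketch below reads a small DOAP-style document with nothing but Python’s standard XML library, with no RDF tooling involved. The sample document is my own invention and is not taken from Edd’s schema; the function name read_project is hypothetical. Only the two namespace URIs are the real DOAP and RDF ones.

```python
import xml.etree.ElementTree as ET

DOAP = "http://usefulinc.com/ns/doap#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A made-up project description. It is ordinary XML with a fixed shape,
# and (by assumption) also one possible RDF/XML serialization.
DOC = f"""<Project xmlns="{DOAP}" xmlns:rdf="{RDF}">
  <name>Example</name>
  <homepage rdf:resource="http://example.org/"/>
</Project>"""


def read_project(xml_text):
    """Read the document as a plain XML tree, relying on its fixed shape."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{DOAP}}}Project":
        raise ValueError("expected a doap:Project root element")
    homepage = root.find(f"{{{DOAP}}}homepage")
    return {
        "name": root.findtext(f"{{{DOAP}}}name"),
        "homepage": None if homepage is None
        else homepage.get(f"{{{RDF}}}resource"),
    }


print(read_project(DOC))
```

This only works because the grammar pins down a single serialization; an equally valid RDF/XML document with a different shape would defeat the simple lookups above.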

But I’m leaning towards making “announcement” essays for my projects, so having this information in a separate file doesn’t seem right. Duplication of information is bad.

As Metadata in an Essay

My next idea was to make the DOAP vocabulary a metadata vocabulary to put in the essay’s info element, just like I currently allow Dublin Core terms in there.

I implemented that, and it worked, but it didn’t really solve the duplication of information problem.

As Data in an Essay

At this point, I realized that I was going about this backwards. An RDF-primary focus isn’t a very XML approach to the problem. What I should do is put the information in the body of the essay with enough markup to identify it.

I implemented that too. I ended up with a bunch of role attributes on phrases and links to achieve it. It’s not the most attractive markup, but I haven’t thought deeply yet about what the right markup is. (One interesting exception is the doap:license element, which I left in the info. I can’t think of an element that has the right semantics: preserve the link, but don’t display the URI or make the link “hot”.)

One thing that did occur to me, and that I think will occur to a lot of you, is putting the DOAP markup right in the essay. Instead of saying “<phrase role="doap.name">name</phrase>”, say “<doap:name>name</doap:name>”. I see three problems with that approach: first, it doesn’t work for all the DOAP structures because some of them are nested; second, I’d have to define the prose processing expectations for all those elements; and third, everyone looking at my XML would have to understand all those elements. Using standard elements with roles makes document interchange easier.
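Here is a minimal sketch of how the role-attribute approach could be harvested, assuming a simplified essay format. The essay element, the ulink element with its url attribute, the role value doap.homepage, the “#project” subject, and the function extract_doap are all assumptions made for illustration; only the phrase element with role="doap.name" comes from the markup described above. It is not the code I actually use.

```python
import xml.etree.ElementTree as ET

# A made-up essay fragment: ordinary prose elements carrying DOAP roles.
ESSAY = """<essay>
  <para>Today I am announcing
    <phrase role="doap.name">Example</phrase>, available from
    <ulink role="doap.homepage" url="http://example.org/">its home page</ulink>.
  </para>
</essay>"""


def extract_doap(xml_text, subject="#project"):
    """Collect (subject, predicate, value) triples from role attributes."""
    root = ET.fromstring(xml_text)
    triples = []
    for el in root.iter():  # document order
        role = el.get("role", "")
        if not role.startswith("doap."):
            continue
        predicate = "doap:" + role[len("doap."):]
        # Links contribute their target; phrases contribute their text.
        value = el.get("url") or "".join(el.itertext()).strip()
        triples.append((subject, predicate, value))
    return triples


for triple in extract_doap(ESSAY):
    print(triple)
```

The prose stays the primary copy of the information, and the triples are derived from it, which is what avoids the duplication problem.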

I’m not done thinking about this issue yet, but this little case study supports a direction that’s starting to feel right to me: RDF is a good tool for aggregating and analyzing data, but it’s not the right tool for creating or maintaining information. In a sense, (some of) the RDF community are already leaning this way too, with proposals like GRDDL being developed to define standard ways for extracting RDF from data that’s richly marked but not directly encoded in RDF/XML.

But for the record, the fact that I have to embed RDF/XML in comments in XHTML still sucks.

Comments

RSS 1.0 also has a restricted RDF/XML serialization; more due to backward compatibility with RSS 0.9 than for readability, but it was also noted at the time that it improved readability and non-RDF processing.

—Posted by Ken MacLeod on 31 Jul 2004 @ 12:08 UTC #

Some more thoughts at http://jtauber.com/blog/2004/08/06/more_on_xml_and_rdf

—Posted by James Tauber on 06 Aug 2004 @ 01:10 UTC #

I hate to say it, but I have to admit that Norman has a point here. (But let nobody touch my precious RDF >:@) No matter how much I like RDF (and OWL in particular) to add semantics to the things I produce, it is definitely a nightmare to write it by hand.

In my attempt to get my head around it, I stumbled into WEESA. Definitely worth reading:

http://server2.tecweb.inf.puc-rio.br/we-sw-2004/www2004-weesa.pdf

—Posted by Wilfred Springer on 04 Nov 2004 @ 01:40 UTC #

Hey guys, I'm getting a bit off topic here, but I've got a little problem :). I've translated the W3C spec concerning RDF into Polish and I would like to get feedback from someone who speaks Polish on the accuracy of the translation. Click on the Polish version to view it. thx andy

—Posted by Andy Angielski on 02 Sep 2005 @ 09:05 UTC #