Feeds

Volume 9, Issue 17; 14 Feb 2006; last modified 08 Oct 2010

My plan to remove RSS feeds caused some consternation in the community. In addition to pointing out some places where RSS is still needed, a workaround was proposed. So, before I pull the plug, let's see if the workaround will…work. [Update: no, it won't.]

Try as hard as we may for perfection, the net result of our labors is an amazing variety of imperfectness. We are surprised at our own versatility in being able to fail in so many different ways.

—Samuel McChord Crothers

The problem I have with RSS is that it requires escaping markup. And I won't. I've avoided the issue by (almost) never putting any markup in essay abstracts or titles. If there's no markup, there's nothing to escape. But every now and then I want to use markup in the abstract or title and being constrained by RSS irritates me. That was my motivation for pulling the plug.

In response to comments about my plans, I've made three changes. In decreasing order of significance:

One suggestion is to use rdf:parseType="Literal". There's some concern that the RSS 1.0 specification doesn't allow this. I dunno. The spec says “RSS is an XML application, conforming to the W3C's RDF Specification. RSS is extensible via XML-namespace and/or RDF based modularization.”

Taking that as justification, I've adjusted the stylesheet that generates RSS so that if a title or description contains markup, it is identified as XML content in the RDF.

Amazingly (considering all their parsing bugs) Bloglines gets this right. The Feed Validator is not quite so sanguine, but whether that's my bug or theirs is an open question I think. (Why the validator complains about my namespace declaration for the RSS namespace is a little baffling, but that's something else entirely.)

I've adjusted the stylesheet that generates Atom so that the full text feed includes both the summary and the content.

I've fixed the bug where my Atom feeds didn't include a rel="self" link.
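
To make the first change concrete, here is a minimal sketch in Python. It is not the stylesheet itself, and the function name make_text_element is invented for illustration. It only shows the rule described above: plain text goes out untouched, while a title or description that contains child elements is flagged with rdf:parseType="Literal".

```python
import xml.etree.ElementTree as ET

RSS_NS = "http://purl.org/rss/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def make_text_element(tag, fragment):
    """Build an RSS 1.0 <title> or <description> from an XML fragment.

    Hypothetical helper: `fragment` is assumed to be well-formed XML
    content, such as 'Drop the <code>DOCTYPE</code>'.
    """
    # Wrap the fragment so it parses as a single well-formed document.
    wrapper = ET.fromstring(f"<wrapper>{fragment}</wrapper>")

    elem = ET.Element(f"{{{RSS_NS}}}{tag}")
    elem.text = wrapper.text

    # Child elements mean real markup: mark the content as an XML literal
    # and copy the children (with their trailing text) across.
    if len(wrapper):
        elem.set(f"{{{RDF_NS}}}parseType", "Literal")
        elem.extend(list(wrapper))
    return elem


if __name__ == "__main__":
    title = make_text_element("title", "Drop the <code>DOCTYPE</code> now")
    print(ET.tostring(title, encoding="unicode"))
```

A title with no markup comes back as an ordinary text-only element, so feeds for essays without markup in the title or abstract are unchanged.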

This essay serves as a kind of test case. The abstract contains a couple of examples of markup. I hope that even tools which don't understand rdf:parseType will fall back gracefully. I hope.

If this solution goes over well, I'll leave the RSS feeds alone. (It's not actually hard to not delete them.) If not…I dunno. I could still pull the plug, but I suppose I could arrange to discard the markup too. Time will tell.

[Update: 15 Feb 2006: As Ken points out in the comment below, title and description are defined as text-only, so the rdf:parseType solution doesn't work. Well, at least, markup isn't allowed. Whether or not rdf:parseType="Literal" should allow “<”, “>”, and “&” to appear without double-escaping them may still be an open question, but I give up. And I renege. I won't drop RSS. Instead, I've adjusted the stylesheet that generates RSS so that it replaces “<”, “>”, and “&” with “✕” and discards other markup. Yeah, that makes for titles like “Drop the ✕!DOCTYPE✕” in RSS; if you don't like that, use the Atom feeds.

For the record, the list of Atom and RSS feeds includes a “what's new” feed (Atom, Atom (full-text), and RSS), an “everything” feed (Atom, RSS), and a fairly haphazard selection of topic-oriented feeds. Most, but not quite all, are available in either Atom or RSS. (The comment and Subversion log feeds are Atom-only.)]
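
The fallback described in the update can be sketched roughly as follows. This is Python rather than the actual stylesheet, the function name flatten_for_rss is made up, and it assumes that "discards other markup" means element tags are dropped while the text inside them is kept.

```python
import xml.etree.ElementTree as ET


def flatten_for_rss(fragment):
    """Reduce an XML fragment to plain text that needs no escaping.

    Hypothetical helper: `fragment` is assumed to be well-formed XML
    content, such as 'Drop the &lt;!DOCTYPE&gt;'.
    """
    # Wrap the fragment so it parses as a single well-formed document.
    wrapper = ET.fromstring(f"<wrapper>{fragment}</wrapper>")

    # itertext() yields only character data, so element tags vanish here
    # (assumption: their text content is kept).
    text = "".join(wrapper.itertext())

    # Swap the three characters that would need escaping for a stand-in.
    for ch in "<>&":
        text = text.replace(ch, "✕")
    return text


if __name__ == "__main__":
    print(flatten_for_rss("Drop the &lt;!DOCTYPE&gt;"))
```

Because the result contains no "<", ">", or "&", it can be written into a text-only title or description without any escaping at all, which sidesteps the double-escaping question entirely.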

Comments

Hi Norm, in RSS/RDF 1.0 the content model for <title> and <description> is (#PCDATA), which is more tightly constrained than the general "is XML and RDF" statement.

Unfortunately the portion of the RSS/RDF Content module that does support rdf:parseType="Literal" is pretty well botched and I'm not aware of any clients that support it.

Since RSS/RDF text elements are plain text, you should let those character entities go out as-is. Newsreaders that get it wrong are not spec compliant.