Son-of-RSS Grammar

Volume 6, Issue 57; 10 Jul 2003

Another RELAX NG Grammar for the format sometimes referred to as Pie or Echo. And some more thoughts about Son-of-RSS.

Yesterday, Tim asked me to take a look at the RELAX NG grammar that he put together for Pie/Not-Echo/Son-of-RSS. Before looking at his, I decided to follow the link to Sam's NECHO 0.1 snapshot and take an independent crack at the problem.

I think it's a good sign that my schema and Tim's schema are similar. They differ a bit stylistically, but the technical discrepancies are few (and perhaps consensus is leaning towards the choices Tim made; I haven't been following the discussion in the past couple of weeks):

  • Tim made order insignificant in feed and entry. I thought about doing that but decided to impose one. It's marginally simpler to write applications to consume the feed if the order is fixed, and it doesn't seem a hardship on authors in this case. And besides, allowing the feed title to appear halfway through a feed, between two entries, seems wrong.

  • I decided to allow authors and contributors to have any number of homepage and weblog elements. (These are the sorts of decisions that need to be nailed down solidly in the specification.)

  • On the subject of contributors, Tim named the pattern he used to define the content model of author and contributor “Person”. Are we really saying that organizational authorship is forbidden? Suppose I want to point to a document authored by “Example, Inc.”?

  • Tim imposed some additional constraints on the content element that aren't justified by Sam's description or his example. (I'm going to suggest some radical constraints below, so I don't think Tim's wrong, I just took a more literal approach to the problem.)

  • I decided to allow xml:lang (and xml:base) anywhere.

  • My grammar allows foreign-namespace content to be mixed into the feed or the content.

  • My grammar allows foreign-namespace attributes everywhere.

  • Tim put some constraints on the version attribute and put the vocabulary in a reasonable namespace. I didn't bother.
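
A minimal Python sketch of what two of these choices mean for a consumer: foreign-namespace content mixed into the feed, and a fixed order for the core elements. The sample document, the `ext` namespace URI, and the helper names are made up for illustration; only the element names feed, entry, and title come from the discussion above, and placing title first is an assumption about the fixed order, not something the grammar is known to require.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. The core vocabulary is left without a namespace,
# and the "ext" namespace URI is invented for this example.
SAMPLE = """<feed xmlns:ext="http://example.com/ext">
  <title>Example feed</title>
  <ext:mood>cheerful</ext:mood>
  <entry><title>First</title></entry>
  <entry><title>Second</title><ext:rating>5</ext:rating></entry>
</feed>"""


def is_foreign(element):
    """ElementTree writes namespaced tags as '{uri}local'; core tags here have no namespace."""
    return element.tag.startswith("{")


def core_children(element):
    """Return the child elements of the core vocabulary, skipping foreign-namespace ones."""
    return [child for child in element if not is_foreign(child)]


feed = ET.fromstring(SAMPLE)

# With a fixed order (assumed here: title before any entry), the core children
# arrive in a predictable sequence once foreign content is skipped.
feed_children = [child.tag for child in core_children(feed)]

# Name-based lookup works whether or not the order is fixed.
entry_titles = [entry.findtext("title") for entry in feed.iter("entry")]

print(feed_children)
print(entry_titles)
```

A consumer written this way never has to understand the extensions it skips, and under a fixed order it can rely on the feed title having been seen before the first entry arrives.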

I haven't bothered to run my grammar through Trang to produce the W3C XML Schemas and/or DTDs that might be desirable. Feel free.

What's Wrong with NECHO 0.1?

Herewith, my two cents about what needs to change before 1.0. I think this might qualify as ranting again.

  • Lose this nonsense about “escaped” content. I'm serious. That is just broken! Whether you use CDATA sections or numeric character references or named entities is irrelevant. This is XML. You get a marked-up stream of Unicode characters for gosh sakes!

  • Actually, lose content altogether.

    My first draft of this essay had several bullets about how to improve content (allow only one, require XHTML, don't allow content by reference), but I had a sort of epiphany when I realized that entries had both summary and content.

    Aha, I realized, the idea here must be that some folks want to stuff the actual content into the feed instead of just pointing to it. (Perhaps I've been spectacularly clueless in not realizing that sooner. Wouldn't be the first time. Or the last.)

    I'm torn. I might be convinced that this is really a requirement, but not easily.

    I don't think I've ever seen a feed that works this way, so I'm inclined to say that if some folks really need it, they can do it in an extension namespace.

  • Rename summary to description. Or not. This is a pretty minor point, but given that Dublin Core has established the name “description”, why not use it?

  • Allow text or XHTML (but nothing else) in description.

  • Put link inside author and lose homepage and weblog. Trying to enumerate the kinds of links people will want to associate with authors is futile. “There are only three numbers in computer science: zero, one, and infinity.”

  • Lose subtitles.

  • Add description and publisher (with the same content model as author and contributor) to feed.

  • I'm not sure the distinction between author and contributor is worth preserving.

  • I'm not sure the distinction between link and id is a good idea. Maybe this is related to the idea that the feed would contain the content, but since I don't think that should be standard, I don't think both link and id need to be standard. Lose id.
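
The point about “escaped” content can be checked with any XML parser. The following is a small Python sketch using the standard library's ElementTree; the content element name comes from the snapshot discussed above, while the sample strings and the `text_of` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def text_of(xml_source):
    """Parse a one-element document and return the text the application sees."""
    return ET.fromstring(xml_source).text


# Three ways of "escaping" the same markup inside a content element.
escaped_variants = [
    "<content>&lt;b&gt;hi&lt;/b&gt;</content>",      # predefined entities
    "<content><![CDATA[<b>hi</b>]]></content>",      # CDATA section
    "<content>&#60;b&#62;hi&#60;/b&#62;</content>",  # numeric character references
]
escaped_texts = [text_of(variant) for variant in escaped_variants]

# The same markup included as real XML rather than as escaped text.
real = ET.fromstring("<content><b>hi</b></content>")

print(escaped_texts)
print(real.text, [child.tag for child in real])
```

All three escaped spellings reach the application as the same string of characters, so the choice between them carries no information. Only the last variant hands the parser an actual b element; the escaped forms are plain text that some later step would have to parse a second time.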

That's probably not an exhaustive list. And remember, my opinion is not warranted to be worth more than what you paid for it.


> Aha, I realized, the idea here must be that some folks want to stuff the actual content into the feed instead of just pointing to it.

We have this capability now with RSS -- many people put escaped markup containing the full body of their entry in <description>.

-- Posted by Mark Hershberger on 12 Jul 2003 @ 10:33 UTC