Micro-blogging Backup, part the fourth

Volume 12, Issue 29; 09 Sep 2009; last modified 08 Oct 2010

In which we get to see what our tweets and ’dents look like.

If you haven't been following along, go back and read parts one and two first, along with part three for a little background. Now you've got MarkLogic Server up and running and you've been able to download your tweets and the tweets of those you follow. (Tweets or ’dents depending on which microblogging service you prefer; either, actually both, work for me.)

Next, download mbb04.zip and unpack it in the same place where you unpacked mbb02.zip. If you were following the instructions, you've edited some of the files in the “mbb/inst” directory and you may have some sessions saved in CQ, so I have not included those directories in mbb04.zip.

If you've been tinkering with other files, unpack this zip with some care or you may overwrite your changes. You won't overwrite changes made to the installation or CQ areas, though. (By the same token, this distribution is incomplete without mbb02.zip.)

With that installed, you now have a CSS file, four modules, and a new “top level” script, show-tweets.xqy. That's the fun one. Point your browser at http://localhost:8330/show-tweets.xqy and you should be rewarded with a list of your status messages from today. (As before, adjust the port number as necessary if you installed the application server on a different port.)

If you don't have any status messages from today, load http://localhost:8330/get-tweets.xqy to download your most recent messages, then try http://localhost:8330/show-tweets.xqy again.

The show-tweets.xqy script accepts query parameters. You can pass:

sdate

To specify the starting date in “YYYY-MM-DD” format. If unspecified, defaults to the ending date.

edate

To specify the ending date. If unspecified, defaults to today.

users

To specify one or more users separated by spaces (+ signs or %20’s in URI-speak). These should match the screen_name values in your account configuration. The value “ALL” is special: it lists tweets for every user, including all your friends.

service

To specify the service. If you set up accounts on multiple services, this will let you limit the result to only those messages on a single service.

Go ahead and give it a try: http://localhost:8330/show-tweets.xqy?sdate=2009-08-01&edate=2009-08-31&users=ALL will show you all the messages posted in the month of August by you and those you follow.
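If you'd rather build these URLs from a script than type them by hand, here is a minimal sketch in Python. It is not part of mbb04.zip; the function names are hypothetical, and the default handling simply mirrors the parameter descriptions above (the real defaults are applied by show-tweets.xqy on the server).

```python
from datetime import date
from urllib.parse import urlencode


def resolve_dates(sdate=None, edate=None, today=None):
    """Mirror the documented defaults: edate falls back to today,
    and sdate falls back to edate."""
    today = today or date.today().isoformat()
    edate = edate or today
    sdate = sdate or edate
    return sdate, edate


def show_tweets_url(base="http://localhost:8330", sdate=None, edate=None,
                    users=None, service=None):
    """Build a show-tweets.xqy URL; parameters left as None are omitted
    so that the server-side defaults apply."""
    params = []
    if sdate:
        params.append(("sdate", sdate))
    if edate:
        params.append(("edate", edate))
    if users:
        # Multiple users are separated by spaces, which urlencode
        # turns into "+" signs.
        params.append(("users", " ".join(users)))
    if service:
        params.append(("service", service))
    query = urlencode(params)
    return base + "/show-tweets.xqy" + ("?" + query if query else "")


if __name__ == "__main__":
    print(show_tweets_url(sdate="2009-08-01", edate="2009-08-31", users=["ALL"]))
```

Adjust the base URL if your application server is on a different port.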

Message Formatting

The messages are sorted in ascending order by date, so that messages read chronologically “down” the page as you'd expect. However, conversations are handled a little bit specially.

Whenever a message is encountered that is part of a conversation (either because it's a reply to another message, or another message exists that is in reply to it), the whole thread is collected together and presented as a unit.

I think that makes the results much easier to follow. We'll come back to what to do about threads involving users other than you or your followers later.
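To make the grouping idea concrete, here is a rough sketch in Python. It is not the code in the XQuery modules; it only illustrates one way to collect replies into threads. The field names (id, in_reply_to, created) are assumptions, and so is the choice to place each thread at the position of its earliest message. It uses a small union-find so that a stray or circular “in-reply-to” value can't send it into a loop.

```python
def group_threads(messages):
    """Group messages into threads.

    Each message is a dict with hypothetical keys "id", "created"
    (a sortable timestamp) and an optional "in_reply_to" id.
    Returns a list of threads; each thread is a chronological list
    of messages, and threads are ordered by their earliest message.
    """
    parent = {m["id"]: m["id"] for m in messages}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for m in messages:
        target = m.get("in_reply_to")
        # Replies to messages we never downloaded stay on their own.
        if target in parent:
            parent[find(m["id"])] = find(target)

    threads = {}
    for m in sorted(messages, key=lambda m: m["created"]):
        threads.setdefault(find(m["id"]), []).append(m)
    return list(threads.values())


if __name__ == "__main__":
    sample = [
        {"id": 2, "created": "2009-08-01T11:00", "in_reply_to": 1},
        {"id": 1, "created": "2009-08-01T10:00"},
        {"id": 3, "created": "2009-08-01T10:30"},
    ]
    for thread in group_threads(sample):
        print([m["id"] for m in thread])
```

A message whose “in-reply-to” points at something outside the collection simply forms a one-message thread, which is the case we'll have to deal with later.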

Note

There were some bugs in the Identi.ca server that occasionally caused incorrect “in-reply-to” values to be inserted into the data. I think those have been fixed in the server, but there's nothing I can do about the values that are wrong.

URL Rewriting

Those show-tweets.xqy URLs may be sufficient, but they're hardly elegant. I'd be much happier if they were better organized for human consumption.

Luckily, with the URL rewriting features of MarkLogic Server V4.1, this is easily achieved. To begin with, go back into the admin console (http://localhost:8001/) and navigate down through Groups→Default→App Servers in the tree control, then select the server you set up for this project.

Near the bottom of that page, you'll find a “url rewriter” field. Specify modules/url-rewriter.xqy as the value for that field and then click “Ok” at either the top or bottom of the page.

The rewriter I provided supports a range of URL patterns designed to be more readable:

/my-tweets/service/users/start-date/end-date

Returns the messages associated with “users” on “service” between the specified start and end dates, e.g., /my-tweets/identica/ndw/2009-08-01/2009-08-15.

/my-tweets/users/start-date/end-date

Returns the messages associated with “users” on any service between the specified start and end dates, e.g., /my-tweets/ndw/2009-08-01/2009-08-15.

/my-tweets/start-date/end-date

Returns the messages associated with any user on any service between the specified start and end dates, e.g., /my-tweets/2009-08-01/2009-08-15.

/my-tweets/start-date

Returns the messages associated with any user on any service between the specified start date and today, e.g., /my-tweets/2009-09-01.

/my-tweets

Returns the messages associated with any user on any service posted today.

/all-tweets/service/start-date/end-date

Returns the messages posted by anyone on “service” between the specified start and end dates.

/all-tweets/start-date/end-date

Returns the messages posted by anyone on any service between the specified start and end dates.

/all-tweets/service/start-date

Returns the messages posted by anyone on “service” between the specified start date and today.

/all-tweets/start-date

Returns the messages posted by anyone on any service between the specified start date and today.

The URL rewriter is just an XQuery that you can change, so it can support any kind of URLs that you'd like. It receives the URL that the user entered. Whatever URL it returns is what the server actually responds to. The one I've provided turns

/all-tweets/2009-08-15/2009-08-31

into

/show-tweets.xqy?users=ALL&sdate=2009-08-15&edate=2009-08-31

So nothing else about the code has to change; it all just works.
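For readers who want to see the mapping spelled out, here is a sketch of the same idea in Python. It is an illustration only, not a translation of modules/url-rewriter.xqy. Several things in it are my assumptions: the function name, the order of the query parameters beyond the one example above, the rule that /my-tweets omits users when none is given, the rule that /all-tweets accepts at most a service name before the dates, and the choice to pass unrecognized URLs through unchanged.

```python
import re
from urllib.parse import urlencode

# A path segment counts as a date if it looks like YYYY-MM-DD.
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def rewrite(path):
    """Map a readable /my-tweets or /all-tweets path onto a
    show-tweets.xqy URL; return other paths unchanged."""
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] not in ("my-tweets", "all-tweets"):
        return path

    head, rest = segments[0], segments[1:]
    names = []
    while rest and not DATE.fullmatch(rest[0]):
        names.append(rest.pop(0))
    dates = rest
    if len(dates) > 2 or not all(DATE.fullmatch(d) for d in dates):
        return path

    params = []
    if head == "all-tweets":
        if len(names) > 1:
            return path
        params.append(("users", "ALL"))
        if names:
            params.append(("service", names[0]))
    else:
        if len(names) > 2:
            return path
        if len(names) == 2:
            params.append(("service", names[0]))
        if names:
            params.append(("users", names[-1]))

    if dates:
        params.append(("sdate", dates[0]))
    if len(dates) == 2:
        params.append(("edate", dates[1]))

    query = urlencode(params)
    return "/show-tweets.xqy" + ("?" + query if query else "")


if __name__ == "__main__":
    print(rewrite("/all-tweets/2009-08-15/2009-08-31"))
```

Because missing dates are simply left out of the query string, the defaults in show-tweets.xqy (today for the end date, the end date for the start date) take care of the shorter forms.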

I'm not sure there's much else that's new or interesting about the other modules provided in this part. If you have any questions, feel free to ask.

Next time, we'll look at dealing with all those ugly shortened URLs and pesky replies that extend beyond our followers.