Where am I?

Volume 13, Issue 7; 06 Mar 2010; last modified 08 Oct 2010

Or, perhaps more to the point, where was I? And where will I be?

I've long been fascinated by geospatial data. I do a little geocaching. I keep track of the countries I've been in. I use services like Dopplr and TripIt to keep track of my itineraries. I carry a GPS to geotag photographs.

Carrying a mobile phone with a GPS allows me to explore the features of geosocial applications like Brightkite, Gowalla, and Foursquare. I allow Dopplr and Brightkite to update my Fireeagle location.

All very nice, but there were two obvious (to me) deficiencies in this arrangement. First, and most obvious, my data is spread all over someone else's servers. I consider this unacceptable. Any one of these services could get bought, go belly up, or carelessly (or maliciously, I suppose) discard all my data.

The second point is less obvious: how do I tell when I'm home? I don't actually think any of you reading this would have a hard time working out where I live. I'm sure I've left enough digital clues for anyone sufficiently interested to work it out. That said, I can't actually convince myself that it's reasonable to “check in” to any of these geolocation services when I'm home. I'm not sure it's reasonable to do when I'm not home, either, but I do. I think I've set the services up so they only reveal my location to friends anyway.

Not being able to solve the second of these problems made the data sufficiently unreliable (I've been checked into Staples for eleven days?) that the first problem hadn't crossed my “do something about it” threshold.

All that changed a few days ago when Tom Morris happened to mention a clever solution in his Twitter stream. He later documented it in a thoughtful post. The basic idea is this: if you always carry your phone (or other device) on your person, then the presence of that device in your house means you're home.

A little hack later and my server always knows when I'm home, give or take a few hours; it can only ping the device when it's on and not in “standby” so there's some latency. But not more than a few hours most days, I expect.
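A minimal sketch of that kind of hack, in Python, might look like the following. It is not the script I actually run: the address `192.168.1.50`, the function names, and the four-hour gap are all assumptions for illustration, and the `ping` flags are the Linux (iputils) ones. The idea is to record each time the phone answers and to treat short silences as standby rather than absence.

```python
import subprocess
from datetime import datetime, timedelta, timezone

# Hypothetical: assumes the phone has a fixed address on the home network.
PHONE_ADDRESS = "192.168.1.50"


def phone_is_reachable(address, timeout_seconds=2):
    """Send one ICMP echo request (Linux ping flags); True if it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_seconds), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def poll_once(sightings, address=PHONE_ADDRESS):
    """Append the current UTC time to sightings if the phone answers."""
    if phone_is_reachable(address):
        sightings.append(datetime.now(timezone.utc))


def home_intervals(sightings, max_gap=timedelta(hours=4)):
    """Merge sighting times into (first_seen, last_seen) 'at home' intervals.

    Silences no longer than max_gap are assumed to be the phone in standby,
    so they do not end an interval. The four-hour default is a guess.
    """
    intervals = []
    for seen in sorted(sightings):
        if intervals and seen - intervals[-1][1] <= max_gap:
            intervals[-1][1] = seen          # extend the current interval
        else:
            intervals.append([seen, seen])   # start a new interval
    return [tuple(interval) for interval in intervals]
```

Running `poll_once` from cron every few minutes and feeding the accumulated sightings to `home_intervals` would give a rough record of when the device, and presumably its owner, was in the house.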

Having solved the second problem, I turned my attention to the first. A few hours of hacking later and the TripIt, Foursquare, Gowalla, Brightkite, and Fireeagle APIs are giving me my data. The hardest part, honestly, was getting over the authentication hurdles. OAuth may be the right answer, but it's not painless to set up a new application.

I started out by pouring all this data into MarkLogic Server. (Well, I would, wouldn't I?) A little XQuery later and I had a normalized view of all my locations. Cool.

But wait, I thought, what about all those GPS tracks? Yes, those belong in there as well! Easily done.

The interesting thing about GPS tracks is that you can (sometimes) interpolate data between points. I do this already when I'm geotagging photographs. By adding “next point” to the normalized data when appropriate, I could expose that in my system as well.

Once that idea was in place, it was clear that an airline flight or train ride (to a lesser extent) might be subject to interpolation as well. A quick tweak to the scripts that normalize TripIt itinerary data took care of that.

At the end of the day, I have an interesting (to me) personal archive of my geolocation over time. It's derived from GPS tracks, explicit checkins, and itineraries. I'm also going to integrate the GPS data that comes from photographs taken with my mobile phone. All very cool to me.

I've also got a web service that I can use for geotagging photographs. I can ask, for example, where was I this morning at 10:00a?

<point lat="42.360633" long="-72.543451"
       timestamp="2010-03-06T14:51:53Z"
       duration="PT8M7S" seconds="487">Staples</point>

Apparently, I was at Staples and had been for 8 minutes. No wonder I'm the mayor of Staples.
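One way to read that answer is as "the most recent known point at or before the requested time, plus how long ago it was recorded". The sketch below implements that reading in Python. It is an illustration, not the XQuery behind the service; the `locate` function and the list-of-tuples layout are assumptions. The query time 15:00:00Z is simply the response's timestamp plus its duration.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def parse_utc(text):
    """Parse a timestamp such as 2010-03-06T14:51:53Z as an aware UTC datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def locate(points, when):
    """Return (name, elapsed_seconds) for the latest point at or before `when`.

    `points` is a list of (datetime, name) tuples sorted by time.
    Returns None if `when` is earlier than every known point.
    """
    times = [time for time, _ in points]
    index = bisect_right(times, when) - 1    # last point with time <= when
    if index < 0:
        return None
    start, name = points[index]
    return name, int((when - start).total_seconds())


# The Staples check-in from the response above.
points = [(parse_utc("2010-03-06T14:51:53Z"), "Staples")]
answer = locate(points, parse_utc("2010-03-06T15:00:00Z"))
```

Here `answer` is `("Staples", 487)`, which agrees with the `seconds="487"` attribute in the response.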

If I ask where I was at 2006-07-15T18:28:00Z, the answer comes from a GPS track:

<path start-lat="42.376713753" start-long="-72.516739368"
      end-lat="42.376821041" end-long="-72.516653538"
      timestamp="2006-07-15T18:27:58Z" end-timestamp="2006-07-15T18:28:03Z"
      total-distance="0.00842566695064306"
      total-duration="PT5S" total-seconds="5" velocity="6.06648020446301"
      duration="PT2S" seconds="2" distance="0.00337026678025723"
      lat="42.3768" long="-72.5167"/>

That's a location interpolated over five seconds and about 44 feet. Seems pretty reasonable.
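The numbers in that fragment can be checked with a few lines of Python. This is a sketch assuming plain linear interpolation between the two track points, which seems safe over a few dozen feet; the `interpolate` function and its tuple layout are made up for illustration and are not the scripts that produced the XML.

```python
from datetime import datetime, timezone

FEET_PER_MILE = 5280


def parse_utc(text):
    """Parse a timestamp such as 2006-07-15T18:27:58Z as an aware UTC datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def interpolate(start, end, when):
    """Linearly interpolate between two track points.

    `start` and `end` are (datetime, lat, long) tuples. Returns
    (lat, long, fraction), where fraction is how far along the segment
    `when` falls, from 0.0 to 1.0.
    """
    total = (end[0] - start[0]).total_seconds()
    elapsed = (when - start[0]).total_seconds()
    if total <= 0 or not 0 <= elapsed <= total:
        raise ValueError("requested time is outside the segment")
    fraction = elapsed / total
    lat = start[1] + (end[1] - start[1]) * fraction
    lon = start[2] + (end[2] - start[2]) * fraction
    return lat, lon, fraction


# Values copied from the <path> element above.
start = (parse_utc("2006-07-15T18:27:58Z"), 42.376713753, -72.516739368)
end = (parse_utc("2006-07-15T18:28:03Z"), 42.376821041, -72.516653538)
total_miles = 0.00842566695064306

# Two seconds into the five-second segment.
lat, lon, fraction = interpolate(start, end, parse_utc("2006-07-15T18:28:00Z"))
miles_so_far = total_miles * fraction
total_feet = total_miles * FEET_PER_MILE
miles_per_hour = total_miles / 5 * 3600
```

Rounded to four places, `lat` and `lon` come out as 42.3768 and -72.5167, `miles_so_far` agrees with the `distance` attribute, `total_feet` is a bit over 44, and `miles_per_hour` agrees with `velocity`. A straight blend of latitude and longitude like this is only a reasonable approximation over short hops.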

Interpolating over airline flights is a little less precise:

<path start-lat="42.363611" start-long="-71.006111"
      end-lat="18.3375" end-long="-64.969444"
      timestamp="2010-01-28T14:05:00Z" end-timestamp="2010-01-28T18:00:00Z"
      total-distance="1697.2841796875"
      total-duration="PT3H55M" total-seconds="14100" velocity="433.349152260638"
      duration="PT55M" seconds="3300" distance="397.236722905585"
      lat="36.7753" long="-69.2873">BOS</path>

but it's still kind of cool. Also interesting is the fact that itinerary data lets me look forward. Where will I be at 2010-03-12T08:25:00Z?

<point lat="50.1" long="14.266667"
       timestamp="2010-03-12T08:25:00Z"
       duration="P1DT1H55M" seconds="93300">PRG</point>

Oh yes. I think I have an appointment there. And it'll be more accurate after I've actually checked in, taken photographs, and used my GPS. Sweet.