Photographic metadata

Volume 10, Issue 87; 31 Aug 2007; last modified 08 Oct 2010

Metadata, metadata, where's my photographic metadata? In EXIF. In RDF. In RDF in EXIF. In RDF. In EXIF. In Lightroom. More or less.

When I first started working with digital images, I was delighted that useful metadata was stored within them. Back in my film days, I used to try to keep a record of exposure information, developing information, etc., but I was never very successful. I have all the negatives from those days but none of the notes.

Having some metadata motivated me to keep more, and I began trying to maintain information about who (or what) was in each photo and where it was taken. Following the lead of projects like the co-depiction experiment, I stored the metadata in RDF. (The RDF data model is well suited to photographic metadata; what you're storing are the values of properties of the photo: in a word, “triples”.)
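To make the idea of triples concrete, here is a minimal sketch in Python. It is an illustration only, not the format any of the tools in this essay use: the photo URI, the person URI, and the `Triple` and `values_of` names are all made up for the example, and the prefixed property names are kept as plain strings rather than expanded to full URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a subject, a property, and that property's value."""
    subject: str
    predicate: str
    obj: str


# Hypothetical photo URI, for illustration only.
PHOTO = "http://example.org/photos/dsc_0001.jpg"

triples = [
    # Who or what is in the photo (hypothetical person URI).
    Triple(PHOTO, "foaf:depicts", "http://example.org/people/norman-walsh"),
    # What the photo is about.
    Triple(PHOTO, "dc:subject", "wn:Bald_eagle"),
    # Where it was taken.
    Triple(PHOTO, "dc:coverage", "us-ma-south-hadley"),
]


def values_of(statements, subject, predicate):
    """Return every value asserted for one property of one subject."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


print(values_of(triples, PHOTO, "dc:subject"))
```

Each row says one thing about one photo, which is why adding a new kind of metadata never requires changing a schema: it is just another property name.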

Unhappy with having to maintain images and metadata in separate files, I cooked up JpegRDF to store the RDF metadata in the images, using the EXIF comment field.

That worked pretty well for several years. But when I started shooting raw images, two things happened. First, I couldn't easily put the RDF in the images. Raw images have EXIF metadata, but JpegRDF couldn't write to them. Second, the files got a lot bigger, so processing them was noticeably slower.

My initial fallback position was to return to storing the RDF metadata in separate files.

I had high hopes for photodata.org but it didn't really pan out. I think there's an interesting web application in there somewhere, but I couldn't find it. And when an Ubuntu upgrade completely broke it, I gave up, at least for now; I don't promise never to return to it.

One of the reasons to switch was to try out commercial solutions to this problem. I selected Lightroom partly because I read that it stored its database in SQLite and I knew I could turn that to my advantage if I needed to. (In fairness, the other contender, Aperture, may store its data in SQLite too; I don't know.)

Lightroom stores a really impressive collection of metadata.

The question was how to turn several tens of megabytes of RDF into something useful in Lightroom. I had imagined that this would involve poring over the database SQL with Perl, but it turned out to be much simpler.

First, I simply had to accept a couple of shortcomings.

Lightroom doesn't seem to provide any way to keep and edit the EXIF comment field. In the past, I've used that to store arbitrary metadata, which is probably what I would have done here if I could. Instead, everything has to go in Lightroom's fields. Luckily, there's a lot of room, and the EXIF (IPTC, etc.) metadata has provisions for keywords, locations, etc.

In RDF, you can easily distinguish between a specific thing and an instance of a class of things. That is, “Norman Walsh”, a specific person, on the one hand, and “a bird”, not some particular bird but an anonymous exemplar of that class, on the other. (It's the sort of technical detail that I'm just anal-retentive enough to care about.)

If there's a way to maintain this in Lightroom, I don't see how. I gave up. I turned them all into keywords and will work out which is which later if I need to. I try to identify things with URIs from WordNet or Wikipedia; those are mostly examples of anonymous exemplars. For people and events, I make up my own URIs. For locations, Lightroom stores the actual street addresses.

The simpler bit I alluded to earlier is the fact that Phil Harvey’s excellent ExifTool can write all the EXIF metadata that Lightroom can read. Even the street addresses and GPS coordinates!

So I turned foaf:depicts and dc:subject into EXIF keywords; I extracted the GPS coordinates out of dcterms:spatial; and I turned dc:coverage URIs into street addresses. (The RDF in my personal information manager made the mapping easy.)
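As a rough sketch of what such a mapping might look like, the Python below turns a handful of triples into an ExifTool command line. It is not the script used here: the function names, the file name, and the coordinates are hypothetical, and the choice of the `Keywords` tag and the four GPS tags is an assumption about which ExifTool tags would be appropriate. The sketch only builds and prints the command; it does not run ExifTool.

```python
import shlex


def keywords_from(triples):
    """Collect the values of foaf:depicts and dc:subject as keywords."""
    wanted = {"foaf:depicts", "dc:subject"}
    return [obj for (_subject, predicate, obj) in triples if predicate in wanted]


def exiftool_args(keywords, lat=None, lon=None):
    """Build an ExifTool argument list that adds keywords and a GPS position."""
    args = ["exiftool"]
    for keyword in keywords:
        # "+=" appends to a list-type tag instead of replacing its contents.
        args.append(f"-Keywords+={keyword}")
    if lat is not None and lon is not None:
        # EXIF stores unsigned coordinates plus a hemisphere reference.
        args += [
            f"-GPSLatitude={abs(lat)}",
            f"-GPSLatitudeRef={'N' if lat >= 0 else 'S'}",
            f"-GPSLongitude={abs(lon)}",
            f"-GPSLongitudeRef={'E' if lon >= 0 else 'W'}",
        ]
    return args


# Hypothetical triples and coordinates, for illustration only.
photo_triples = [
    ("photo.jpg", "dc:subject", "wn:Bald_eagle"),
    ("photo.jpg", "dc:coverage", "us-ma-south-hadley"),
]
command = exiftool_args(keywords_from(photo_triples), 42.26, -72.57) + ["photo.jpg"]
print(shlex.join(command))
```

Turning a dc:coverage URI into a street address needs a lookup table of places, which is omitted from this sketch.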

Having massaged all the RDF metadata from several thousand images back into EXIF, I loaded them up into Lightroom and, lo and behold, it all just worked.

My only tactical error was representing WordNet and Wikipedia URIs as keywords with colons. So “Flowers” from WordNet was represented with the keyword “wn:Flowers”.

It turns out that when you're entering metadata in Lightroom, it interprets the “:” like a “,” and “completes” whatever keyword you're typing. So I used my Perl-grovel-over-SQL trick after all to rename all the keywords with colons into keywords with hyphens.
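The rename itself amounts to a single SQL update. The sketch below shows the idea in Python rather than Perl, against an in-memory SQLite database. The one-table `keywords` schema is a hypothetical stand-in, not Lightroom's actual catalog layout, and `hyphenate_keywords` is a name invented for the example.

```python
import sqlite3

# Hypothetical one-table schema standing in for the real catalog.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO keywords (name) VALUES (?)",
    [("wn:Flowers",), ("wn:Bald_eagle",), ("Vacation",)],
)


def hyphenate_keywords(db):
    """Replace colons with hyphens in keyword names; return the rows changed."""
    cursor = db.execute(
        "UPDATE keywords SET name = replace(name, ':', '-') "
        "WHERE name LIKE '%:%'"
    )
    db.commit()
    return cursor.rowcount


changed = hyphenate_keywords(conn)
names = [row[0] for row in conn.execute("SELECT name FROM keywords ORDER BY id")]
print(changed, names)
```

Anyone trying this against a real catalog would want to work on a copy and inspect the actual table and column names first.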

So my new workflow is:

Move images off the camera:
```
$ copycf
```
(My copycf script actually needs a little work, but let's pretend. The name, by the way, is a holdover from when I used to copy data off a CompactFlash card. With my Nikon D80, I just copy them straight off the camera.)
Apply batch metadata, including GPS track information, if there is any.
```
$ toxmp -a -2:30 *.nef dc:subject=wn:Bald_eagle dc:coverage=us-ma-south-hadley
```
(Note that I still use the colon forms outside of Lightroom.)
Import the photos into Lightroom.
Add more metadata as appropriate, select the good photos, “develop” them, and export them as JPEGs.
If they're going to Flickr, upload them.
```
$ flickrpub -public 1 *.jpg
```

And it works!

I've noticed that Flickr actually extracts the keywords (e.g., the wn-* ones) from the EXIF. I haven't figured out what to do about that.

Comments

Flickr also extracts the XMP information automagically to fill the description, the title and the keywords (tags).

It would be fun if the XMP information could somehow be translated into machine tags, and the opposite too: machine tags as RDF triples when exporting from Flickr.