XCF: bah, humbug!

Volume 9, Issue 59; 18 Jun 2006; last modified 08 Oct 2010

I know the calendar is almost exactly out of phase with Christmas, but “bah, humbug” anyway. GIMP's XCF file format is deliberately undocumented. Bah freaking humbug! [Update: a spec is under development!]

I've recently started to take my photography a little more seriously. Partly, that means setting time aside just to try to improve my technique and to take a lot of (mostly mediocre, at best) pictures. (Taking lots is the only way I know to learn how to take better ones.) Partly, it means I've been trying to better understand and appreciate the digital image workflow.

And that means I've been spending more time in GIMP trying to understand how to process raw images from the camera and learn about things like layer masks and sharpening.

And now I learn that GIMP's native file format, XCF, is not merely undocumented, it is intentionally undocumented.

Danger, Will Robinson! Undocumented binary black hole! Danger!

I don't claim any particular expertise in designing graphic file formats, though I've reverse engineered a few simple ones in my day, but the right answer here seems pretty clear. As near as I can tell, an XCF file consists, logically, of several bitmaps of varying sizes and depths, some metadata relating them, and a bunch of ancillary data (palettes, brushes, selections, etc.). Pick some straightforward encoding for the bitmaps (RLE, LZW, or anything lossless, really), design an XML format for the rest of the data, and package them all together somehow (maybe ZIP, maybe something else). I don't see any particular merit in base64 or otherwise encoding the binary data directly into the XML, but I don't see why that should prevent the rest of the data from being in XML.
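To make that idea concrete, here is a minimal Python sketch of such a container. It is only an illustration under assumptions of my own, not anything GIMP does: the `manifest.xml` name, the `layers/N.raw` paths, the element and attribute names, and both function names are all made up for the example. ZIP's deflate compression stands in for the "anything lossless" bitmap encoding.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def write_container(layers):
    """Pack layers into an in-memory ZIP: raw bitmaps plus an XML manifest.

    `layers` is a list of dicts with hypothetical keys:
    name, width, height, channels, pixels (bytes).
    """
    root = ET.Element("image")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for index, layer in enumerate(layers):
            expected = layer["width"] * layer["height"] * layer["channels"]
            if len(layer["pixels"]) != expected:
                raise ValueError(f"layer {layer['name']!r}: wrong pixel count")
            path = f"layers/{index}.raw"
            # Binary data stays out of the XML; the manifest only points at it.
            zf.writestr(path, layer["pixels"])
            ET.SubElement(root, "layer", {
                "name": layer["name"],
                "width": str(layer["width"]),
                "height": str(layer["height"]),
                "channels": str(layer["channels"]),
                "src": path,
            })
        zf.writestr("manifest.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()


def read_container(data):
    """Rebuild the list of layer dicts from bytes made by write_container."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("manifest.xml"))
        return [
            {
                "name": node.get("name"),
                "width": int(node.get("width")),
                "height": int(node.get("height")),
                "channels": int(node.get("channels")),
                "pixels": zf.read(node.get("src")),
            }
            for node in root.findall("layer")
        ]
```

Anything that can open a ZIP file and parse XML could read a file like this without knowing a line of the writing application's source, which is the whole point.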

I really don't know, right now, what the right answer is, but pouring more of my heart and soul into someone's proprietary black hole isn't it.

Bah #@$&!?% humbug!

[Update, 20 June 2006: Based on Sven's comments below and other postings, it seems that perhaps the situation isn't as dire as it first appears. I hope this turns out to be the case, because I really care about this data.]

[Update, 21 July 2006: Dave Neary reports that Henning Makholm has stepped up to write a specification for the XCF format. Good on ya, Henning!]


I wrote about the cultural issues around open source interoperability. I have been wondering how the open source community could be provided with more incentives to take these issues seriously.

—Posted by Gregor J. Rothfuss on 18 Jun 2006 @ 08:31 UTC #

There's some sort of a proposal for an XCF file format successor at http://pippin.gimp.org/xcf2/ and it's the GIMP developers' declared goal to introduce such a format with the switch to GEGL. Since GEGL is just waiting to be introduced to GIMP after the 2.4 release, XCF is going to be replaced by a truly open format in the not-too-distant future.

You should understand that XCF was never meant to be a file format for exchanging graphics between different applications. It's GIMP's default file format, and its only purpose is to store the complete state of the image as edited in GIMP.

—Posted by Sven Neumann on 19 Jun 2006 @ 10:05 UTC #

Also let me add that Mark Pilgrim completely misses the point since the file format is documented. The documentation may be somewhat rudimentary, but together with the source code, it provides enough details to decode an XCF file. Other applications like ImageMagick have successfully added support for reading XCF based on this.

The format isn't deliberately undocumented; we only discourage other applications from adopting it because we think that it's not well suited as an image exchange format. If another format were developed that was open and suited our needs, GIMP would be one of the first applications to adopt it.

—Posted by Sven Neumann on 19 Jun 2006 @ 10:15 UTC #

"""together with the source code, it provides enough details to decode an XCF file"""

This is a very dangerous sentiment, and unfortunately it's all too common in open source projects. The source is THE ABSOLUTE WORST form of documentation, and statements like that do nothing but impede real interoperability.

On this topic, I highly recommend "The Dumbing-Down of Programming", which was written eight years ago (and is still online!) -- the author nails this point. Coding is the act of incremental forgetting. Eventually all you're left with is code, without any human expertise to back it up, fix it, improve it, or interoperate with it.

—Posted by Mark on 21 Jun 2006 @ 01:25 UTC #