chroot X!

Volume 6, Issue 73; 22 Aug 2003

The perils of proprietary hardware and how I worked around a particularly irksome bug.

Proprietary data is the root of tyranny.

—Britt Blaser

I've had my laptop for several years now. It's a Toshiba Tecra 8200 and for the most part I've been quite happy. I use my laptop all the time; it's the only computer I use at home or on the road.

One of my requirements for a laptop is a display wider than 1280 pixels. I like to have an Emacs window and a couple of shell windows up side-by-side without any overlap and enough space left over for a few little desktop monitors, like the desktop switching applet.

What drew me to the 8200 then was partly the 1400x1050 display. Sweet. A nice compromise, actually, between 1280x1024 which isn't big enough and 1600x1200 which can make for some eye-strainingly small details on a laptop-sized LCD.

So it's a nice laptop, but the damn thing has three bits of proprietary hardware. Proprietary hardware, that is to say, hardware for which the vendors will not publish reasonable, open API specifications, sucks. Avoid it if you can.

In the case of the 8200, the modem is some “winmodem” piece of junk; not one of the ones for which I've yet seen a Linux driver. But I don't care; a $30 PCMCIA modem card from eBay fixed that problem. There's also never been a driver for the IrDA chipset. It would be nice if that worked, but I can live without it. And then there's the video card.

The XFree folks have worked hard to get a working driver for the Trident family of cards, but in the case of the “CyberBlade/XP” card in this laptop, they've met with only partial success. The best XFree driver out there will only run at 1400x1050 with a very odd “left shift” problem. The left-most 16 pixels of the display appear on the right-hand side of the screen. Yeah, it's just as odd as it sounds. I tried for about a day to get used to it and decided that I couldn't.

So how do I use my laptop? The answer is that I bought a commercial X server, the Accelerated X product from Xi Graphics.

All fine and dandy until I switched from Debian “Woody” to “unstable”. A week or so ago, a new libc was released on unstable. It's not “binary API” compatible with the old libc. You can see it coming, right?

You guessed it, the X server no longer runs because of the binary incompatibility in libc. And Xi no longer supports the product I'm using because they have a new product line. And their new product line doesn't support the Trident hardware I've got because they can't get Trident to reveal the [expletive deleted, ed.] specs to them either.

Backing out the libc upgrade got things working again, but left my machine in an intolerable state where some apps ran and some didn't, according to what level of the libc library they depended on.

Note

At this point, the essay is going to take a decidedly technical turn. If you're not using Linux or some Unix variant, the words may cease to be meaningful at any point. Avoid proprietary hardware, that's the takeaway. Commercial, closed-source software leaves you screwed too, if you need updates and the vendor isn't supplying them. But I figure you already know that lesson.

(In all honesty, I suppose I'm oversimplifying things. If you're content to run a proprietary operating system and you're confident that the proprietary hardware you have will always be supported by that operating system and you'll never want to try another operating system for which that isn't true, what difference does it make to you if it's proprietary or not?)

What to do? Saving the old libraries and updating the LD_LIBRARY_PATH didn't work. Rob Weir pointed me in the right direction (use chroot), and Sebastian Kapfer provided the clue I needed to get it running. (Both of these guys responded to messages I posted on the <debian-user@lists.debian.org> list. Thanks, guys.)

chmod, chown, chgrp, chroot???

Yes, chroot. What it does is change the root directory for a given program. If you run chroot /dir command, what happens is command runs with its root directory set to “/dir”. That means that while the program is running, /path/file actually refers to /dir/path/file on disk. Neat, huh? “And this is useful, why,” you ask?

It's useful because it means that I can set up a separate directory with all the old libraries and run Accelerated X in there. This will insulate the old application from any changes to the system libraries. And the client/server nature of X means that all the new apps will be able to display on the old server with no problems. (X may be a little clumsy to configure, but it rocks!)

First I created a new directory, the future root directory, for Accelerated X and copied the binaries and configuration files that it uses into that directory, mirroring the directory hierarchy of the real root. (In other words /usr/X11R6/bin/Xaccel got copied to /AccelX/usr/X11R6/bin/Xaccel, etc.)
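The mirroring step can be sketched as a small shell function. This is an illustration, not a record of the commands actually used: copy_into_root is a hypothetical helper name, and /AccelX is the new root from the text.

```shell
#!/bin/sh
# Sketch: copy one file into a new root, keeping the same path it has
# under the real root. copy_into_root is a hypothetical helper name.
copy_into_root() {
    newroot=$1   # the future root directory, e.g. /AccelX
    file=$2      # absolute path of the file, e.g. /usr/X11R6/bin/Xaccel
    # Create the matching directory under the new root, then copy the
    # file there, preserving its mode and timestamps.
    mkdir -p "$newroot$(dirname "$file")" &&
        cp -p "$file" "$newroot$file"
}

# Example (run as a user allowed to write under /AccelX):
#   copy_into_root /AccelX /usr/X11R6/bin/Xaccel
```

Copying by absolute path keeps the layout inside the new root identical to the real one, so nothing the program looks for has to be renamed.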

Then I copied the old libc files into the new root. I also copied /bin/bash, /bin/ls and a few other things over so that I could run:

chroot /AccelX /bin/bash

to check things out.

Next the question is, what other libraries will I need? Remember, after chroot, only the files under /AccelX will be visible. The ldd command helps here. For example:

$ ldd /bin/bash
        libncurses.so.5 => /lib/libncurses.so.5 (0x40014000)
        libdl.so.2 => /lib/libdl.so.2 (0x40050000)
        libc.so.6 => /lib/libc.so.6 (0x40053000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

This tells me that to run bash, I need to copy over the libncurses and libdl libraries in addition to the libc library (which I've already copied).

“Repeat until done,” as the saying goes.
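The "repeat until done" loop lends itself to a little automation. The following is a minimal sketch, assuming ldd output in the format shown above; lib_paths is a hypothetical helper name, and the copy loop in the comment is only one way to use it.

```shell
#!/bin/sh
# Sketch: read ldd output on stdin and print the library files it names.
# lib_paths is a hypothetical helper name.
lib_paths() {
    # Lines look like "libc.so.6 => /lib/libc.so.6 (0x...)" or, for some
    # loaders, "/lib/ld-linux.so.2 (0x...)". Keep whichever field is an
    # absolute path and skip entries that have none (e.g. "not found").
    awk '{
        if ($2 == "=>" && $3 ~ /^\//) print $3
        else if ($1 ~ /^\//) print $1
    }' | sort -u
}

# Example: copy every library bash needs into the new root.
#   ldd /bin/bash | lib_paths | while read -r lib; do
#       mkdir -p "/AccelX$(dirname "$lib")" && cp -p "$lib" "/AccelX$lib"
#   done
```

Keep in mind that ldd on the upgraded system reports the new libraries; for the old libc itself, the saved copies are the ones to put in the new root.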

After I got all the libraries in place for all the commands that I wanted to be able to run, I was able to try it out. It didn't work, of course. I was able to chroot /AccelX /bin/bash, but when I ran Xaccel, it died.

The next step was strace. The strace command shows you all the system calls that an application attempts. For example, here's the start of a trace of ls (I've edited the output a bit to make it fit neatly in the column):

$ strace ls
execve("/bin/ls", ["ls"], [/* 85 vars */]) = 0
uname({sys="Linux", node="mercury", ...}) = 0
brk(0)                                  = 0x8059d50
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE,
      MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40013000
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT
     (No such file or directory)
open("/usr/local/soadabas/lib/i686/mmx/cmov/librt.so.1",
     O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/local/soadabas/lib/i686/mmx/cmov",
     0xbfffe2a0) = -1 ENOENT (No such file or directory)
...

All those ENOENT errors show attempts by ls to open files that don't exist. By examining the trace of Xaccel, I was able to determine what other files it wanted to access, and by copying the ones that actually existed on my system over into the /AccelX root, I was able to get the program to run.
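Reading a long trace by eye gets tedious, so a filter helps. This is a rough sketch that assumes the trace was saved to a file with strace's -o option (for example strace -f -o xaccel.trace ..., where xaccel.trace is a made-up file name) and that each system call sits on one line, as strace writes it. missing_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read an strace log on stdin and print each path that a system
# call failed to find. missing_paths is a hypothetical helper name.
missing_paths() {
    # Keep only the calls that failed with ENOENT, then take the first
    # quoted string on each line as the path that was requested.
    grep 'ENOENT' | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' | sort -u
}

# Example:
#   missing_paths < xaccel.trace
```

Most of the paths it prints will be harmless probes along the library search path; only the ones that exist on the real system are worth copying into the new root.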

(Actually, I had to use mknod to set up some devices and do a few other things, but there turned out to be a better way.)

Success! Almost.

I could run X now, but nothing could talk to it. The problem, informally, is that the server communicates some vital information to the rest of the world through a special file under /tmp. But the server's /tmp is really /AccelX/tmp so applications aren't seeing the server's file.

The last little trick turns out to be a feature of mount that I didn't know about. The --bind option allows you to make the same directory appear in two different places. Why do you want to do that? For occasions just like this one. Running:

mount --bind /tmp /AccelX/tmp

makes /AccelX/tmp the same as the real /tmp. So now the /tmp that Xaccel sees when it's been chrooted is exactly the same /tmp seen by the rest of the applications on my laptop.

So instead of making devices in /AccelX/dev, I can just bind /dev. The same is true for: /proc, /var, /usr/share, and /usr/X11R6/lib/X11/fonts. The last two assure that I get the right fonts.
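Since bind mounts do not survive a reboot, it is convenient to keep the list in one place. The sketch below prints the mount commands for the directories named above rather than running them, so the result can be inspected first; bind_commands is a hypothetical helper name, and the mkdir step is an added assumption to make sure each mount point exists.

```shell
#!/bin/sh
# Sketch: print the bind-mount commands for a chroot directory.
# bind_commands is a hypothetical helper name; nothing is mounted here.
bind_commands() {
    newroot=$1   # e.g. /AccelX
    for dir in /tmp /dev /proc /var /usr/share /usr/X11R6/lib/X11/fonts; do
        # A bind mount needs an existing directory to mount onto.
        printf 'mkdir -p %s%s && mount --bind %s %s%s\n' \
            "$newroot" "$dir" "$dir" "$newroot" "$dir"
    done
}

# Example: review the commands, then apply them as root.
#   bind_commands /AccelX
#   bind_commands /AccelX | sh
```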

With the right mounts and the right libraries, I can now run chroot /AccelX /usr/X11R6/bin/Xaccel to start X and everything works just fine, even with the rest of my system upgraded to the latest libraries.

(I did have a few moments' consternation when I was trying to run X with output redirected. Apparently that doesn't work; it interferes with interprocess communication too.)

In any event, success! Whew!

Comments

Damn, that's a neat solution. I like it. However, I reckon that you could have got away without mounting /tmp, perhaps by using X's network transparency and connecting to localhost:0.0 instead of just :0.0. With a modern TCP stack the overhead shouldn't be much more than a Unix domain socket.

Anyway, nice work.

-Dom

Rock on! That --bind option is just what I was looking for. The --move option to mount looks pretty interesting, too --- a few months ago I was looking for a way to atomically update an entire webserver root in order to do atomic software upgrades, and that might just do it (create upgrade in new temp dir, mount --move it over the existing root, upgrade the (hidden) existing root, umount the temp dir).