Debugging Debootstrap

Contrary to expectations, last week’s AJ Market project turned out to be debootstrap, not dak. Just goes to show a single person can make a difference in today’s world: debootstrap popped into the lead from nearly the bottom thanks to a single contribution. (I wonder if it makes more sense to make contributions anonymous or not?)

For those playing along at home, debootstrap is a tool to build a Debian system from (almost) nothing: it just requires some basic POSIX shell functionality and things like sed, grep, sort, ar, tar, gzip, and wget. So it’s useful when you’re initially installing Debian (or derived distros like Ubuntu), or when you’re creating a chroot to use as a dedicated build environment.

Anyway, the debootstrap changes were pretty much bugfixes only, so there are no fancy new features (okay, there’s one: a --make-tarball option), but a few of the fixes are worth a quick look.

  * Don't create empty available files, since old dpkg and new kernels can't
    deal with them. (Closes: Bug#308169, Bug#329468)

Reasons why POSIX is no fun: it standardises things that don’t match what programs actually do, then people recode their software to match POSIX, and programs that relied on the old behaviour break. In this case, the Linux kernel’s mmap() behaviour on empty files changed: trying to mmap an empty file is now an error, where it used to just give you an empty buffer. Unfortunately dpkg tries to mmap its available file, and debootstrap hands dpkg an empty available file, leading to unhappiness. Reportedly dpkg was the only app relying on the old mmap behaviour…

  * Turn on --resolve-deps by default. Add --no-resolve-deps as an option.

An alternative to debootstrap is cdebootstrap, which is written in C instead of shell, and whose main claim to fame is the ability to work out which packages to download entirely dynamically, rather than needing to be updated when the base system changes. debootstrap finally got that feature too in 0.3.0, but the default was to do it in only a limited way, which was to expect the Priority: field in the Packages file to correctly tell you which packages you need. That’s turned out to be a bit of a nuisance, though, so we’ll see what happens with doing it the cdebootstrap way instead.

  * Catch failures in "dpkg --status-fd" (Closes: Bug#317447, Bug#323661)

One of the problems of writing debootstrap in sh is that it’s a horribly kludgy language. So when you try doing something a little bit intricate, in this case interfacing dpkg and debootstrap so debootstrap can summarise dpkg’s progress as it installs packages, you get into all sorts of problems. We had the problem here that we need to (a) get the regular output of postinsts and so forth to the user, (b) get the --status-fd output, parse it, and paraphrase it in a form suitable for either the user or another tool like debian-installer, and (c) pass through the error status returned by dpkg and possibly immediately abort the install. To do (b) you need to pipe dpkg somewhere, but a pipe only carries stdout, which is where the output for (a) was going to go, so you have to reroute those two things, then route them back: that means you need two spare FDs, and you need to say: (dpkg --status-fd N N>&1 >&M | parser) M>&1. But that’s not enough, since the pipe loses dpkg’s exit code and the subshell stops you from automatically aborting, either of which violates (c). The solution ended up being to instead say (dpkg --status-fd 8 8>&1 1>&7 || echo EC $?) | parser, having made sure fd 7 goes to stdout earlier (exec 7>&1) and making sure the parser notes the EC string and uses that for its exit status.
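
For anyone who wants to poke at that plumbing without a real dpkg handy, here’s a minimal sketch of the same trick. The fake_dpkg, parser and run_install functions are made-up stand-ins for illustration, not debootstrap’s actual code: fake_dpkg prints ordinary output on stdout, a status line on fd 8, and exits with whatever code you give it.

```shell
#!/bin/sh
# Hypothetical stand-in for dpkg: normal output goes to stdout, status
# lines go to fd 8, and the exit code is whatever was passed as $1.
fake_dpkg() {
    echo "Setting up example ..."
    echo "status: example : installed" >&8
    return "$1"
}

# Hypothetical parser: paraphrases each status line, and takes its own
# exit status from an "EC n" line if one arrives (0 otherwise).
parser() {
    ec=0
    while read -r line; do
        case "$line" in
            "EC "*) ec=${line#EC } ;;
            *)      echo "P: $line" ;;
        esac
    done
    return "$ec"
}

run_install() {
    exec 7>&1    # fd 7 remembers where stdout really goes
    # Inside the subshell stdout is the pipe, so 8>&1 sends the status
    # lines to the parser and 1>&7 sends regular output back to the real
    # stdout. On failure the exit code travels down the pipe as "EC n".
    (fake_dpkg "$1" 8>&1 1>&7 || echo "EC $?") | parser
    rc=$?
    exec 7>&-    # close the spare fd again
    return "$rc"
}
```

Calling run_install 3 prints the “Setting up” line untouched and the status line with a “P: ” prefix, and returns 3, which is the exit code the plain pipe would have thrown away.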

  * Use partial/ directory when downloading. (Closes: Bug#109176)

One thing that was particularly pleasing was to be able to fix the second oldest open debootstrap bug (well, the oldest non-wishlist bug), filed just over four years ago by apt author Jason Gunthorpe, about how I was misusing apt’s cache by not coping with multiple versions of packages being present, and by not using the partial directory. The latter actually has some (fairly mild) security implications, in that if debootstrap downloads a hacked deb or Packages file, but is aborted before it has time to see that the md5sum doesn’t match what it should, it’ll leave the partial download somewhere apt will blithely trust. That got mostly fixed in the 0.3.0 upload, and the final bit (using the partial directory) is now done too. Sweet.
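
The idea behind the partial directory is simple enough to sketch in a few lines of sh. This is an illustration only, not what debootstrap actually does: fetch_verified is a made-up helper, and cp stands in for the wget call so the sketch runs without a network.

```shell
#!/bin/sh
# fetch_verified SRC DESTDIR EXPECTED_MD5
# Hypothetical helper: "download" SRC into DESTDIR/partial, and only move
# it into DESTDIR proper once its md5sum matches the expected value.
fetch_verified() {
    src=$1
    destdir=$2
    expected=$3
    name=$(basename "$src")

    mkdir -p "$destdir/partial" || return 1
    # cp stands in for the real download (an assumption for the sketch).
    cp "$src" "$destdir/partial/$name" || return 1

    actual=$(md5sum "$destdir/partial/$name" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        # Verified: now it may go where other tools will trust it.
        mv "$destdir/partial/$name" "$destdir/$name"
    else
        # Bad checksum: drop it without it ever leaving partial/.
        rm -f "$destdir/partial/$name"
        return 1
    fi
}
```

If the run is interrupted between the download and the check, whatever was fetched is still sitting in partial/, so nothing unverified ever appears under its final name.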

In any event, that’s what last week’s contribution bought. Now’s the time for you to decide what this week will bring!
