Last month we had a brief discussion on debian-devel about what images would be good to have for lenny. We’re apparently up to about 30 CDs or 4 DVDs per architecture, which over 12 architectures adds up to about 430GB in total. That’s a lot, given it’s only one release, and meanwhile the entire Debian archive is only 324GB.

The obvious way to avoid that is to make use of jigdo, which lets you recreate an ISO from a small template and the existing Debian mirror network. I’ve personally never used jigdo much, partly because I don’t usually use ISOs anyway, and partly because the few times I have tried jigdo it always seemed unnecessarily slow. So the other day I tried writing my own jigdo download tool, focussed on being as fast as possible.

The official jigdo download tool, to the best of my knowledge, is jigdo-lite, which you give a .jigdo file and the URL of a local mirror. It then downloads the first ten files using wget, and once they’re all downloaded, it calls jigdo-file to merge them into the output image. This gets repeated until all the files have been downloaded.

To avoid the stalls between batches, where nothing is downloaded while a merge runs and nothing is written while a download runs, you want to do multiple things at once: most importantly, to be writing data to the image at the same time as you’re downloading more data. With jigdodl (the name I’ve given to my little program), I went a little bit overboard, and made it not only do that, but also manage four downloads at once and the decompression of the raw data from the template. That’s partly due to not being entirely sure what needed to be done to get a speedy jigdo program, and partly because the communicate module I’d just written to deal with this sort of parallelism made that somewhat natural.
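
The overlap between downloading and writing can be sketched roughly as follows. This is not jigdodl’s actual code, and it doesn’t use the communicate module; it’s a minimal illustration using Python’s standard threading tools. The names assemble, parts and fetch are made up for the example, and fetch stands in for whatever retrieves a file from a mirror.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def assemble(parts, fetch, out, workers=4):
    """Write each part into `out` while other parts are still downloading.

    parts   -- list of (offset, name) pairs (hypothetical layout)
    fetch   -- callable taking a name and returning bytes (assumed)
    out     -- seekable binary file object for the image
    workers -- number of simultaneous downloads
    """
    # Bounded queue so fast downloads can't pile up unboundedly in memory.
    results = queue.Queue(maxsize=workers * 2)
    errors = []

    def download(offset, name):
        # Always put exactly one item per part, so the writer never hangs.
        try:
            results.put((offset, fetch(name)))
        except Exception as exc:
            results.put((offset, exc))

    def writer():
        # The only thread that touches `out`, so no locking is needed.
        for _ in range(len(parts)):
            offset, payload = results.get()
            if isinstance(payload, Exception):
                errors.append(payload)
                continue
            out.seek(offset)
            out.write(payload)

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for offset, name in parts:
            pool.submit(download, offset, name)
    writer_thread.join()

    if errors:
        raise errors[0]
```

Parts can arrive in any order, which is why each one carries its own offset. A real tool would also need to verify checksums and fill in the template data, neither of which is shown here.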

In the end, it works: from wireless over ADSL to my ISP’s Debian mirror, I get the following output:

Jigsaw download:
Filename: debian-40r3-amd64-CD-1.iso
Length:   675477504
MD5sum:   d3924cdaceeb6a3706a6e2136e5cfab2
Total: 679 s; d/l: 586 MB at 883 kB/s; dump: 57 MB at 57 MB/s

Finished!


which is only slightly short of maxing out my downstream bandwidth, taking a total of about 11m20s. Running jigdodl with a closer mirror works pretty well too, though evidently some of my more recent changes weren’t so great, because I’ve gone from 9153 kB/s on a 100 Mbps link down to 7131 kB/s or lower. The CPU usage also seems a bit high, hovering between five and ten percent at 900 kB/s.
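
As a quick sanity check on the figures in that output, here’s the arithmetic in Python. It assumes the tool reports binary units (1 MB = 1024 kB = 2**20 bytes), which the output itself doesn’t state.

```python
# Figures copied from the jigdodl output above.
image_bytes = 675477504
downloaded_mb = 586
dumped_mb = 57
elapsed_s = 679

# Average download rate, assuming 1 MB = 1024 kB.
rate_kb_s = downloaded_mb * 1024 / elapsed_s

# Elapsed time as minutes and seconds.
minutes, seconds = divmod(elapsed_s, 60)

# Image size in MB (assuming 2**20 bytes each), to compare against
# the downloaded plus dumped totals.
image_mb = image_bytes / 2**20

print(f"rate: {rate_kb_s:.0f} kB/s")
print(f"time: {minutes}m{seconds:02d}s")
print(f"image: {image_mb:.0f} MB vs {downloaded_mb + dumped_mb} MB accounted for")
```

Under that assumption the rate and the total time agree with what the tool printed, and the downloaded and dumped amounts add up to within a megabyte of the image size.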