YADFW

Unsurprisingly, Ubuntu’s release has generated some discussion in Debian. Odds on it won’t create much else. Anyway, Scott (a Canonical employee and dpkg hacker, among other things) writes:

Release, release, my kingdom for a release!

[…]

I think he’s missed something major there, and that something major is the reason I think Debian finds releasing difficult.

Testing.

Clearly, this requires a response.

Continuing on…

His point would be valid if Debian Developers ran the distribution that gets released, but they don’t. Most Debian Developers run unstable but the distribution that gets released is testing. This means that it doesn’t really matter to them whether there is a release or not, it doesn’t affect the system they run on their machines.

[…]

Testing was created to provide an almost-releasable version of Debian at all times, instead it’s separated most developers from the release so they’ve lost interest in it.

First, let’s deal with the obvious mistakes. It’s easy to claim that no one’s interested in a release, but it’s far from true — plenty of people want a release as soon as possible; the hard part is working out what should go in the release, and fixing the remaining problems. No doubt you can find a decent number of folks who don’t care about releasing, but it’s far easier to find people who do want stable updated. Looking at the Canonical payroll would probably be a good start, for example. That Scott himself is willing to offer an entire kingdom for a release should have been some indication that the desire, at least, is out there.

In a sense, that isn’t really an avoidable issue anyway. Scott apparently joined the project in late 2001, so he might have missed the way we used to do things when releasing, which was to rename the “unstable” distribution to “frozen”, create a new “unstable”, and then try to make “frozen” releasable. That had the same problem Scott was complaining about — that people might just keep running unstable, and ignore the release issues that need fixing. But hey, that’s the way things are: Debian’s a volunteer project, which means if people want to ignore particular problems then that’s their right. Sure, you can try to force the situation, but you don’t have any leverage: the only thing you can do is prevent people from helping you in ways other than the one you particularly want.

And let’s have a quick look at the question of just how hard testing does make it to release. Of the 2225 packages in warty/main, 989 are “ubuntu”-specific updates, 625 match the version in testing, a further 544 are older than the version currently in testing, and 9 are from unstable (5 of which have had further updates). Of the remaining 58 packages, 52 are Ubuntu-specific packages, and the others are apt (whose warty version has an NMU number, rather than an Ubuntu NMU number) and gnome-terminal (which seems to have been pulled from experimental).

(Side note: of the 989 updated packages, only 294 are still “newer” than Debian; 613 of them have a higher version in testing than warty, and 88 have a higher version in unstable. That’s not terribly meaningful, though, since warty specific patches haven’t necessarily been included in the updated Debian packages; and conversely some of the Debian changes may have been backported into the Warty packages.)

So, it seems roughly reasonable to say that testing does about 80% (625+544+613) of the work for you, and of the remaining 20% of the work, Debian only does around 4% (9+88) anyway. Of course, that’s only counting the easy bit: Debian includes over 14,000 packages these days.

Once we start looking at the 11556 packages in Warty’s universe, we find another 679 “ubuntu” updates, 5889 packages matching the testing version, a further 4193 that are older than the version in testing, an additional 280 packages that have been dropped from testing, and 515 packages pulled from unstable (247 of which match the current unstable version). In which case, testing appears to get you about 90% of the way there, while Debian could conceivably only get you an additional 4.5%.
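
If you want to check or extend these numbers yourself: figures like these just come from comparing version strings across the relevant Packages indices, so a rough python-apt sketch along the following lines will give you a similar breakdown for main or universe. The filenames are placeholders for wherever you’ve put the uncompressed Packages files, and treating any version string containing “ubuntu” as an Ubuntu-specific update is a simplification of the categories above.

    # Rough sketch only: classify warty packages against Debian testing.
    # Assumes python-apt is installed and the Packages files below have
    # already been fetched and uncompressed (the filenames are made up).
    import apt_pkg

    apt_pkg.init_system()

    def versions(path):
        """Map package name to version for a Packages index."""
        vers = {}
        with open(path) as fobj:
            for section in apt_pkg.TagFile(fobj):
                vers[section["Package"]] = section["Version"]
        return vers

    warty = versions("warty_Packages")
    testing = versions("testing_Packages")

    counts = {}
    for pkg, wver in warty.items():
        if "ubuntu" in wver:
            cls = "ubuntu-specific update"
        elif pkg not in testing:
            cls = "not in testing at all"
        else:
            cmp = apt_pkg.version_compare(wver, testing[pkg])
            if cmp == 0:
                cls = "matches testing"
            elif cmp < 0:
                cls = "older than testing"
            else:
                cls = "newer than testing (pulled from unstable?)"
        counts[cls] = counts.get(cls, 0) + 1

    for cls, n in sorted(counts.items(), key=lambda item: -item[1]):
        print("%6d  %s" % (n, cls))

Pointing the same loop at an unstable Packages file instead tells you how much of warty could have come straight from unstable.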

The other interesting question is how much better (or worse) testing does than unstable as a basis for constructing warty. For warty/main, there are 51 packages where warty matches testing, but not unstable, versus 5 for unstable, but not testing; for universe the numbers are 415 and 247 respectively. These are pretty much dwarfed by the number of packages in warty that are older than the corresponding packages in testing though, and I’m not sure how you could meaningfully factor that in. It’s also true that it’s harder to “downgrade” in Debian than it is to upgrade; so starting off your distro with a buggy “2.0” and going back to “1.5” is usually harder than starting off with a boring “1.0” and moving up to “1.5”. I’m not sure how you’d factor that in, either.

It’s interesting that testing gets 80% and 90% of the way even though testing has 11 architectures, and warty only has three. It’s possible Canonical aren’t taking full advantage of this yet; that is, pulling packages from unstable when they’d be acceptable for testing apart from architecture-specific problems. It’s also possible that the gains got mostly wiped out during the four-month warty freeze (June 28th to October 20th).

Now, you can blame testing for Debian’s problems all you like, but personally, I don’t buy it.

My opinion on what’s behind Debian’s release problems is as mentioned above: we’re hobbled by the difficulty of deciding what we want to release, and the task of getting people to actually make the improvements we choose to insist on.

Major examples of the former in the past year have been the non-free and amd64 issues, but beyond those there are dozens of less obvious questions that are raised and resolved every week.

Solutions to the latter come in three forms: find other people, provide better incentives for doing the work, or provide fewer disincentives for doing the work. There’s also the possibility of providing disincentives for not doing the work, but as above, I don’t think that tends to be particularly effective. You can compare the Ubuntu environment with Debian on these issues: there are few new people working on it compared to Debian (notably jdub), there are plenty of additional rewards for doing the work (money and kudos), and a reasonable chunk of disincentives have been actively removed (no non-Debian day job to worry about, flamewar-free lists, effective decision making, and a single boss to worry about pleasing, rather than a thousand developers ready to sign a GR proposal, or whine on their blogs).

In my opinion, Debian’s particularly rife with disincentives to contributing. As a trivial example, discussing the release process always results in people coming out of the woodwork to complain about testing. I can assure you that “That thing you spent a couple of years thinking about and building? Yeah, it fucking sucks.” isn’t high on my personal list of interesting conversations. Sure, testing’s far from perfect; but the biggest imperfection has always been the lack of ongoing security support, and just try getting past the resistance to that. And what’s the point of devoting yourself to solving minor issues, while the big ones just sit there, year after year?

Anyway, an equally good example for Scott’s point would be “woody plus backports” — its existence also stops folks from bothering to run the forthcoming release and resolve the bugs therein. Is it a huge problem for releasing that must be stopped too?

For reference, I don’t run unstable on any machines; I have a machine running testing, a machine running warty (dist-upgraded from testing), and a few machines running stable with a handful of miscellaneous backports. And it seems like good odds that my new iBook will end up running Mac OS X most of the time when it arrives, and that I’ll need to buy a copy of Microsoft’s VirtualPC if I want to run Linux on it, without giving up wireless connectivity. Yay.

UPDATE 2004/10/23:

Actually, maybe it is possible to get some analysis of whether testing or unstable would make a better base for warty. Of the packages in warty/universe, 513 are older than what’s in testing while there’s a newer version in unstable, and 415 match the version in testing, though there’s a newer version in unstable. Conversely, there are 262 packages newer than what’s in testing (though older than unstable), and 247 packages that are newer than what’s in testing and match what’s in unstable. So that’s 928 packages where warty and testing agree on avoiding unstable, and 509 packages where testing’s being more reticent than warty about getting newer software. Which is roughly a 9:5 split in favour of working from testing rather than just using unstable (or 65% to 35%).

As above, the real problem is that getting the 65% of packages reverted is a harder job than updating the 35% of packages. If you’re clever (eg, if you know exactly which packages make up the 65% from the very start), that’s a trivial problem. If you’re not (eg, you only realise your updated gcc has made your distro incompatible with everyone else, and you need to rebuild everything), it can become ludicrously hard. It’s hard to say where between those two extremes Ubuntu would lie.
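
For the 65/35 numbers above, the same sort of sketch extends to a three-way comparison; again this is purely an illustration, with placeholder filenames, the versions() helper repeated from the earlier sketch, and Ubuntu-specific updates skipped entirely.

    # Sketch: bucket warty/universe packages by how their versions relate to
    # both testing and unstable.  Assumes python-apt and locally fetched,
    # uncompressed Packages files (filenames are placeholders).
    import apt_pkg

    apt_pkg.init_system()

    def versions(path):
        """Map package name to version for a Packages index."""
        vers = {}
        with open(path) as fobj:
            for section in apt_pkg.TagFile(fobj):
                vers[section["Package"]] = section["Version"]
        return vers

    warty = versions("warty_universe_Packages")
    testing = versions("testing_Packages")
    unstable = versions("unstable_Packages")

    buckets = {}
    for pkg, wver in warty.items():
        if "ubuntu" in wver or pkg not in testing or pkg not in unstable:
            continue  # skip Ubuntu-specific updates and packages missing from Debian
        w_vs_t = apt_pkg.version_compare(wver, testing[pkg])
        t_vs_u = apt_pkg.version_compare(testing[pkg], unstable[pkg])
        w_vs_u = apt_pkg.version_compare(wver, unstable[pkg])
        if w_vs_t < 0 and t_vs_u < 0:
            key = "older than testing, newer version in unstable"
        elif w_vs_t == 0 and t_vs_u < 0:
            key = "matches testing, newer version in unstable"
        elif w_vs_t > 0 and w_vs_u < 0:
            key = "newer than testing, older than unstable"
        elif w_vs_t > 0 and w_vs_u == 0:
            key = "newer than testing, matches unstable"
        else:
            key = "other (matches both, newer than unstable, etc)"
        buckets[key] = buckets.get(key, 0) + 1

    for key, n in sorted(buckets.items(), key=lambda item: -item[1]):
        print("%6d  %s" % (n, key))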

(Note that the old Debian release process didn’t do reversions — we fixed the bugs, no matter how long it took. Reverting a package will tend to cause apt or dselect not to update it, which screws your users. Reverting via an epoch is irritating, and can break dependencies, especially when shlibs are involved ((>= 1.03) matches 1:1.02, which it shouldn’t if 1:1.02 was meant to be the same as 1.02). Having to fix bugs, rather than just switch to a non-buggy version of the package, makes things even harder, of course.)
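
To see the epoch problem concretely: apt and dpkg compare the epoch before anything else, so a version with an epoch sorts above any epoch-less version. A quick python-apt check (purely an illustration) shows (>= 1.03) being satisfied by 1:1.02:

    # Quick illustration of the epoch problem above, using apt's version ordering.
    import apt_pkg

    apt_pkg.init_system()

    # The epoch is compared first, so 1:1.02 sorts above a plain 1.03...
    print(apt_pkg.version_compare("1:1.02", "1.03") > 0)   # True

    # ...which means a (>= 1.03) dependency is satisfied by 1:1.02, even though
    # upstream 1.02 is older than upstream 1.03.
    print(apt_pkg.check_dep("1:1.02", ">=", "1.03"))        # True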

Unfortunately, all that could be just noise compared to the 3677 packages that are the same in testing and unstable, but for which there’s an older version in warty. That’s not counting ubuntu specific updates (assuming their versioning scheme is consistent, anyway), so should be purely attributable to the “freeze, hack, release” model they’ve adopted.

OTOH, that’s around a third of warty/universe that testing’s arguably done a “better” job of managing than a bunch of paid hackers. Or, at 544 of 2225 packages, around 24% of warty/main. (Versus 44% Ubuntu-specific, 28% handled equally well, 3% handled worse, and 1% rounded into non-existence.)

Though I think that’s still not really very enlightening. It might help to know what proportion of packages were pulled from unstable “early” during warty’s development, but which testing later included anyway. Analysing which of the ubuntu-specific changes have made it into testing, unstable and experimental would also be interesting.

Also, I should note that as much as I’d like it to be otherwise, I absolutely don’t recommend the use of testing for multiuser machines, or machines that offer any services over a network where you don’t have complete trust in everyone who has access to that network. You can do it, but you do so at your own risk, in the face of published security problems that not infrequently remain unfixed for extended periods. It’s possible to fix that situation, but the comments above about disincentives are particularly applicable here.
