QA changes for Ubuntu 12.04

Half a year ago I blogged about the changed expectancies and processes to improve quality of the development release which we discussed at the UDS in Orlando: A promise that we don’t break the development version, regressions are not to be tolerated, acceptance criteria for Canonical upstreams. For that we introduced the Stable+1 team, actually did some reversions of broken packages, our QA team set up rigorous daily installation image and upgrade tests, and the code development process for Unity and related project was changed to enforce buildability and passing automatic tests with each and every change to trunk.

To be honest I was still a tad sceptic back then when this was planned. These were a lot of changes for one cycle, the stable+1 team was a considerable resource investment (starting with three people fulltime in the first few months), and not to the least our friends in the DX team felt thwarted because they had to sit down for a long time developing tests, and then changing their habits and practices for development.

So was all that effort worth it?

One word: OMGCRYOUTLOUDYES!!!!

Just a random sample of goodness that this brought:

  • It was nice to not have to sit down for an hour every cople of days to figure out how to get back my desktop after the daily dist-upgrade bricked it.
  • Unity, compiz, and friends were remarkably stable. I still remember the previous cycles where every new version got differently crashy, broke virtual workspaces, and what not. The worst thing that happened this cycle is eternally breaking keybindings (or changing them around), but at least those usually had obvious workarounds.
  • As a result of those, I think we had at least one, maybe two magnitudes more testers of the daily development release than in previous cycles. So we got a lot of good bug reports and also patch contributions for smaller issues in Precise which we otherwise would not have discovered.
  • The daily dist-upgrade tests tremendously helped to uncover packaging problems which would break real-world upgrades out there by the dozens. It took months to fix the hardest one: upgrading 10.04 LTS to 12.04 LTS with all universe packages offered in software-center. This beast takes 13 hours to run, so nobody really did manual tests like that in the past cycles.
  • Due to the daily automatic CD image builds we dramatically reduced both the cost of fixing regressions as well as the emergency hackathons during milestone preparations. It is a lot easier to unbreak e. g. LVM setup or OEM install modes on our images when the regression happened just a day before than discovering it two days before a milestone is due, as again nobody tests these less common modes very often. So as a result, I really think the investments into QA and the stable+1 teams already paid off twofold by giving us more time to work on the less critical fixes, avoiding lots of user frustration about broken upgrades, and generally making the daily development a lot more enjoyable. Or, as Rick Spencer puts it: Velocity, velocity, velocity!

Despite these improvements, there are still some improvements I’m looking forward to in the next cycles: Thanks to Colin Watson we can now use -proposed as a proper staging area, and used this feature rather extensively in the past month. From my point of view, 90% of the remaining daily dist-upgrade failures were due to packages building on different architectures at vastly different times, or failing on some, but not all architectures (“arch skew”). This is something you cannot really predict or guard against as a developer when you upload large and potentially harmful packages directly to the development release, so uploading them to the staging area and letting everything build there will reduce the breakage to zero. This was successfully demonstrated with Unity, GTK, and other packages where arch skew pretty much always causes people to hose their desktop, as well as daily CD images not working.

I’m also looking forward to combining the staging area with lots of automatic tests against reverse dependencies (e. g. testing the installer against a new GTK or pygobject before it lands), something we just barely tipped our toes in.

I can’t imagine how we were ever able to develop our new releases the old way. 🙂

Precise Pangolin^W^WUbuntu 12.04, I’m proud of you! Go out and amaze people!