Using bcache for performance gains on the Launchpad Database Servers
by Tom Haddon on 10 December 2015
Launchpad is the code hosting, bug tracking and build system for the Ubuntu distribution itself, and is used by many other software projects, including OpenStack, Inkscape & MySQL.
There are two main data stores in Launchpad. The first is the Librarian, which is a 22TB object store using OpenStack’s Swift as its backend. The second is an 800GB+ PostgreSQL database, and the performance of this is key to the overall responsiveness of Launchpad itself.
As the data set has grown, we have typically maintained this database's performance by periodically refreshing the hardware, taking advantage of the extra RAM and CPU that a few years of Moore’s Law bring.
However, when we came to refresh the hardware last time around, we realised we needed a slightly different approach. After looking at a number of options, we decided to use bcache. Bcache is a Linux kernel block layer cache that allows one or more SSDs to act as a cache for slower hard disk drives.
Our new PostgreSQL database server is composed of 6 disks plus 1 SSD, in comparison to the previous generation of server which had 4 disks in a RAID 10 array for the root partition, plus 25 more disks in a RAID 10 array for /srv where all the PostgreSQL data was stored.
Overview of specific implementation
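Our workload here is PostgreSQL, which stores its data mostly as 1GB files on disk and issues mostly random-access reads. As a result, we’ve tuned the server in question by setting /sys/block/bcache0/bcache/sequential_cutoff, together with congested_read_threshold_us and congested_write_threshold_us, to 0 for the bcache device. A sequential_cutoff of 0 stops bcache from bypassing the SSD for large sequential I/O, and setting the two congestion thresholds to 0 stops it from skipping the cache when it judges the SSD to be busy.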
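As a rough illustration of the tuning described above, here is a minimal Python sketch that writes 0 to the three tunables. The function name tune_bcache and the sysfs_root parameter are hypothetical and exist only for this example. It also assumes the usual mainline kernel layout, in which the two congestion thresholds live in the cache set directory under /sys/fs/bcache/ rather than alongside sequential_cutoff. It is not the tooling we used for the deployment.

```python
from pathlib import Path

# Tunables named in the post; writing 0 disables each heuristic.
BACKING_TUNABLES = ("sequential_cutoff",)
CACHE_SET_TUNABLES = (
    "congested_read_threshold_us",
    "congested_write_threshold_us",
)


def tune_bcache(device="bcache0", sysfs_root="/sys"):
    """Write 0 to the bcache tunables and return the paths that were written.

    `sysfs_root` is a hypothetical parameter so the function can be
    exercised against a scratch directory instead of the real /sys.
    """
    root = Path(sysfs_root)
    written = []

    # Per-device tunable: <sysfs_root>/block/<device>/bcache/sequential_cutoff
    backing_dir = root / "block" / device / "bcache"
    for name in BACKING_TUNABLES:
        path = backing_dir / name
        path.write_text("0\n")
        written.append(path)

    # Assumption: the congestion thresholds are exposed per cache set under
    # <sysfs_root>/fs/bcache/<cache-set-uuid>/, as on mainline kernels.
    for cache_set in sorted((root / "fs" / "bcache").glob("*")):
        if not cache_set.is_dir():
            continue
        for name in CACHE_SET_TUNABLES:
            path = cache_set / name
            if path.exists():
                path.write_text("0\n")
                written.append(path)

    return written
```

On a real server this would need to run as root, for example tune_bcache("bcache0"). The values do not persist across reboots, so they would have to be reapplied at boot by whatever configuration management is in use.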
After some unpromising early experiences with Linux 3.13, the version shipped with Ubuntu 14.04 LTS, research revealed that Linux 3.16 was the minimum viable version for this use case, and so we switched to the Canonical Kernel Team’s HWE kernel.
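A deployment script could guard against running on an older kernel with a check along these lines. This is only a sketch: the helper name kernel_at_least is hypothetical, and it compares just the major and minor numbers of the release string against the 3.16 minimum mentioned above.

```python
import platform
import re

# Minimum kernel version the post found viable for this use case.
MIN_KERNEL = (3, 16)


def kernel_at_least(release, minimum=MIN_KERNEL):
    """Return True if a release string such as '3.16.0-30-generic'
    is at or above `minimum`, comparing (major, minor) only."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    version = (int(match.group(1)), int(match.group(2)))
    return version >= minimum


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string.
    print(kernel_at_least(platform.release()))
```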
This has been deployed using a combination of MAAS, Juju, OpenStack and Mojo. The Juju state server was provisioned into our production OpenStack instance to match our other production environments and to give us the flexibility to add other services to this environment using Juju. Given the tight performance constraints of the database servers, we wanted to run PostgreSQL on bare metal, so we used MAAS to install the machines, and then used manual provisioning to add them to the Juju environment.
The deployment itself was done using Mojo, which gives us a predictable and repeatable deployment methodology and also allows us to run daily tests in our CI environment to confirm that if for any reason we needed to redeploy the service from scratch we’d be able to do so.
Some performance numbers
In terms of real-world performance, server errors (primarily timeouts) on the Launchpad application servers have dropped by a factor of just over three. That is a significant improvement, even allowing for the faster CPUs and extra RAM, especially given the significantly smaller number of disks used.
This isn’t the only place we’ve used bcache at Canonical. On archive.ubuntu.com we’ve been able to sustain 9Gb/s+ and 20k requests per second from a single machine with a 10Gb/s NIC, and for our OpenStack deployments, adding bcache to our nova-compute hosts has greatly increased the number of instances they can run while maintaining performance.
Future Deployments
As of version 1.9, MAAS will have the ability to configure any storage layout during node deployment, including bcache, RAID and LVM. This means future deployments of bcache will be as simple as setting some configuration options in MAAS.
Want to know more about deploying bare metal infrastructures using MAAS? Head over to Maas.io to get started with the latest version of MAAS.