On the segregation of KVM guests

It has been suggested that it may be possible in some circumstances for a malicious attacker to break out of a virtual machine into its containing server - from where it could interfere with other virtual machines. This has prompted us to start exploring the idea of segregating virtual machines in some way.

Why segregate?

There's a worry that malware in a KVM guest could affect the guest's host server, and potentially break out into it. This has definitely been an issue in the past (so it's not a ludicrous idea): a quick web search for "break out of KVM guest" turned up an exploit for CVE-2011-1751, which describes how to (at least) crash or (less easily) gain access to a KVM guest's host server. The exploit requires a privileged process on the guest; however, the same Red Hat update also fixes CVE-2011-1750, which presents a way of gaining privilege on a KVM guest.

These problems were patched by Red Hat in a later qemu-kvm package. Although our servers may at first sight appear to be running an earlier version of that package, it is in fact extensively patched and does contain the fixes for these issues. However, since security has been broken at least once, and given the money or kudos to be made by cracking the right virtualisation systems, it's a fair bet that there are ongoing efforts to find more ways to break out of Linux KVM guests.
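Because Red Hat backports security fixes, the package version string alone doesn't tell you whether a CVE is fixed; the package changelog does. The sketch below shows the idea - the helper function and the sample changelog text are ours, invented for illustration, not taken from the real qemu-kvm changelog:

```shell
# On a real Red Hat-style host you would query the installed package:
#   rpm -q --changelog qemu-kvm | grep CVE-2011-1751
# Backported fixes appear here even when the version number looks old.

# has_cve_fix CHANGELOG_FILE CVE_ID -> exit 0 if the fix is mentioned
has_cve_fix() {
    grep -q "$2" "$1"
}

# Sample changelog excerpt (illustrative text only):
cat > /tmp/qemu-kvm.changelog <<'EOF'
- fix for CVE-2011-1750 (virtio-blk overflow)
- fix for CVE-2011-1751 (pci hotplug use-after-free)
EOF

if has_cve_fix /tmp/qemu-kvm.changelog CVE-2011-1751; then
    echo "fix present"
else
    echo "fix missing"
fi
```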

What to segregate?

Should we house the most critical services separately from the rest? Or should we segregate those felt to be more vulnerable?


Which guests might be described as "critical"? Various suggestions have been made:
  • VMs which hold master data.
  • VMs which hold secret data.
  • VMs which provide an important part of our infrastructure.
  • AFS DB servers.
  • LCFG slaves.
  • Anything to do with exams.
  • Prometheus.
  • Our mail servers.
  • web.inf.ed.ac.uk
  • The main "rfe" server.
  • admin.smb.inf.ed.ac.uk
This would encompass guests using perhaps half a TB of storage in the Forum, and a few hundred GB at both the other sites. (Some of these critical services are spread across several sites.)

Less secure

If we consider it from the other direction, which VMs are less secure or more vulnerable? Here are some suggestions:

  • Those which allow user logins
  • Those managed by users
  • Web servers which allow uploads (rather than just presenting info)
  • Web servers using software with a less-than-stellar security reputation, such as Drupal, WordPress or PHP

This approach turns up quite a few virtual machines - perhaps a third of the total. At the time of writing it would include VMs totalling 218GB of storage at KB, 135GB at AT, and about 1TB at the Forum.

Critical and less secure

Awkwardly, some services appear to fall into both categories - mail and web, for instance, both allow user data uploads and both have firewall holes. Housing them with other critical services might put those other services at greater risk.

Segregating critical guests may make more sense

There is no clear blue water between guests which are vulnerable and those which are not. Rather, there seems to be a continuum of risk featuring (for instance) PHP, Drupal, WordPress, services with user logins, services involving user uploads (CVS, Subversion, git, Bugzilla, etc.), and KVM guests where some users control configuration or can gain elevated privilege. A dividing line between the more vulnerable guests and the rest would always be fuzzy, moveable and subject to debate. In addition, any failure to categorise a guest's vulnerability correctly could result in its being needlessly exposed to risk (on the "vulnerable" side of the line) or posing a needless risk to other guests (on the "protected" side of the line).

Faced with this, it would be far easier to simply deem some guests to be "critical", and separate them from the herd. When we separate out "critical" guests, we at least know that such guests will be housed separately from the less secure guests, so should be a little safer - provided that none of them is thought to be more vulnerable to attack thanks to any of the above factors!

Hardware and maintenance

Would it be safe to concentrate our most critical services on fewer machines - perhaps on one machine? If that machine had a sudden hardware problem and went down without warning, we could have quite a hole to dig ourselves out of. The service is run on young, well specified, well maintained machines, so the complete failure of a machine might be unlikely - but were it to occur, it could potentially take down all of our most critical (virtualised) services at the same time.

A more likely scenario would be the need to do planned maintenance involving reboots of the KVM server. Normally when this happens, some guests are migrated to other servers and the rest of the guests are suspended or shut down for the duration of the maintenance. In practice guests are often migrated several days before the maintenance, and migrated back shortly afterwards.

Guest Migration

Migration is a way of shifting a KVM guest from one host server to another without interruption. A few years ago it was a trouble-prone procedure, but these days it can be done easily and reliably - although it still involves several manual steps, so can't be guaranteed to be trouble-free. Guests are usually migrated for one of two reasons, both prompted by their host server's need for downtime: they may be performing a critical function which should not be interrupted or whose absence would cause widespread inconvenience, or it may simply be less work to migrate a guest than to announce a break in service.
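The core of the procedure is a single libvirt command; the wrapper below is a minimal sketch of it, with hypothetical guest and host names, and a dry-run guard since the real process involves more manual checks than shown here:

```shell
# Sketch of a live-migration wrapper. Guest and host names are invented;
# with DRY_RUN=1 (the default here) it only prints the virsh command.
DRY_RUN=${DRY_RUN:-1}

migrate_guest() {
    guest=$1
    dest=$2
    # --live keeps the guest running while its state is copied across;
    # --persistent registers it on the destination so it survives a
    # reboot of that server.
    cmd="virsh migrate --live --persistent $guest qemu+ssh://$dest/system"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

migrate_guest webserver1 kvmhost2.inf.ed.ac.uk
```

On a real host you would also disable autostart of the guest on the old server afterwards, and update any local records of where the guest lives.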

Hitherto, without any guest segregation policy, guests could be migrated to any other KVM server at the same site. (Our network policies preclude cross-site migration.) The introduction of segregation would make this more awkward. Here are some ways in which that migration might be handled:

  • No migration - all segregated guests would be suspended or shut down when their host server needed maintenance. If we were segregating the more critical guests from the rest, a lack of migration, and the consequent need to shut down critical guests, would make the segregated server far less attractive to potential users. Critical guests might well be put on the "unsegregated" servers instead. The need to announce the shutdowns would also make the prospect of organising downtime for the "critical server" distinctly unattractive - assembling all the details for a multi-guest shutdown announcement is fiddly and time-consuming - so the server itself would tend to get less maintenance, less patching of firmware, fewer reboots to bring important OS components up to date.
  • A machine could be set aside to be used as a decant server during maintenance events. If this was used alternately for non-segregated guests then at a later date for segregated guests, it would need to be wiped and reinstalled between times, to cleanse it of any breakout malware from a non-segregated guest.
  • A server could be emptied of its normal guests so that it could house guests migrated from the segregated server. This would be doable but might substantially increase the amount of work involved in maintaining the host server of the segregated guests. It would also increase the complication of the migrations, making a (guest-killing, catastrophic) mistake more likely. When migrating guests, simplicity is key. This solution would also mean the decant server being wiped and reinstalled before accepting segregated guests.
  • During maintenance events segregation would be temporarily abandoned, with segregated guests mixed in with the rest as at present. This option would lower security to current levels for a few days. It would also open the possibility of malware being introduced to a sometimes-segregated guest while it shares a server with other guests, then carried back with that guest when it is migrated to the segregated server. Security-wise, this option does not make sense.
  • Two machines at the same site could house critical guests. They would then act as each other's decant server, to which guests could be temporarily moved to enable server maintenance to take place. (This is how the two KVM servers at KB are used.) This would seem to be the most hassle-free and practical arrangement.

In other words: cheap, secure, reliable: pick any two.

KVM server provision at each site

See also http://ordershost.inf.ed.ac.uk/cgi/kvmreport for the current statistics.

Informatics Forum

At the Forum there are four supported servers. Three have 1TB of guest space and one has 1.5TB. Once the imminent hardware refresh is complete this will increase to two servers with 1TB and two with 2TB.

Storage-wise, we could assign two servers in the Forum to "critical" duty, and two to the rest, provided the latter two were the forthcoming machines with 2TB of storage each. However the number of guests served by the two "critical" servers would be way below their capacity (in terms of storage space, cores and memory). The remaining two servers would also be a cause for concern, since they would be hosting the vast majority of the Forum's KVM guests, and might not have the capacity to act as each other's fall-back server in case of problems or maintenance. In this scenario we might well find ourselves needing a fifth KVM server for the Forum.

James Clerk Maxwell Building

We have two KVM servers in JCMB. Each has 2 x 558GB partitions for guest storage. Each server acts as emergency backup for the other. This makes maintenance easy - smooth, non-intrusive, relatively quick and painless. It's difficult to see how one of them could be turned over to "critical" guests, with the other serving the remainder, without making maintenance substantially more intrusive, time-consuming and awkward.

Appleton Tower

At AT there is one supported KVM server. It has a terabyte of guest space. There's also a test server which in practice provides emergency backup from time to time. Maintenance would not be easy, and migration would not even be possible, without that old test server. Arguably we should have a second supported KVM server in Appleton Tower even without any segregation - but we don't, because there doesn't seem to be enough demand to justify it.


Conclusions

As things stand, segregation of guests does not seem to be a practical idea. However it could become more doable if some of the current limitations changed:
  • Networking: if server subnets could be shared across sites, guests could be migrated to other sites. This would make it possible to have, for instance, two "critical" servers (one in IF, one in JCMB) which could act as each other's stand-in when needed.
  • If ECC memory could be had in smaller, cheaper machines - or was deemed not to be necessary? - we could provide enough smaller, cheaper machines to accommodate a limited number of segregated guests. Safe storage (e.g. RAID) would still be a requirement.
  • If guest storage was provided by a suitable filesystem shared between servers (perhaps GlusterFS, GPFS, Ceph?) we wouldn't necessarily even have to worry about RAID disks, because the redundancy would be built into the filesystem. We could conceivably even run guests on a small cluster of desktops.
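As a rough illustration of that last point, libvirt can define storage pools backed by a GlusterFS volume. The sketch below writes such a pool definition; the volume name and Gluster host are invented for illustration, and the virsh registration steps are shown only as comments:

```shell
# Sketch: define a libvirt storage pool backed by a GlusterFS volume, so
# guest images live on replicated shared storage rather than local RAID.
# The host and volume names below are hypothetical.
cat > /tmp/gluster-pool.xml <<'EOF'
<pool type='gluster'>
  <name>kvm-guests</name>
  <source>
    <host name='gluster1.inf.ed.ac.uk'/>
    <name>kvmvol</name>
    <dir path='/'/>
  </source>
</pool>
EOF

# On a real KVM server you would then register and start the pool:
#   virsh pool-define /tmp/gluster-pool.xml
#   virsh pool-start kvm-guests
echo "wrote pool definition to /tmp/gluster-pool.xml"
```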
Topic revision: r5 - 24 May 2016 - 10:03:57 - ChrisCooke