MPU Meeting Tuesday 22nd May 2012

Simple KVM Service

Alastair has looked at the options for migrating VMs from one host to another. It turns out that hot migration is very easy to do, and it worked well when moving 10 VMs from northern to piccadilly. The paths to the storage have to match on the origin and target hosts, but this can be worked around fairly easily with symlinks. The set of VLANs available on the hosts must also match, and care has to be taken that the LVM volume sizes for the guest are the same. The whole process takes only a few minutes for a 5GB VM and the actual downtime was just a few seconds. The migration process needs documenting. We now need to decide where the KB VMs should normally live.
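The three preconditions above (matching storage paths, matching VLANs, equal LVM volume sizes) could be checked before a migration is attempted. The following is only a sketch: the HostView class, its field names and the example values are hypothetical, and the exact virsh options used for the real migrations are not recorded in these minutes. The command is built but not run.

```python
from dataclasses import dataclass, field


@dataclass
class HostView:
    """What one host reports for a guest (hypothetical structure)."""
    storage_path: str                          # path to the guest's storage on this host
    vlans: set = field(default_factory=set)    # VLANs available on this host
    lv_size_bytes: int = 0                     # size of the guest's LVM volume


def preflight(origin: HostView, target: HostView) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if origin.storage_path != target.storage_path:
        problems.append("storage paths differ (a symlink on the target may help)")
    if origin.vlans != target.vlans:
        problems.append("available VLANs differ")
    if origin.lv_size_bytes != target.lv_size_bytes:
        problems.append("LVM volume sizes differ")
    return problems


def migrate_command(guest: str, target_host: str) -> list:
    """Build (but do not run) a live-migration command for virsh."""
    return ["virsh", "migrate", "--live", guest,
            "qemu+ssh://{}/system".format(target_host)]
```

A wrapper script would call preflight first and only run the command returned by migrate_command when the list of problems is empty.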

Cold migration is not so easy as there are no helpful utilities to simplify the process.

At some point we should look at CPU pinning. We wondered if that might help the LCFG slaves go faster; Alastair will investigate. We also need a documented recipe for increasing the memory size of a VM using either virsh or virt-manager.
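As a starting point for the recipe, the sketch below builds the virsh commands that would probably be involved: setmaxmem and setmem for memory, and vcpupin for pinning. The function names and the guest name are hypothetical. It assumes sizes are given to virsh in KiB and that the --config option (change the persistent definition, effective at the next guest start) is acceptable; both need checking against the installed libvirt version, as does whether pinning can be changed without a guest reboot.

```python
def set_memory_commands(guest: str, mib: int) -> list:
    """Commands to raise the maximum and current memory of a guest.

    The maximum is raised first so the current allocation fits within it.
    """
    kib = mib * 1024
    return [
        ["virsh", "setmaxmem", guest, str(kib), "--config"],
        ["virsh", "setmem", guest, str(kib), "--config"],
    ]


def pin_vcpu_command(guest: str, vcpu: int, host_cpu: int) -> list:
    """Command to pin one virtual CPU of a guest to one physical CPU."""
    return ["virsh", "vcpupin", guest, str(vcpu), str(host_cpu)]


if __name__ == "__main__":
    # Print the commands rather than running them.
    for cmd in set_memory_commands("guest1", 2048):
        print(" ".join(cmd))
    print(" ".join(pin_vcpu_command("guest1", 0, 3)))
```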

We discussed the idea of having each server provide a web page which summarised disk space usage/availability. We also need to consider the longer-term capacity planning.
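One possible shape for such a summary page is sketched below. It reports filesystem usage only, using the Python standard library; free space inside LVM volume groups would need a separate query. The function names and the choice of mount points are illustrative assumptions, not an agreed design.

```python
import html
import shutil

GIB = 1024 ** 3


def usage_row(mount: str) -> str:
    """One HTML table row with total, free and percentage used for a mount point."""
    usage = shutil.disk_usage(mount)
    pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return "<tr><td>{}</td><td>{:.1f}</td><td>{:.1f}</td><td>{:.0f}%</td></tr>".format(
        html.escape(mount), usage.total / GIB, usage.free / GIB, pct)


def usage_page(mounts: list) -> str:
    """A minimal HTML page summarising the given mount points."""
    header = "<tr><th>Mount</th><th>Total (GiB)</th><th>Free (GiB)</th><th>Used</th></tr>"
    rows = "\n".join(usage_row(m) for m in mounts)
    return "<html><body><table>\n{}\n{}\n</table></body></html>".format(header, rows)


if __name__ == "__main__":
    print(usage_page(["/"]))
```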

Server Upgrades

The DR server sauce has been upgraded to SL6. We agreed that it would be good to redo the various DR testing to ensure it all still works correctly. We should at least test the following:

  • test slave using sauce as a master server
  • test client using sauce as a slave server
  • test client using sauce as a package server

Stephen is planning to upgrade the AFS packages server telford fairly soon, as this will give us some experience of running a 1.6.1-based AFS fileserver on SL6. We should try to get the storage moved off the IBM array at the same time; Stephen will talk to Craig.

Server Hardware

Not much has happened on this project due to holidays and work on upgrading sauce. A Dell engineer responded to one of Chris' blog posts regarding problems we've experienced with the various management software.

It appears that dmidecode will give us the BIOS versions for most things we're interested in managing. The full output is not easily parseable but it is possible to query individual fields.
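Querying individual fields can be done with the -s option of dmidecode, which prints a single named string such as bios-version. The sketch below wraps that call; the dmi_field name and the injectable runner argument are illustrative, and dmidecode normally needs to be run as root.

```python
import subprocess


def dmi_field(keyword: str, runner=subprocess.run) -> str:
    """Return one DMI string, e.g. dmi_field("bios-version").

    The runner argument exists only so the call can be replaced in tests.
    """
    result = runner(["dmidecode", "-s", keyword],
                    capture_output=True, text=True, check=True)
    # Some dmidecode versions emit comment lines starting with '#'; skip them.
    lines = [line.strip() for line in result.stdout.splitlines()
             if line.strip() and not line.startswith("#")]
    return "\n".join(lines)


if __name__ == "__main__":
    for key in ("bios-version", "bios-release-date", "system-product-name"):
        print(key, "=", dmi_field(key))
```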

Security Enhancements

Stephen has been working on the audit component. He has now fully documented all the resources. A "scan for setuid files" option has also been added, which searches the filesystem for any executable files that are setuid. It takes care never to forget anything it has previously found. Whilst attempting to enable auditd via the standard rc script, problems with the boot component were encountered.
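The idea behind the setuid scan can be illustrated as follows. This is not the audit component's code, which is not shown in these minutes; it is a small sketch of the technique with hypothetical function names. Previously found paths are merged with new results and never dropped.

```python
import os
import stat

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def find_setuid(root: str) -> set:
    """Return the paths of regular, executable, setuid files under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode   # lstat: do not follow symlinks
            except OSError:
                continue                        # file vanished or is unreadable
            if stat.S_ISREG(mode) and mode & stat.S_ISUID and mode & EXEC_BITS:
                found.add(path)
    return found


def update_known(known: set, found: set) -> set:
    """Merge a new scan into the known set; nothing already known is removed."""
    return known | found
```

Keeping the union means that a file which later loses its setuid bit, or disappears, still remains in the record for review.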

Miscellaneous Development

boot component problems
Whilst working on the auditd component Stephen had problems with enabling the daemon via the standard rc script. It appears that the boot component does not close the file descriptors used by ngeneric before running scripts. This means that control never returns to the calling boot component (and thus the client component) which leads to a stuck LCFG client. There is clearly a need to make the entire chain more robust with timeouts to ensure each stage never blocks forever. The first step is to fix the invocation of rc scripts from the boot component.
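The general remedy (do not let the script inherit the caller's file descriptors, and put an upper bound on how long the caller waits) can be sketched as below. This is an illustration of the technique in Python, not the boot component's actual code; the run_rc_script name and the 60 second default are assumptions.

```python
import subprocess


def run_rc_script(argv: list, timeout_seconds: float = 60):
    """Run a script detached from our descriptors, with a time limit.

    Returns the exit status, or None if the script did not finish in time.
    """
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,    # no inherited stdin
            stdout=subprocess.DEVNULL,   # a daemon cannot hold our pipes open
            stderr=subprocess.DEVNULL,
            close_fds=True,              # close every other inherited descriptor
            timeout=timeout_seconds,     # never block forever
        )
    except subprocess.TimeoutExpired:
        return None
    return result.returncode
```

With this shape a daemon started by the script has nothing of the caller's left open, so control returns as soon as the script itself exits.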

LCFG server and SQLite
Kenny and Stephen have done some work on switching the dependency tracking in the LCFG server over to using SQLite. For certain workloads this has the potential to be much faster, but so far it has not been any quicker for full rebuilds. We would need to look properly at the tuning of SQLite. The new code is much easier to read and maintain, provides a greater level of data integrity, and is much faster for operations such as dumpdeps. We are not planning at this stage to roll out a stable release with these changes. We would want to make the changes in a better way which allows the same code to be used for the other databases.
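To give a feel for what dependency tracking in SQLite might look like, here is a small sketch. The table layout, the column names and the function names are all hypothetical and are not taken from the LCFG server code. Each update runs inside a single transaction; batching writes in this way is one of the usual first things to check when tuning SQLite.

```python
import sqlite3


def open_deps(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and create if needed) a dependency database."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS deps (
                      profile TEXT NOT NULL,
                      source  TEXT NOT NULL,
                      PRIMARY KEY (profile, source))""")
    # Index for the reverse lookup: which profiles depend on a source file?
    db.execute("CREATE INDEX IF NOT EXISTS deps_by_source ON deps (source)")
    return db


def record(db: sqlite3.Connection, profile: str, sources: list) -> None:
    """Replace the recorded dependencies of one profile in one transaction."""
    with db:  # commits on success, rolls back on error
        db.execute("DELETE FROM deps WHERE profile = ?", (profile,))
        db.executemany("INSERT INTO deps (profile, source) VALUES (?, ?)",
                       [(profile, s) for s in set(sources)])


def affected_profiles(db: sqlite3.Connection, source: str) -> list:
    """Profiles that need rebuilding when the given source file changes."""
    rows = db.execute(
        "SELECT profile FROM deps WHERE source = ? ORDER BY profile", (source,))
    return [row[0] for row in rows]
```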

pkgforge bash completion
Chris has added bash completion for the pkgforge command.

Operational

  • northern FC : The FC has been set up for northern, which is the second VM server in KB. It has been done in the same way as piccadilly.

  • wordpress and cookies : Alastair has checked and wordpress only sets cookies for anonymous commenters. For now the code has been hacked to display a warning, but that change needs packaging properly. We wondered if we could just block anonymous comments.

  • mpu bugs : We agreed that the best approach will be for Stephen to get the list of all MPU bugs and then split them into groups so that each of us can review a smaller manageable chunk.

  • SL6.2 reboots : The SSH server hogwood needs a reboot to complete the SL6.2 update; all others are now done.

  • inventory improvements : It would be nice to have the profile.group resource value mapped into the inventory component resources. We would also like a new sysinfo.role resource so that we can stop abusing the profile.comment resource.

This Week

  • Alastair
    • Produce T1 report
    • Document hot-migration and migrate back northern guests
    • Document cold-migration
    • Investigate CPU pinning - can this be done on the fly without guest reboots?
    • Add instructions on how to increase memory (from cli)
    • Check and deploy Stephen's boot component fixes
    • Reboot all spare circle VMs

  • Chris
    • Provide figures for T1
    • Test sauce functionality (LCFG master, RPM slave etc)
    • Time LCFG slave with CPU pinning to dedicated core (with Alastair)
    • Server hardware project
    • Make cookies message on bugs.lcfg.org more visible

  • Stephen
    • Provide figures for T1
    • Upgrade telford (incl perhaps migrating SAN data)
    • Continue with security project
    • Produce report of LCFG bugs for MPU
    • Reboot hogwood for SL6.2 kernel
    • Process other units' responses about their perl-AFS module usage (which functions etc)
    • Speak to Graham about Theon work
    • Cookie warning on wiki.lcfg.org

-- AlastairScobie - 22 May 2012

Topic revision: r4 - 29 May 2012 - 08:48:11 - StephenQuinney
 