MPU Meeting Tuesday 12th June 2012

Simple KVM

The new KVM server - jubilee - is now in service. The SAN volume has been created on the IBM array but it is not yet available as a storage pool. The plan is to first take the IBM array out of service so the firmware can be upgraded. Now is a good time to do this as currently only metropolitan is using this array. Regarding performance, the local storage pool on jubilee seems to be fine for now. The server is ready to go into use as a stable server, but before we announce general availability Chris will check the configuration through to ensure it is all correct.

The other new machine, which was intended to be used as a KVM server, is damaged; we are still waiting for Dell to organise a replacement.

The kvmtool script now has a setmemory command for changing the size of RAM for a VM. This will only work when the guest is stopped. We could look at memory ballooning at some point if this becomes necessary.
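As an illustration of the "guest must be stopped" restriction, here is a minimal Python sketch. It is not the kvmtool implementation, which is not shown in these minutes. It assumes the guests are managed through libvirt and that virsh is available on the host; the function names and the guest name in the usage note are hypothetical.

```python
import subprocess


def guest_state(domain, run=subprocess.check_output):
    """Return the state string that `virsh domstate` reports for a guest."""
    out = run(["virsh", "domstate", domain])
    if isinstance(out, bytes):
        out = out.decode()
    return out.strip()


def set_memory(domain, size_kib, run=subprocess.check_output):
    """Change the RAM size in the persistent config of a stopped guest.

    Assumption: sizes are passed in KiB, which is virsh's default unit.
    """
    state = guest_state(domain, run)
    if state != "shut off":
        raise RuntimeError(
            "%s is %r; stop the guest before changing its memory" % (domain, state)
        )
    size = str(int(size_kib))
    # Set the maximum first so the current allocation never exceeds it.
    run(["virsh", "setmaxmem", domain, size, "--config"])
    run(["virsh", "setmem", domain, size, "--config"])
```

For example, set_memory("guest1", 4 * 1024 * 1024) would request 4 GiB for a hypothetical guest called guest1, and would raise an error if that guest were still running.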

The CPU pinning functionality has been investigated. It can be done on the fly; we need to document the process.
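Until the process is documented, the following Python sketch shows one way on-the-fly pinning could look. It assumes the guests are managed through libvirt, so that `virsh vcpupin` with `--live` applies; the minutes do not say which mechanism was actually tested, and the function names and guest name are hypothetical.

```python
import subprocess


def vcpupin_command(domain, vcpu, cpus):
    """Build a virsh command that pins one virtual CPU of a running guest."""
    if vcpu < 0:
        raise ValueError("vcpu index must be non-negative")
    if not cpus:
        raise ValueError("at least one host CPU is required")
    # virsh accepts a comma-separated list of host CPU numbers.
    cpulist = ",".join(str(c) for c in sorted(set(cpus)))
    return ["virsh", "vcpupin", domain, str(vcpu), cpulist, "--live"]


def pin_vcpu(domain, vcpu, cpus, run=subprocess.check_call):
    """Apply the pinning; `run` can be swapped out for a dry run."""
    run(vcpupin_command(domain, vcpu, cpus))
```

For example, pin_vcpu("guest1", 0, [2, 3]) would restrict virtual CPU 0 of a hypothetical guest called guest1 to host CPUs 2 and 3 without restarting it.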

The plan is to update the final report and put it forward for sign-off at the development meeting in July.

SL6 Upgrades

There is no longer any nexsan equipment attached to the Forum fibre channel switches. This means that we can now upgrade the packages AFS server, telford, to SL6. Stephen plans to do the upgrade this week.

Server Hardware

Chris has been working on scripting the management of the inventory of installed firmware versions. He has also been looking at OMSA version 7, which promises to provide better support for doing this sort of thing. Right now it doesn't install cleanly but Dell are planning to fix this soon.

Security Enhancements

Stephen has been working on generating reports for events associated with changes to files in the monitored parts of the filesystem. He made one last attempt to write a Perl XS function so that the audit library functions can be called directly. The final conclusion was that, although this would provide performance improvements, the API is just too complex and awkward for our needs. He has found a simpler route which involves using ausearch and then formatting the output into something parseable using aureport. He now has a Perl script which generates a list of changes to files which are not owned by RPMs. It needs a bit of refinement but should hopefully be done soon. He will then take a look at generating a few other reports, in particular ones covering the use of setuid executable scripts and attempts to insert kernel modules. These scripts are intended to spot the interesting/unusual events amongst the huge lists of events which occur each day. We might need to tweak the auditd rules to reduce the noise level in the future. To get full details of all changes to the filesystem we will probably use something like aide to generate daily summaries which can be consulted when something odd happens.
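The filtering step described above can be sketched as follows. This is a Python illustration, not Stephen's Perl script. It assumes the audit events have already been turned into a file report with something like `ausearch --raw ... | aureport -f`, that each data row starts with a row number followed by date, time and file path, and that paths contain no spaces. The function names are hypothetical.

```python
import re
import subprocess

# Data rows in the assumed report layout start with "<number>. ".
ROW = re.compile(r"^\d+\.\s")


def changed_paths(report_text):
    """Extract the unique file paths from report rows, keeping first-seen order."""
    paths = []
    for line in report_text.splitlines():
        if not ROW.match(line):
            continue  # skip titles, separators and the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        path = fields[3]  # assumed columns: number, date, time, file, ...
        if path not in paths:
            paths.append(path)
    return paths


def owned_by_rpm(path):
    """True when `rpm -qf` finds an owning package (exit status 0)."""
    status = subprocess.call(
        ["rpm", "-qf", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return status == 0


def unowned_changes(report_text, is_owned=owned_by_rpm):
    """List the changed files that no RPM claims."""
    return [p for p in changed_paths(report_text) if not is_owned(p)]
```

Passing the ownership check in as a parameter keeps the parsing testable on machines without rpm, and would allow the rpm lookups to be cached or batched later if they prove slow.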

Misc Development

lcfg-boot
Alastair has applied the patches to the boot component to improve the way rc-script methods are called. This is being left on develop machines for a couple of weeks to check that it doesn't break anything. We also need to check that installs work correctly with this in the dice installbase.

IPMI on the Dell R720
Alastair and Ian have got IPMI support working for the new Dell R720 servers.

toohot
Chris has been working on converting the toohot script into an LCFG component. Stephen suggested that the sensor number cpp macros would be better placed in the lcfg-level headers rather than just being in dice.

lcfg server analysis
Various work has been done to work out the best hardware configuration to make the lcfg slave servers go faster. It would be good to compare the running of an lcfg slave using the local pool and the SAN pool on jubilee whilst there is no other load. It would also be good to compare these results with running an lcfg slave directly on the host itself.

Operational

circle VMs
The VMs on circle which needed rebooting after the SL6.2 upgrade have all now been rebooted.

atabeast in the Forum
The atabeast in the Forum is now directly attached to central so that SL6 machines can access fibre channel devices.

sauce
Chris has fixed the dr.pkgs virtualhost configuration so that, in the case of a DR event, the updaterpms.rpmpath resource can simply be changed from using cache.pkgs.

New package cache
We need to get the new package cache server into service in AT to replace split.

Next Meeting

The next MPU meeting will be at 2pm on Tuesday 19th June.

This Week

  • Alastair
    • Update final report for KVM service and consider whether service catalogue entry required
    • Test install of develop machine (to check new boot component)
    • Document hot-migration and migrate back northern guests
    • Document cold-migration
    • Document CPU pinning
    • Add R720 document on MPU page
    • Work through LCFG bugs

  • Chris
    • Check jubilee config etc
    • Make toohot usable at LCFG level
    • Server hardware project
    • LCFG timing with jubilee

  • Stephen
    • Upgrade telford to SL6
    • Continue on system security project
    • Reboot hogwood for SL6.2 kernel
    • Process other units' responses about their perl-AFS module usage (which functions etc)
    • Speak to Graham about Theon work

-- AlastairScobie - 12 Jun 2012

Topic revision: r5 - 18 Jun 2012 - 15:39:02 - ChrisCooke