MPU Meeting Thursday 4th February 2010

AFS Component

Hopefully the AFS DB server in KB will be moved soon. Craig has confirmed that the AFS fileservers are ready to be switched to the new component.

LCFG Server Refactoring

All the code has been passed through perltidy to make the layout consistent. The major causes of warnings (mainly uses of undefined values) have been fixed. The code improvements needed to reach perl-critic level 4 are almost complete. The release level name handling was rewritten. The next step is to finish the move to building with Module::Build. It has become apparent that the command-line option handling is pretty bad and we need to switch to Getopt::Long to make it more readable. We need to properly integrate the new test system and also add code-level tests. Beyond this we need to investigate possibilities for object serialisation/storage; one option is KiokuDB.

Server Hardware

Chris has written a Perl script that reads the ambient temperature sensor using IPMI and shuts down the server at the upper non-critical threshold. Stephen suggested running this from cron rather than having the script run permanently and sleep between temperature checks. Not all servers (including some new ones) have the ambient sensor, although they do have a planar sensor. Could we use that as a fallback option? Presumably it will give a different temperature, so Chris will repeat his review of machine temperatures, this time including that sensor, to see how much it varies. Chris noted that IPMI hangs on some old machines, so we need to ensure we don't run the script on those machines.
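The check discussed above could look roughly like the following. This is only an illustrative sketch in Python, not Chris's Perl script. It assumes the pipe-separated output of "ipmitool sensor" (name, reading, units, status, then six thresholds with upper non-critical in the eighth column). The sensor names "Ambient Temp" and "Planar Temp", the 30 second timeout, and all function names are assumptions made for the example.

```python
import subprocess

# Assumed sensor names, in order of preference; real names vary by hardware.
PREFERRED_SENSORS = ("Ambient Temp", "Planar Temp")


def parse_sensors(text):
    """Parse `ipmitool sensor` style output.

    Returns {sensor name: (reading, upper non-critical threshold)}.
    Sensors whose reading or threshold is not numeric ("na") are skipped.
    """
    sensors = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 10:
            continue
        name, reading, unc = fields[0], fields[1], fields[7]
        try:
            sensors[name] = (float(reading), float(unc))
        except ValueError:
            continue
    return sensors


def should_shut_down(sensors, preferred=PREFERRED_SENSORS):
    """Use the first preferred sensor that is present (ambient, else planar)."""
    for name in preferred:
        if name in sensors:
            reading, threshold = sensors[name]
            return reading >= threshold
    # No usable sensor: do nothing rather than guess.
    return False


def read_sensors(timeout=30):
    """Run ipmitool once; the timeout guards against IPMI hanging."""
    result = subprocess.run(
        ["ipmitool", "sensor"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return result.stdout


def check_once():
    """Single check, intended to be called from a cron job."""
    try:
        output = read_sensors()
    except (OSError, subprocess.SubprocessError):
        return  # ipmitool missing, failed, or timed out
    if should_shut_down(parse_sensors(output)):
        subprocess.run(["shutdown", "-h", "now"], check=False)
```

Because check_once() does a single pass and exits, it fits the cron approach Stephen suggested. The timeout only limits the damage on machines where IPMI hangs; excluding those machines from the cron job entirely would still be safer.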

Installroot

Alastair has checked that the prototype works under SL5. The only package update required is for upstart where we need the version from F12.

There are some problems with installing SL5 under VMware, related to DHCP responses not being seen when using a bridged network configuration. The current CD installer occasionally has similar issues. There don't seem to be any issues with installing F12.

Alastair wants to talk to IS to see if they have plans to redevelop PIE and if so whether we could share technologies.

F12

As a first step we need to set up a mirror of F12 and get mock configured. We should then be able to build most of the LCFG packages from the SL5 SRPMs.

Miscellaneous Development

We should all review the small projects list before the next MPU meeting.

The changes to om were discussed. Alastair asked whether there was any way to avoid the warnings it now generates during the install. Stephen noted that this only affects DICE machines, which require the Om::Environment::NewAFSPAG Perl module. Normally a missing AFS module should be considered a problem, so a warning is useful in that case. Unfortunately there is no way to set the om_defaults.environment resource for the install context in such a way that the context information gets passed to the per-component om_environment resources. The only option would be to add an install-context override of the resource for each affected component.

Operational

refreshpkgs backup server
Once telford has become an AFS fileserver this will be used as the backup location for the refreshpkgs script.

LCFG component namespace
Stephen has filed a bug report requesting that LCFG components be run with better process names.

VMware kernel problems
The various VMware products have serious problems with the latest RHEL5 kernel, which has altered an API such that the VMware kernel module no longer compiles. On the guests this could probably be worked around by using the open-source management tools instead of VMwareTools. There is no obvious fix for the host servers though.

SL5.4
A few problems came to light after the develop machines were switched to SL5.4. The filesystem and glibc packages needed to be marked with the reboot flag. The update to perl caused a conflict as it attempted to obsolete a local version of perl-Storable; this could only be fixed by renaming the package to perl-Storable-Local. The new KVM packages, which are now available for x86_64, were in the wrong package list. We need to decide on the best home for these; it might be worth asking at the LCFG Deployers Meeting.

SL5 minor releases
Stephen mentioned the idea of supporting SL5 minor releases. There are a few possible options: we could parameterise the dice/options/sl5.h header (or more likely do it at the LCFG level), we could add OS headers for each minor release, or we could do a combination. This should be discussed at the LCFG Deployers Meeting.

dice-orders
The dice-orders package on tobermory has been broken since the switch to the new Informatics database. We cannot just remove the package as it provides the ordershost web interface. Alastair will take a look.

dresden disk space
The lcfg.org server, dresden, has been seriously lacking in disk space since the storage array crash. Stephen will ask Craig to sort out some new space.

telford
We are planning to add some new disk space to telford and use this as an AFS file server for the various RPM repository mirrors. Once we have the space we will add a mirror of F12.

FH machine moves
Stephen and Chris will organise the move of mousa and split from FH to AT.

VMware servers
We need to move guests away from central and bakerloo blob1.

Space on bpbeast
Alastair has 28 disks on the bpbeast to do RAID configuration testing. Currently it is a bit confused but hopefully that will be resolved soon.

updaterpms
Alastair mentioned that it should be possible to disable the updaterpms run method. Stephen asked if the updaterpms component could be changed so that it does not send mail on errors when the test flag is set.

Next Meeting

The next meeting will be held on Tuesday 16th February.

This Week

Alastair will:

  • Review small projects list
  • Think about MPU logging requirements
  • Finish repository restructure
  • Talk to George about routing problems
  • VMware server hosting
  • pkgwrite access for AFS pkgs tree
  • dice-orders
  • RAID testing

Chris will:

  • Review small projects list
  • Think about MPU logging requirements
  • FH move
  • Temperature shutdown
  • F12

Stephen will:

  • Review small projects list
  • Think about MPU logging requirements
  • FH move
  • Server refactoring
  • dresden disk space
  • telford disk space
  • F12 mirror

-- StephenQuinney - 08 Feb 2010

Topic revision: r5 - 15 Feb 2010 - 14:53:22 - AlastairScobie
 