MPU Meeting Thursday 18th December 2008

Buildtools Project

A couple of extra features were added to reduce how often it is necessary to write a CMakeLists.txt file for a package. The build tools now automatically create the directory /var/lcfg/conf/$compname for each component. They will also automatically install any file matching *.tmpl or *.tt in a templates/ sub-directory, or a file named template in the top-level directory of the component. These could, of course, also be processed with CMake at build-time, in which case they might have a *.cin file name suffix.
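As a rough illustration of the template-matching rule described above, the following Python sketch lists the files in a component directory that would qualify. It is not the build tools' actual implementation: the function names find_templates and strip_cin are hypothetical, and treating a trailing .cin as something to strip before matching is an assumption based on the note about CMake-processed files.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Patterns named in the minutes for files in the templates/ sub-directory.
TEMPLATE_PATTERNS = ("*.tmpl", "*.tt")


def strip_cin(name):
    """Drop a trailing .cin suffix (assumed to mark a CMake-processed file)."""
    return name[: -len(".cin")] if name.endswith(".cin") else name


def find_templates(component_dir):
    """Return the paths that match the described rule (hypothetical helper)."""
    root = Path(component_dir)
    found = []
    if not root.is_dir():
        return found

    # Rule 1: a file named "template" in the top-level directory.
    for path in sorted(root.iterdir()):
        if path.is_file() and strip_cin(path.name) == "template":
            found.append(path)

    # Rule 2: any *.tmpl or *.tt file in the templates/ sub-directory.
    templates_dir = root / "templates"
    if templates_dir.is_dir():
        for path in sorted(templates_dir.iterdir()):
            name = strip_cin(path.name)
            if path.is_file() and any(
                fnmatchcase(name, pattern) for pattern in TEMPLATE_PATTERNS
            ):
                found.append(path)

    return found
```

Calling find_templates on a component's source directory returns the top-level template file (if present) followed by the matching files under templates/, which is a quick way to check whether a package still needs its own CMakeLists.txt for template installation.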

Additionally, quite a few more components have been converted to use the new build tools.

Power Management Project

No progress.

LCFG Component Testing Project

No progress.

Miscellaneous Development

To fix the stability problems with Xen, Alastair tried turning off the nmi_watchdog option, which is the only thing we do differently to the standard SL5.2 kernel boot config. This seems to have improved things to the point where it is possible to run multiple VMs concurrently, do big kernel compiles, etc., but it has not completely eradicated the crashes. In particular, there are still lots of problems with running DICE on DICE related to creating a new virtual install.

The new R805 servers have arrived, so Alastair plans to get them into the racks and start testing on them. This will get around the problem of only having one test machine, which continually needs to be reinstalled to test out different options.

Chris mentioned MLN, which Paul is quite keen on. We're not sure exactly what capabilities it has, so we probably ought to take a look to see if it is any better than the standard Xen scripts.


RT Config
All 3 RT servers are now using the same LCFG header to pull in the huge list of extra Perl modules which are required. The rest of the configuration (e.g. apache) could be merged but there are quite a few subtle differences so the required time and effort might not be worth it.

New PXE service
This is now live on the RPM cache servers. A couple of problems cropped up with the beowulf cluster and also servers being installed via a serial console. These will be fixed as part of the stable release on 18/12/08.

installbase package list
The only remaining LCFG component package needing to be moved to a separate options header is lcfg-xinetd. This change requires the modification of quite a few headers, so Stephen is planning to leave this until after the Christmas break.

DIY DICE and inv resources
Paul is not currently using DIY DICE so Stephen has cleaned out his old LCFG profiles along with lots of others which were clearly unused. All active users of DIY DICE have now removed the inv resources from their profiles.

sysinfo resources
We still need to remove the remaining sysinfo resources from the LCFG profiles.

Dell Optiplex 760
This has now arrived, but Stephen is unlikely to have time to start testing until after the Christmas break.

Chris has created an LCFG template for bugzilla. The LCFG instance will be brought up as [...]. Chris has created a live header to hold the configuration for now, and Stephen will look through it to check it can be safely deployed to dresden alongside all the other services. As this service uses SSL, we will need another IP address and network interface on dresden. It looks like it should be fairly easy to transfer LCFG-related bugs from the Informatics bugzilla to the new LCFG bugzilla. At some point we will also look at upgrading the Informatics bugzilla instance.

trondra resurrected
Dell have provided a new power supply so trondra is now back in service as an LCFG slave server. arbirlot is no longer being used as an LCFG slave and needs to be reinstalled before going back into service to ensure all the LCFG data is deleted.

IPMI modules
Stephen has added an lcfg/options/ipmi.h header which does the necessary work to start IPMI. Currently this is only being used by the hardware headers for the Dell PE 1950 and 2950 machines, but we should be able to use it on all modern Dell server hardware.

Now that the HPs are using the i810 driver, webots displays OK on them again; it's slow but it works. There may still be a problem with 745s. Graham is going to look at whether swapping machines between labs would help ensure that webots works where it is required. The other problem, of crashing, now seems to happen independently of whatever hardware is involved. Webots always seems to have been prone to crashing anyway and this latest bout of crashing is nothing particularly new. There's a new version of webots which Graham is investigating. So far it seems to be a lot more stable.

There was a serious security hole in the version (4.2.0) of TWiki we were using; this was patched straight after the announcement. Since then a new version (4.2.4) has been released, which includes lots of other minor bugfixes, and this is now being used.

nvidia headers
Stephen has reorganised the headers for the nvidia graphics driver to try and make them a bit more sane. Currently they work in a backwards compatible manner but once the changes have been through the stable release and we've had a chance to modify any affected LCFG source profiles we will switch to the new behaviour.

openafs 1.4.8
The latest stable release of openafs goes out to the stable machines on 18/12/2008. As recommended by Simon, all the client packages are marked as "boot-time-only" so that they are installed at the same time as the kernel modules. Lab machines will automatically reboot on Friday evening and office machines the following Wednesday night.

Subversion & webdav & websvn
Stephen has been working on updating the LCFG subversion service to use webdav. This has the advantage of being faster and allowing iFriend users to have access. He has also been looking at deploying websvn for browsing the repositories. We will probably provide browsing access to a read-only copy of the LCFG configuration data repository which is kept synchronised with the real repository. There will also be a new repository for LCFG components; we need to work out the best method of handling trunk, tags and branches, and implement subversion support in LCFG-Build-VCS.

FC6 end of support
We no longer have any active FC6 DICE machines so we can finish weekly testing and decommission bradford, the build host.

Default partition sizes
We are starting to run out of space in the root partition (currently 25GB) on desktop machines when they have all the research and teaching packages installed. Stephen will look at the disk sizes for the various desktop models currently in service and come up with a new size which will last for a couple of years. We should also look at increasing the swap partition size now that our latest standard desktop models have 4GB of RAM.

qlogic FC cards
We used our last spare PCI-Express FC card in dresden to replace the dead card. We should order some more spares. The failure in dresden of a Dell-supplied FC card also raised the question of whether we should order machines with Dell qlogic FC cards or should just buy the cards separately.

This Week

Alastair will:

  • Organise VMWare mini-talk
  • Dual-path fibre channel support
  • Virtualisation
  • Roaming DIY DICE docs

Chris will:

  • Power management project
  • Decommission arbirlot and bradford

Stephen will:

  • svn/webdav and websvn
  • Look at default partition sizes
  • Look at Dell Optiplex 760
  • Finish lcfg/installbase package list changes

-- StephenQuinney - 19 Dec 2008

Topic revision: r1 - 19 Dec 2008 - 09:53:17 - StephenQuinney