MPU Meeting Tuesday 3rd May 2011

Software Build Farm

The documentation for the database admin has now been written. A schema update was applied and the website code was updated accordingly; this slightly alters the build logs table so that builder entries which are no longer required can be deleted. Very little remains to do and no more code changes are expected (other than fixes if bugs are found). We hope to finish the documentation this week and prepare the project for sign-off. A few things will still need doing to improve the day-to-day management of the service, such as adding tmpreaper for old logs and temporary files; most of these small tasks will be done as operational work as they crop up.
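As a rough illustration of the kind of clean-up meant by reaping old logs and temporary files, here is a minimal Python sketch. It is not tmpreaper and does not reproduce its options; the function name `reap_old_files`, the age threshold and the choice of modification time as the age criterion are all assumptions made for the example.

```python
import time
from pathlib import Path


def reap_old_files(root, max_age_days, now=None, dry_run=True):
    """List regular files under root older than max_age_days (by mtime).

    Files are only deleted when dry_run is False. Hypothetical helper,
    not part of any existing build farm code.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    reaped = []
    for path in sorted(Path(root).rglob("*")):
        # Leave symlinks and directories alone; only plain files are reaped.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            reaped.append(path)
            if not dry_run:
                path.unlink()
    return reaped
```

Running it first with `dry_run=True` shows what would be removed before anything is actually deleted.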

SL6

There were no file editors in the install root or base profiles, so Stephen added vi and nano to aid debugging of machines with problems.

We now have working DICE SL6 profiles which can be used to install machines. These include support for most of the extra DICE layer functionality, for example routing. We still need openldap support and a few other inf-unit components before we can really move to using DICE profiles for our build hosts.

We still need to meet with RAT to finalise the package list layout. We also need to discuss at the LCFG Deployers meeting how to manage packages which are part of the SL6 distribution but are not in the standard base list. Mostly this is an issue with how to deal with packages for specific services (e.g. apache or bind).

It looks like it is not possible to sleep the Dell 755 and the HP 7100 under SL6. There is also an issue with SL6 machines not waking up when the mouse is moved or a mouse button is clicked, although a key press still works fine. This appears to be related to a patch which Red Hat has applied to the SL6 kernel.

SL6 has been installed on circle, which is an R710.

AFS automation project

Not much happened.

Miscellaneous Development

sleep component
Chris is working on adding functionality to the LCFG sleep component to allow users to enable and disable sleep on their DICE machines.

DIYDICE
Alastair has been working on improving DIYDICE support for F13 and adding support for SL6. In particular this is because Paul needs it for some new servers.

Operational

Site mirrors
There are problems with mirroring SL6 and epel6 from mirrorservice.org. The rsync error messages are rather unhelpful, so it is not obvious whether the problem is at the client or server end. The issue is also hard to debug because the errors cannot be reproduced when rsync is run manually. Running the rsync against an alternate mirror provided by bytemark.co.uk also works fine.
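One possible way to gather more evidence when a scheduled job fails but a manual run does not is to wrap the command so that its exit status, error output and environment are recorded on every run. The sketch below is only an illustration of that idea; `run_and_record` and the log path are hypothetical names, and the command passed in would be whatever rsync invocation the mirror job already uses.

```python
import os
import subprocess
import time


def run_and_record(cmd, log_path):
    """Run cmd and append its exit code, stderr and environment to log_path.

    Returns the exit code. Hypothetical helper for comparing scheduled
    and manual runs of the same command.
    """
    started = time.strftime("%Y-%m-%d %H:%M:%S")
    result = subprocess.run(cmd, capture_output=True, text=True)
    with open(log_path, "a") as log:
        log.write(f"=== {started} ===\n")
        log.write(f"command: {cmd!r}\n")
        log.write(f"exit code: {result.returncode}\n")
        log.write("stderr:\n" + result.stderr + "\n")
        # The environment is a common difference between scheduled and manual runs.
        log.write("environment:\n")
        for key in sorted(os.environ):
            log.write(f"  {key}={os.environ[key]}\n")
    return result.returncode
```

Comparing the log entry from a failing scheduled run with one from a successful manual run may show whether the difference lies on the client side.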

satablade
The satablade has been moved to the AT machine room.

IPMI on circle
IPMI is now working on circle. On the R710 the BMC is on a dedicated port, but the machine had not been wired up correctly to use this port.

metropolitan problems
The VMware server metropolitan lost its network twice last week. It has been rebooted to see if that resolves the problems. The load had been creeping up over time, averaging 20 (peaking at around 50), but after the reboot the load was down at about 2. We don't understand why the load grows like this over time. We could move some of our test and low-priority VMs to KB, but we will wait until new internal disks have arrived and been installed in the servers.

Test LCFG slaves
There are a few test LCFG slaves running at the moment. Now that we have finished the hardware testing we should clear out brendel and fantoosh.

This Week

  • Alastair
    • Meet with Stephen to decide further SL6 work
    • Meet with RAT to discuss package lists
    • Finish DIYDICE re F13 and SL6 - working well enough. Need to sort out docs sometime and make SL6 the default installroot
    • Look at comparative timings for last week's stable release brendel against mousa/trondra - 40 mins mousa, 30 mins brendel

  • Chris
    • AFS automation
    • Finish SL6 sleep work

  • Gordon
    • multipath component

  • Stephen
    • Meet with Alastair to decide further SL6 work
    • Meet with RAT to discuss package lists
    • Discuss F13 meeting room machines with Carol
    • Remove LCFG test servers
    • PkgForge documentation

-- AlastairScobie - 03 May 2011

Topic revision: r6 - 10 May 2011 - 11:07:59 - StephenQuinney
 