MPU Meeting Tuesday 7th June 2011


Not much happening with this project now. All essential packages must be in the stable package buckets by the end of Friday 10th June so we can make the first SL6 stable release on Wednesday 15th June.

AFS automation project

The script to promote read-only copies of lost volumes has now been finished and manually tested. Chris would still like to add some code-level testing as well.

The next task on the enhancements list is "Script to automate distribution of volumes across servers". It was suggested that the script could balance volumes across the servers based on space requirements and on speed requirements for busy volumes. This could be a rather complex problem to solve, involving many factors. There is a tool on Russ Allbery's website which can do something like this using a "linear programming optimizer".

We wondered whether it might be much simpler to do the balancing just based on space requirements with some human input regarding what types of storage arrays were appropriate for certain volumes. For instance, we could manually manage a list of busy volumes which must be on RAID10 SAS disk arrays. It may also be that certain volumes have to stay on certain servers.
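
As a rough illustration of the simpler approach, the following Python sketch places volumes by free space only, while honouring a manually maintained list of storage requirements and a list of volumes pinned to particular servers. It is a minimal sketch under stated assumptions, not the planned script: all names (Server, place_volumes, the storage labels, the server and volume names) are hypothetical, and volume sizes are assumed to be known in advance.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    storage: str          # hypothetical label, e.g. "raid10-sas"
    capacity_gb: float
    used_gb: float = 0.0
    volumes: list = field(default_factory=list)

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb


def place_volumes(volumes, servers, required_storage=None, pinned=None):
    """Greedy placement by free space.

    volumes:          dict of volume name -> size in GB
    required_storage: dict of volume name -> storage label it must live on
    pinned:           dict of volume name -> server name it must stay on
    Returns a dict of volume name -> chosen server name.
    """
    required_storage = required_storage or {}
    pinned = pinned or {}
    by_name = {s.name: s for s in servers}

    # Constrained volumes go first (they have the fewest options),
    # then everything else, largest first within each group.
    order = sorted(
        volumes,
        key=lambda v: (v not in pinned and v not in required_storage,
                       -volumes[v]),
    )

    placement = {}
    for vol in order:
        size = volumes[vol]
        if vol in pinned:
            candidates = [by_name[pinned[vol]]]
        else:
            wanted = required_storage.get(vol)
            candidates = [s for s in servers
                          if wanted is None or s.storage == wanted]
        candidates = [s for s in candidates if s.free_gb >= size]
        if not candidates:
            raise ValueError(f"no suitable server has room for {vol!r}")
        # Pick the candidate with the most free space.
        target = max(candidates, key=lambda s: s.free_gb)
        target.used_gb += size
        target.volumes.append(vol)
        placement[vol] = target.name
    return placement
```

Calling place_volumes with a dict of sizes, a list of Server objects and the two manually managed lists returns a proposed layout without moving anything, so a human could review it before any volumes are actually moved.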

As part of this it would be good to have a database of usage statistics so that we can see how busy a volume is over time and the rate at which space is being consumed.
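
One possible shape for such a database is sketched below in Python using SQLite. This is only an illustration: the table layout, column names and function names are assumptions, and it presumes that periodic samples of space used and access counts (figures of the kind "vos examine" reports) are collected by some other means.

```python
import sqlite3
from datetime import datetime


def open_db(path=":memory:"):
    """Open (or create) the hypothetical usage-statistics database."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS volume_usage (
               volume     TEXT    NOT NULL,
               sampled_at TEXT    NOT NULL,  -- ISO 8601 timestamp
               used_kb    INTEGER NOT NULL,
               accesses   INTEGER NOT NULL,  -- accesses since last sample
               PRIMARY KEY (volume, sampled_at)
           )"""
    )
    return db


def record_sample(db, volume, sampled_at, used_kb, accesses):
    """Store one sample for a volume."""
    db.execute(
        "INSERT INTO volume_usage VALUES (?, ?, ?, ?)",
        (volume, sampled_at.isoformat(), used_kb, accesses),
    )


def growth_kb_per_day(db, volume):
    """Average growth between the first and last sample, in KB per day.

    Returns None if there are fewer than two samples or no time has passed.
    """
    rows = db.execute(
        "SELECT sampled_at, used_kb FROM volume_usage "
        "WHERE volume = ? ORDER BY sampled_at",
        (volume,),
    ).fetchall()
    if len(rows) < 2:
        return None
    (t0, u0), (t1, u1) = rows[0], rows[-1]
    elapsed = datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
    days = elapsed.total_seconds() / 86400
    return (u1 - u0) / days if days > 0 else None
```

A first-to-last average is the simplest possible measure; the same table would also support per-week rates or a ranking of the busiest volumes by summing the accesses column.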

LCFG Server Refactoring

All the outstanding patches stored in gerrit.inf have been reviewed by Simon and have now been submitted. Not much needed changing; the main piece of work was creating a sub-class of the Module::Build class to improve handling of various file types.

The LCFG server component was split off and is stored in the LCFG subversion repository. This needs to be updated for the new build tools and tested.

It will be good to get the server daemon running for a while with live data to see how well it functions. We might need to limit the number of profiles being processed to avoid memory leaks.

The next development stage is to work on the patch for the Safe mode usage. There is a patch but it hasn't been thoroughly tested or submitted for review yet.

Miscellaneous Development

We have decided to fix a particular version of SL5 and not continue to do updates on that platform. Some of the DICE policy has been moved to the LCFG layer to make the sleep component more useful by default. The blacklist support which was discussed at the LCFG Deployers Meeting has not yet been added. Chris wants to test the SL6 sleep support for the Dell 755 with different "quirks" to see if it can be made to sleep properly.

At the LCFG Deployers meeting we discussed making SCSI disks the default for SL6 as currently it's using the old IDE hda devices which make no sense with modern kernels. We also need to update the default partition sizes and provide macros for altering the sizes in a standard way.


SL5 kernel
There is a new SL5 kernel (2.6.18-238.12.1.el5). Is it time for us to be upgrading our SL5 DICE machines?

We now have an HP DL180 for MPU usage so fantoosh can go back to the Services Unit.

Dell Optiplex 790
We will be buying the Dell Optiplex 790 this year for our standard desktop. There is still a question over which processor we should buy.

ssh firewall holes
Stephen suggested that DICE desktop machines which have ssh firewall holes should have a header added to their LCFG profile (e.g. dice/options/desktop-ssh-access.h) which, as well as creating the firewall holes, would add fail2ban and do anything else we feel is necessary to tighten up access restrictions. This would make it much easier to track desktop machines which have ssh firewall holes and in the event of an emergency we could close them all much more easily.

This Week

  • Alastair
    • Split off NaturalDocs work from SL6 into mini devel project
    • Propose Simple KVM service project
    • Start looking at fstab miniproj
    • Discuss MPU taking over mailcap and alias components with RAT and services respectively. Agreed.
    • Discuss IPv6 under SL6 with RAT - can we put it back?
    • purchase disks for northern and metropolitan

  • Chris
    • Talk to Craig re what is possible in timescales wrt volume balancing tool
    • Investigate and rectify "false errors" during install and boot
    • More SL6 sleep work

  • Gordon
    • upgrade to SL6
    • mpath component

  • Stephen
    • Server project work
    • Webmark form for TA bidding
    • Try RHEL6.1 on circle to see if fibre problem fixed.

-- AlastairScobie - 07 Jun 2011

Topic revision: r6 - 13 Jun 2011 - 13:33:49 - AlastairScobie