MPU Meeting Tuesday 14th June 2011


This week's testing release will become the first official SL6 stable release. There's one problem with it: the ffox component. There's a bug in the old version whereby the component stops and loses data when you upgrade from old to new versions. There's a workaround for current SL6 develop hosts, but Stephen and Kenny will come up with a proper fix in time for this week's release.

There's a new version of remctl too. We'll upgrade all platforms to this version soon.

There's also a new version of pam-afs-session, but it is a major version change, so the upgrade will need to be tackled carefully and won't be done right away.

AFS automation project

Craig agreed on tackling a much simpler balancing system to start with.

Chris has been looking at existing AFS management tools and at the priority list.

Stephen suggested that progress on balancing would depend on (among other things) having good data to work with, especially historical data that might allow trends to show up, and that gathering this data might be a good place to start.

LCFG Server Refactoring

The daemon is running on a test box. Everything is fine except for a problem with stopping. This may be caused by the extensive debugging that is enabled - Stephen will check.

The next priority will be to tackle safe mode.

After that there will be two key areas to tackle.

The first key area will be to replace how the LCFG databases (e.g. the status DB) are managed, with the aim of getting rid of tied hashes. This means a complete change to the API. Currently the server process can write data to the disk at any point. The change would impose some discipline on this: for instance, after dealing with one complete profile the server would stop, do all its writes to disk, then start again. This way the server would only write when nothing was changing, which should make the resulting data more consistent than it sometimes is with the current server.
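The write discipline described above could look roughly like the following sketch. It is written in Python purely for illustration and is not the LCFG server's code; `BatchedStore`, `process_profiles` and the key layout are hypothetical names, and a plain dictionary stands in for the data on disk.

```python
class BatchedStore:
    """Hypothetical stand-in for an LCFG database such as the status DB."""

    def __init__(self):
        self.on_disk = {}   # committed data (a dict stands in for the disk)
        self.pending = {}   # writes queued while a profile is being processed

    def set(self, key, value):
        # Queue the write instead of touching the committed data.
        self.pending[key] = value

    def flush(self):
        # Apply every queued write in one step, then empty the queue.
        self.on_disk.update(self.pending)
        self.pending.clear()


def process_profiles(store, profiles):
    """Process each profile fully, then write only once it is complete."""
    for host, resources in profiles.items():
        for name, value in resources.items():
            store.set(f"{host}.{name}", value)
        # The profile is finished, so nothing is changing: write now.
        store.flush()
```

The point of the sketch is only that readers of the committed data never see a half-processed profile; how the real server would queue and commit its writes is not settled here.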

The second area to tackle will be mutations. Core mutations will become proper Perl functions, with cpp remaining merely as a translation mechanism to allow the continued use of unquoted strings.

Miscellaneous Development

Graham's framebuffer switch makes the 755 sleep reliably under SL6. Stephen and Alastair recommended rolling this fix out as standard at the dice level for SL6 755s, but asking the Deployers' Meeting before putting it in at the lcfg level too. Chris blogged a list of recent sleep developments.

Stephen's painstaking multi-week roll-out of the new auth component continues. He has now updated the PXE installers on all platforms. The new password template is now in develop.

Alastair has been working on hackparts to add GPT support. Should we move to GPT by default? It gives us support for big disks, 32 partitions and named partitions.

We need to get rid of the existing messy partitioning arrangements and move to the new agreed LCFG standard ones, which Kenny is coordinating. We won't need to change partition sizes, names etc. by hand, though, as we can mutate them to what we want.

Stephen and Alastair agreed on a new macro which would add (arithmetically) to a value instead of overwriting it. This will come in handy when partitioning disks but could also be useful elsewhere.


  • The DIY DICE server got messed up thanks to the metropolitan glitch at the weekend. It turned out that an rsync process had hung. Is there a timeout option for rsync?
  • Alastair has arranged for the MPU to take over the mailcap and alias components.
  • Alastair discussed IPv6 on SL6 with RAT. It's expected to remain a problem. Bonding pulls in ipv6 on f13 (though not on sl5). Java has an ipv6 problem on all platforms.
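On the rsync question above: rsync does have a `--timeout=SECONDS` option, which makes it exit if no data is transferred for that long. The sketch below shows one way a wrapper might use it, with a wall-clock limit from Python's `subprocess` as a backstop. The function names, the default limits and the `-a` flag are illustrative assumptions, not the DIY DICE server's actual setup.

```python
import subprocess


def rsync_command(src, dest, io_timeout=300):
    # --timeout=SECONDS: rsync gives up if no data moves for this long.
    return ["rsync", "-a", f"--timeout={io_timeout}", src, dest]


def run_rsync(src, dest, io_timeout=300, hard_limit=3600):
    # The subprocess timeout is a wall-clock cap on the whole run; if it is
    # exceeded the child is killed and subprocess.TimeoutExpired is raised.
    return subprocess.run(
        rsync_command(src, dest, io_timeout),
        timeout=hard_limit,
        check=True,
    )
```

The I/O timeout catches a stalled transfer, while the hard limit covers the case where rsync itself stops responding.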

This Week

  • Alastair
    • Blog message re SL6 + ask USU to install SL6 desktops
    • Propose Simple KVM service project
    • Continue looking at Java/IPv6/bonding issue - which platforms does the bonding issue affect?
    • purchase disks for northern and metropolitan
    • fstab mini project

  • Chris
    • For DICE SL6 755s, use fancy frame buffer setting
    • Work on AFS automation project
    • Investigate and rectify "false errors" during install and boot

  • Gordon
    • upgrade to SL6
    • mpath component

  • Stephen
    • Ship lcfg-ffox, pref new version, by mid Wed so SL6 can hit stable
    • Webmark form for TA bidding

-- AlastairScobie, ChrisCooke - 14 Jun 2011

Topic revision: r5 - 24 Jun 2011 - 08:37:16 - AlastairScobie