MPU Meeting Tuesday 15th February 2011

Software Build Farm

Stephen has been writing more documentation for the various parts of the PkgForge system. This has led to the discovery of a few bugs. He has also been preparing to give the technical talk to COs on Wednesday.

A bug was found in the LCFG config: the PkgForge daemons were being started at system boot time before the openafs client, which led to them failing immediately.

There is also now a CGI script for browsing the perldoc for the PkgForge modules. This is much better than statically generating HTML files from the POD as it doesn't require any maintenance effort.

SL6

Stephen has set up the site mirrors for SL6 and epel6. Stephen and Alastair will hold a planning meeting on Monday 21st February to discuss how the project should be run. The previously used project plan will be used with some tweaks. Stephen asked if Alastair could set up the package buckets now so that he can put together SL6 mock configurations for use with PkgForge.

We should look at setting up naturaldocs for our LCFG headers so that they can be used to, at least, indicate support for SL6.

Project Allocation

The project allocations for this third of the year have been agreed.

Alastair will be working on the SL6 port.

Chris will be finishing the practical submission project for RAT. After that is done he will move on to the Further AFS automation project.

After finishing the build farm project Stephen will be working on the SL6 port. The hope is that Simon will be free from prometheus work by the beginning of April at which point the LCFG server refactoring project will be restarted.

Miscellaneous Development

LCFG sleep component
Chris found the cause of bug#327, where the wake alarm was not being set correctly. Occasionally an alarm would be set in the past; the script now checks for that case.
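
The minutes do not say how the sleep component performs this check, so the following is only a minimal sketch of the general idea: compute the wake time and, if it is not in the future, push it to the next day. The function name next_wake_time and its parameters are hypothetical and are not taken from the component.

```python
from datetime import datetime, timedelta


def next_wake_time(now, hour, minute):
    """Return the next occurrence of hour:minute strictly after `now`.

    Hypothetical helper: it illustrates guarding against an alarm
    that would otherwise be set in the past.
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # The requested time has already passed today, so use tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

For example, asking at 23:30 for a 07:00 wake-up yields 07:00 on the following day rather than a time that has already passed.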

LCFG openafs component
Stephen has added support for specifying the AFS client cache size in terms of a percentage of the partition size. This is necessary before we can switch to using large AFS cache partitions.
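
As a rough illustration of the arithmetic involved (not the component's actual code), a percentage can be turned into a cache size like this. The function names are hypothetical, and the sketch assumes the cache size is expressed in 1 KiB blocks.

```python
import os


def partition_size_bytes(path):
    """Total size in bytes of the filesystem holding `path` (Unix only)."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks


def cache_blocks_from_percentage(partition_bytes, percent, block_size=1024):
    """Return `percent` of the partition as a count of block_size-byte blocks.

    Hypothetical helper; assumes the cache size is given in 1 KiB blocks.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return int((partition_bytes * percent) // (100 * block_size))
```

With a 10 GiB partition and 85%, cache_blocks_from_percentage(10 * 1024**3, 85) gives 8912896 blocks. The appeal of a percentage is that the same setting can be shared by machines whose cache partitions differ in size.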

LCFG build tools
Gordon found bug#386 with the email address handling in the author field of the LCFG build tools metadata file (lcfg.yml). Occasionally a single email address becomes split into two. Stephen has looked at it but has not yet found the exact cause. He has updated LCFG-Build-PkgSpec to use the Email::Valid and Email::Address modules when handling the author fields (this is the same as PkgForge). He suspects that LCFG-Skeleton also needs updating.

LCFG build tools
A request (bug#387) has come in to support package signing when building packages with the LCFG build tools. Stephen reckons this should not be too difficult to add.

LCFG build tools
LCFG::Build::PkgSpec has been updated to use native Moose traits to remove the dependency on the deprecated MooseX::AttributeHelpers module.

LCFG xfree component
Alastair has finished porting the LCFG xfree component to F13. As this is no longer a core component there is a new options header which is included by all the monitor headers.

LCFG mysql component
Alastair has ported the LCFG mysql component to use the new build tools.

LCFG boot component
Gordon has made a change to the LCFG boot component run method to use the -q (quiet) flag.

Operational

LCFG DR server
Stephen has added the LCFG master and slave configurations to the DR server (sauce). He is a bit concerned about the complexity of the configuration, in particular that we have combined the configurations from 3 separate services, so any change made to one of them could have knock-on effects on the combination. He will add a note to each header making it clear that caution must be exercised. Also, the DR slave server currently uses the DR master files directly, but Stephen is not entirely happy with this as it makes the configuration different from that of a normal slave server. Having chatted to Kenny, he plans to switch to just modifying the server.fetch resource so that the slave uses rsync to fetch files, stored in the standard locations, from itself. We also need some notes on the MPUDisasterRecovery page about what to do with the server in the event of a crisis.

F13 kernel
RAT still have not made a final decision on whether to support condor on F13. It was decided that we would not allow this to hold up the update of the F13 kernel any more. We will roll out the latest standard F13 kernel and the new version of openafs with no local modifications. If we need condor support at a later date we will then add the patches.

DIYDICE server
This was missing the mp-unit header so there was no nagios monitoring. This has now been fixed.

MPU shopping list
The MPUShoppingList has now been completed.

MPU Pandemic docs
The update dates for the MPU entries on the PandemicPlanning page have now been brought up to date.

afs-utilities
Gordon has been looking at the freespace script which Stephen had written as part of the afs-utilities package. It contains a bug which affects the parsing of the quota usage when the user is near their limit. It is also lacking a manual page. Gordon will also look at what other, externally authored, scripts we might be able to add to this package.
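
The minutes do not record the exact cause of the parsing bug. One plausible source of trouble, assumed here, is that "fs listquota" appends a "<<" marker to a percentage and a trailing "<<WARNING" once usage is high, which breaks a parser expecting plain numbers. The sketch below tolerates those markers; the function names and the sample line are illustrative and do not come from the freespace script.

```python
import re


def _percent(token):
    """Parse tokens such as '20%' or '98%<<' into an integer percentage."""
    match = re.fullmatch(r"(\d+)%(?:<<)?", token)
    if match is None:
        raise ValueError(f"not a percentage: {token!r}")
    return int(match.group(1))


def parse_listquota_line(line):
    """Parse one data line of quota output into a dict (hypothetical helper)."""
    fields = line.split()
    if fields and fields[-1] == "<<WARNING":
        fields.pop()  # drop the trailing warning flag
    if fields[1:3] == ["no", "limit"]:
        fields[1:3] = [None]  # a volume without a quota
    if len(fields) != 5:
        raise ValueError(f"unexpected quota line: {line!r}")
    name, quota, used, pct_used, pct_partition = fields
    return {
        "volume": name,
        "quota_kb": None if quota is None else int(quota),
        "used_kb": int(used),
        "percent_used": _percent(pct_used),
        "percent_partition": _percent(pct_partition),
    }


def free_kb(entry):
    """Remaining quota in kilobytes, or None when there is no limit."""
    if entry["quota_kb"] is None:
        return None
    return entry["quota_kb"] - entry["used_kb"]
```

A line such as "user.smith 500000 490000 98%<< 66% <<WARNING" then parses to 98% used with 10000 KB free instead of failing on the marker.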

LCFG slave using ramdisk
Graham and Chris have tried running an LCFG slave with the entirety of /var/lcfg in a RAM disk. This seems to make a complete rebuild about 20% to 30% quicker. The process also became CPU bound, which suggests it was working as fast as possible (rather than IO bound as normal). It is definitely worth considering this approach, but targeting just the cache DB files and XML files. We would want regular syncs to physical disk so that when we reboot the data can be reloaded rather than having to rebuild the world. This was done on a desktop machine; we would like to try it on a server. Chris will use a spare HP DL180 to see how fast it could be. We might want to increase the RAM requirements for the new slave servers if this works. We could also try using an SSD to see what performance that gives.
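
The sync-and-reload idea could be sketched roughly as follows. This is an assumption-laden illustration, not a tested design: the function names are hypothetical, the directories are whatever RAM disk and on-disk locations are eventually chosen, and in practice a tool such as rsync run from cron would probably be used instead.

```python
import shutil
from pathlib import Path


def sync_to_disk(ram_dir, disk_dir):
    """Copy the RAM disk contents to persistent storage.

    Existing files are overwritten. Files deleted from the RAM disk are
    not removed from the disk copy in this simple sketch.
    """
    shutil.copytree(ram_dir, disk_dir, dirs_exist_ok=True)


def restore_from_disk(disk_dir, ram_dir):
    """Reload the RAM disk after a reboot.

    Returns False when there is no saved copy, in which case a full
    rebuild would be needed.
    """
    if not Path(disk_dir).is_dir():
        return False
    shutil.copytree(disk_dir, ram_dir, dirs_exist_ok=True)
    return True
```

sync_to_disk would be run at regular intervals, and restore_from_disk once at boot before the slave starts. The interval sets how much work is lost, and must be redone, after an unclean shutdown.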

This Week

  • Alastair
    • Meet with Stephen to firm up SL6 project plan
    • Start some work on SL6 (eg package buckets)
    • Discuss natural docs with Kenny (for SL6)

  • Chris
    • Borrow a DL180 from services and try Graham and Chris's in-memory LCFG speedup idea
    • practical submit project
    • Look at RT 52016
    • Chase RAT about F13_64 package missing errors

  • Gordon
    • Packaging up afs tools
    • Identify component to write

  • Stephen
    • PkgForge talk and docs
    • Meet with Alastair to firm up SL6 project plan
    • Add comment to pkgslave.h lcfg-slave.h etc headers that after any mods to these files, the DR machine (currently sauce) should be checked for correct operation
    • Redo DR lcfg slave config
    • Update F13 kernel and openafs

-- AlastairScobie - 15 Feb 2011

Topic revision: r5 - 22 Feb 2011 - 09:41:42 - ChrisCooke
 