MPU Meeting Tuesday 12th April 2011
Software Build Farm
Graham carried out some stress tests of the build farm using
matlab. This is a 2GB source package which generates packages that are
nearly 1GB in size. This revealed that the AFS caches were far too
small on the various servers. To improve matters the build servers
have been reinstalled with much larger AFS caches. The master server,
ardbeg, still needs to be reinstalled with a larger AFS cache.
Stephen has also made a few tweaks to the code to reduce the amount of
unnecessary copying of source files, which should help with big
packages. Those changes have not yet been shipped; hopefully they will
go out soon.
SL6
The PXE installer now uses a native SL6 install image for both
architectures.
We now have two SL6 build hosts - leeds for i386 and kilmartin for
x86_64. They will need to be reinstalled as proper dice machines at
some point but it's a good starting point for other units needing
access to SL6 for development work.
Most of the work to transfer MPU SL6 resources from the inf to dice
layers has been done.
There are a few medium-priority components still to be ported. After
that, the next important stage is to add the server support. The lack
of nagios support is going to be a bit of a nuisance. We should offer
to help Inf Unit with the porting of the nagios component to the new
LCFG build tools.
We should start testing SL6 on our various desktop models. Stephen
will talk to Carol and see if she has time to do some testing.
Alastair will publicise the SL6 upgrade to everyone in the School.
AFS automation project
Chris has got a test cell installed - spaghetti.inf.ed.ac.uk - this
has a DB server and two file servers. He has been playing with
creating, destroying and moving volumes to learn how it all works. For
time monitoring purposes we will count this stage of the project as
personal development.
Craig has added another desirable facility: we would like to have
support for dynamic quotas. This will go on the list and needs to be
prioritised.
Miscellaneous Development
- sleep component
  - The idea of checking for running cron jobs was discussed. Chris has been looking at the Proc::ProcessTable module. We decided that the best approach was to read the PID from the cron daemon pid file and then scan the process table for any process with that PPID. That should catch all normal running cron jobs.
- Memory for brendel
  - Alastair has ordered more RAM for brendel so we can do more testing of storing all the LCFG server data files entirely in shared memory.
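The cron job check agreed for the sleep component (read the cron daemon's PID from its pid file, then scan the process table for processes with that PPID) could be sketched roughly as follows. This is only an illustration in Python, not the Perl Proc::ProcessTable code Chris has been looking at. It reads the Linux /proc filesystem directly, and the pid file path /var/run/crond.pid and all function names are assumptions made for the example.

```python
import os


def read_pid(pidfile):
    """Return the PID stored in a daemon pid file."""
    with open(pidfile) as fh:
        return int(fh.read().strip())


def parent_pid(stat_text):
    """Extract the PPID from the contents of a /proc/<pid>/stat file.

    The command name (second field) is wrapped in parentheses and may
    itself contain spaces or parentheses, so split after the last ')'.
    """
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    return int(rest[1])  # rest[0] is the state, rest[1] is the PPID


def children_of(ppid, proc_root="/proc"):
    """Return the sorted PIDs of all processes whose parent is ppid."""
    children = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                if parent_pid(fh.read()) == ppid:
                    children.append(int(entry))
        except (OSError, ValueError):
            continue  # process exited, or stat was unreadable
    return sorted(children)


def cron_jobs_running(pidfile="/var/run/crond.pid", proc_root="/proc"):
    """List child processes of the cron daemon (assumed pid file path)."""
    return children_of(read_pid(pidfile), proc_root)
```

An empty list from cron_jobs_running() would mean that no direct children of the cron daemon were found at the time of the scan.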
Operational
- Server Reboots
  - We have now done all the MPU server reboots.
- telford
  - We still need to look at the (possible) problems with the local RAID controller/disks. We might also want to look at applying any BIOS upgrades.
- Packages volumes
  - The Services Unit need to move the packages volumes to a new file server with more space. We will need to coordinate with them so that we turn off the refreshpkgs daemon whilst the move is happening.
- satablade
  - This needs to be switched to using DHCP before we can move it to the AT machine room.
- kernel component
  - The new support for listing kernel modules associated with a source package is now available. This should reduce the number of reboots required after an install or kernel upgrade.
- AFS cache size
  - The default AFS cache size has now been changed to 8GB.
- DR server
  - Stephen has set up a mirror of the LCFG install ISO images on sauce. He has also added a mirror of the SL6 repositories and tidied the profile so that all the DR config is in LCFG headers. We need to document how to restore the various services in the event of disaster. We also need to switch sauce to the stable release.
- updaterpms patches
  - Alastair will take a look at Stephen's patches for updaterpms.
This Week
- Alastair
  - SL6 - server functionality
  - SL6 evangelism
  - Development meeting activity page for SL6
  - Add memory to brendel and test LCFG compilation (both with/without shm)
  - Convert satablade to using DHCP
  - Look at Stephen's patches to updaterpms component
  - Consider personal development topics
- Chris
  - AFS automation project
  - Development meeting activity page for AFS automation
  - Consider personal development topics
- Stephen
  - SL6
  - Ask Carol to do desktop testing
  - Development meeting activity page for PkgForge
  - Stabilise sauce
  - Consider personal development topics
-- AlastairScobie - 12 Apr 2011