MPU Meeting Tuesday 23rd February 2010

AFS Component

A few tweaks were done to the component. George reported that it didn't check for the existence of the cache directory. It had also been noted that the cache directory size was slightly too large for our current partition size. On Linux the component now uses the Filesys::Df Perl module to check the partition size and limits the cache size to 90% of the available space. The DICE openafs client headers were also modified to only change the cache directory location and size when the separate cache partition has been requested.
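The cache size check described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the component's actual code (which is Perl and uses Filesys::Df). The function names, the KiB units and the choice to measure the total partition size are all assumptions made for the example.

```python
import os
import shutil


def limit_cache_size(requested_kb, partition_kb, percent=90):
    """Cap the requested cache size at `percent` of the partition size.

    Integer arithmetic is used so the result is exact.
    """
    limit_kb = partition_kb * percent // 100
    return min(requested_kb, limit_kb)


def cache_size_for(cache_dir, requested_kb, percent=90):
    """Return a cache size (in KiB) that fits the partition holding cache_dir.

    Hypothetical helper: it checks that the cache directory exists, then
    caps the size using the total size of the partition it lives on.
    """
    if not os.path.isdir(cache_dir):
        raise FileNotFoundError("cache directory %s does not exist" % cache_dir)
    partition_kb = shutil.disk_usage(cache_dir).total // 1024
    return limit_cache_size(requested_kb, partition_kb, percent)
```

With these assumptions, a request for 1000000 KiB on a 1000000 KiB partition would be reduced to 900000 KiB, while a smaller request is left unchanged.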

Another AFS DB server has been switched to the new component. This time it was afsdb0, which has moved from scylla to skoll. This just leaves afsdb2 on symplegades which should be done this week.

All DICE machines (both servers and desktops) now have the new openafs component installed. They will switch from the old, afs, component to the new, openafs, component when they reboot. At some point in the future we can completely remove the afs component.

Four of the fileservers are running with the new component - crocotta, bunyip, pyrolisk and cockatrice. If all is well the rest will be done in a few days' time. The addition of the new component means we now have nagios monitoring of the file servers.

LCFG Server Refactoring

Nothing much happened.

Server Hardware

Stalled.

Installroot

Alastair has been working on creating tools to package the kernel and initrd for both PXE and ISOLINUX. Rather than do a complete fork he is adding the support to the lcfg-buildinstallroot and lcfg-mkinitramfs tools. lcfg-mkinitramfs now creates two packages and lcfg-buildinstallroot uses one of these rather than a separate kernel package. Stephen mentioned that the PXE installer packages have to be noarch so they can be installed on an i386 server. He also noted that it needs to be possible to specify a path in /tftpboot for the PXE installer images so that development/testing images can be installed alongside the stable versions.

As part of this work Alastair added an option to updaterpms to allow installing packages without any checking of dependencies or running of scripts and triggers. This means that we no longer need a separate tool to strip these out of the kernel packages before putting them into the PXE installer packages. We agreed to keep this secret...

F12

Chris has been blogging about his progress on the F12 project.

The problem of the lcfg-client not receiving the update notification UDP packets was discussed. Stephen suggested using wireshark on the F12 machine to see if the packets were arriving. Alastair suggested turning off the local firewall as that seems to be the most likely blockage. On a related point Stephen suggested getting rid of NetworkManager and just putting an appropriate configuration file into /etc/sysconfig/network-scripts/.
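As a lightweight complement to wireshark, a small listener can show whether any UDP datagram reaches a given port on the machine. The following Python sketch is an assumption-laden illustration: the function names are hypothetical and the port is left as a parameter because the notification port is not recorded in these minutes. Binding will fail if another process already holds the port.

```python
import socket


def open_listener(port, host="0.0.0.0"):
    """Bind a UDP socket to host:port and return it (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock, timeout=30.0):
    """Wait up to `timeout` seconds for one datagram.

    Returns (data, sender_address), or None if nothing arrived in time.
    """
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(4096)
    except socket.timeout:
        return None
```

If the listener sees nothing with the firewall enabled but does receive a datagram with it disabled, that would point to the firewall as the blockage.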

There is now a top-level bug to simplify tracking the F12 project progress.

Miscellaneous Development

LCFG Server
Stephen needs to finalise the changes he made to the LCFG server code and finish testing. The update should also be passed to Kenny for testing on his profile set.

upstart
Alastair has looked at upstart on Ubuntu and confirmed that it is a complete mess. The code and configuration data are also totally intermingled on that platform. On F12 it is still running in a compatibility mode so we can just pull out all the rc jobs and use the boot component. We shouldn't (hopefully) need to configure it in any way.

Operational

SL5.4 update
Now on the stable release. A few small problems, but nothing serious.

SL5.3
The kernel and afs packages were not held back for those machines sticking with SL5.3. This was fixed in an update to the stable release made on Friday 19th February.

DICE repository config
This has now been changed.

pkgwrite repository access
Alastair has set up access to the packages repositories for the pkgwrite user. Stephen has modified his mirror cron job on telford and it all works nicely for SL5.

RAID config testing
Alastair has tested RAID10 on an evo array and it is significantly faster than RAID5 for installing and upgrading multiple VMs in parallel. He has also set up the bpbeast array to provide 2 RAID1 arrays using 22 disks and tested using that. With that configuration he has seen a large spread in the time taken for installs.

VMware server hosts
These have all been held back at SL5.3.

lcfg-boot
Stephen submitted a broken version of lcfg-boot to the develop release. This was related to an inconsistency between the APIs for GetSysInfo and GetSysPath: only the latter supports exporting short variable names. Stephen will try to remember if there was a good reason to do things this way. In the meantime he will fix the boot component to use the long name versions.

This Week

Alastair will:

  • Consider periodic file configures
  • Think about MPU logging requirements
  • Monthly Project reports
  • Talk to George about routing problems
  • Think about F12 package buckets
  • Fix pkgsubmit for F12
  • installroot work

Chris will:

  • Consider periodic file configures
  • Think about MPU logging requirements
  • Monthly Project reports
  • F12
  • Stall hardware project

Stephen will:

  • Consider periodic file configures
  • Think about MPU logging requirements
  • Monthly Project reports
  • Server refactoring
  • Fix lcfg-boot
  • Finalise lcfg-server changes

-- StephenQuinney - 23 Feb 2010

Topic revision: r2 - 25 Feb 2010 - 11:58:02 - StephenQuinney
 