MPU Meeting Thursday 21 January 2010

rpmsubmit

Re the coordinated repository switch-over:

  • Kenny has done his bit
  • Alastair has changed the LCFG level
  • Stephen has fixed the Inf level
  • Alastair will check his home setup, then change the DICE level.

AFS Component

Another database server will be done next week. The new KDC machine is now installed at KB though not yet configured as a KDC. Craig will tackle fileservers at some point.

TiBS

Chris will close the first TiBS project at the February development meeting, or will get Craig to do it anyway. Chris will outline what he thinks should be in the follow-up TiBS project and pass this to Craig.

LCFG Server Refactoring

Stephen has just found time to restart work on this. The code clean-up progresses steadily. All changes can be followed via gerrit and Kenny is doing this. Testing is slow and fiddly, partly because of the hassle with spanning maps, and it holds things up markedly. There's interest in hearing more about the project so Stephen will talk about it at the February LCFG Deployers' Meeting.

Server Hardware

Holiday hijinks in the server room have focussed our minds on the issue of temperature monitoring. If we could configure machines to monitor the ambient temperature with IPMI, then shut themselves down cleanly if it goes above a critical level, that'd be a great emergency fall-back. Currently machines just cut their own power off when the temperature gets critical, meaning no clean shutdown. We'd still proceed with other measures to monitor temperature and take action, but the availability of an automatic clean shutdown as a fall-back would be very reassuring. This now seems more important than RAID drivers so will be tackled as the project's top priority. Chris will add this to the project description then get the project passed at the February development meeting. As a first step it'd be interesting to try to find out more about the reported IPMI temperatures, for instance to see if we can establish a safe ambient temperature, or find out how the temperature varies across models, or by physical location in the server room.
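The fall-back check described above could be sketched roughly as follows. This is only an illustration: the threshold value, the function names and the sample sensor listing are all assumptions, and the pipe-separated format merely imitates the kind of sensor listing that tools such as ipmitool print. The real format and sensor names on our hardware would need checking as part of the project's first step.

```python
import re

# Hypothetical threshold; the minutes do not give a figure.
CRITICAL_AMBIENT_C = 35.0


def parse_temps(sdr_text):
    """Return {sensor_name: degrees_C} for lines ending in 'N degrees C'.

    Assumes pipe-separated lines such as
    'Ambient Temp | 0Eh | ok | 7.1 | 22 degrees C'.
    Lines without a numeric reading are skipped.
    """
    temps = {}
    for line in sdr_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            continue
        match = re.fullmatch(r"(-?\d+(?:\.\d+)?) degrees C", fields[-1])
        if match:
            temps[fields[0]] = float(match.group(1))
    return temps


def should_shut_down(temps, limit=CRITICAL_AMBIENT_C):
    """True if any sensor whose name mentions 'ambient' reads above the limit."""
    return any(
        value > limit
        for name, value in temps.items()
        if "ambient" in name.lower()
    )


# Made-up sample listing, for illustration only.
sample = """Ambient Temp     | 0Eh | ok  |  7.1 | 22 degrees C
CPU Temp         | 01h | ok  |  3.1 | 48 degrees C
Planar Temp      | 0Fh | ns  |  7.1 | No Reading"""

readings = parse_temps(sample)
print(readings)
print(should_shut_down(readings))
```

A real deployment would feed in live sensor output on a schedule and trigger a clean shutdown when should_shut_down returns true. Collecting the parsed readings from many machines would also help answer the questions about safe ambient temperature and variation across models and locations.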

Installroot project

Alastair will put forward a proposal for this to the February development meeting, although it may turn out to be finished by then and to have taken less time than the two-week minimum for officially tracked projects. The idea is to redo the installroot to get rid of the NFS root. Alastair has got a prototype solution working. It's similar to the current arrangement except without any hacked kernels. He had hoped to use unionfs but that's not in Fedora, and it's hoped to make this solution applicable to Fedora as well as SL/RHEL. As an alternative to unionfs, the needed bits are copied into a ramdisk, leaving the rest of the root partition available read-only. Alastair's doing something akin to what udev does: probing the hardware to find out which modules to load, then loading them. This gives you the correct setup for the machine's PCI, SCSI etc. This kills off both the old PCI setup and kudzu too, which is great. Now Alastair needs to reimplement the prototype properly.
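The udev-like probing step can be illustrated with a small sketch. It is not Alastair's prototype, whose details aren't recorded here. It assumes the usual Linux mechanism in which each device exposes a modalias string (for PCI devices, in the modalias file under /sys/bus/pci/devices/) and the kernel's modules.alias file maps glob patterns to module names. The function names and the sample data are made up for illustration.

```python
from fnmatch import fnmatchcase


def parse_alias_lines(text):
    """Parse 'alias <pattern> <module>' lines into (pattern, module) pairs."""
    pairs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "alias":
            pairs.append((parts[1], parts[2]))
    return pairs


def modules_for(modaliases, alias_pairs):
    """Return modules whose pattern matches any device modalias.

    Order of first match is preserved and duplicates are dropped.
    """
    wanted = []
    for modalias in modaliases:
        for pattern, module in alias_pairs:
            if fnmatchcase(modalias, pattern) and module not in wanted:
                wanted.append(module)
    return wanted


# Hypothetical extract in modules.alias style, for illustration only.
alias_text = """# sample aliases
alias pci:v00008086d0000100Esv*sd*bc*sc*i* e1000
alias pci:v00001000d00000030sv*sd*bc*sc*i* mptspi
"""

# Hypothetical modalias string for one PCI device.
devices = ["pci:v00008086d0000100Esv00001028sd0000002Ebc02sc00i00"]

print(modules_for(devices, parse_alias_lines(alias_text)))
```

In an installroot the resulting list would then be handed to the module loader. The sketch stops short of that so that it can be run safely anywhere.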

The next big LCFG port

There are rumours to the effect that SL6 will be announced in June, months after we'd hoped for it. However it could be later still, as there have not yet been any test releases. Fedora 13 is due on May 11, with alpha releases in March and beta in April. It therefore looks very likely that our port this spring will be to Fedora 12. Chris will propose the Fedora 12 port to the February development meeting.

Misc Development

om changes
"om" is used in the installer and isn't working there because of the absence of the perl AFS module. Stephen will change his code to try to require the perl AFS module but carry on if it's not available; this now seems like the only possible solution even though it was rejected before.
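The "try to load it but carry on without it" pattern looks roughly like this. The sketch is in Python purely for illustration (om itself is Perl, where the equivalent would presumably wrap the require in an eval), and the module name afs_bindings and the function names are made up.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# 'afs_bindings' is a made-up name standing in for the perl AFS module.
afs = optional_import("afs_bindings")


def describe_afs_support():
    """Report which path the tool would take."""
    if afs is None:
        return "AFS support unavailable; continuing without it"
    return "AFS support loaded"


print(describe_afs_support())
```

The point of the design is that the missing module is detected once, at load time, and every later use checks the result rather than failing outright in environments such as the installer.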

Operational

SELinux package update problem
Stephen has fixed this. The problem was with a postinstall script of the util-linux package: it's a known problem that this can't install into a chroot. Stephen's fix was to pull all SELinux packages out of the installroot completely as they're not required by anything there.

randomising the boot.run time
Stephen has changed the boot.run cron job time for DICE desktops to a random time between midnight and 6am. This seems to have made a big difference: the recent Maple update caused no big problems whereas the previous one, with all updates happening in the same one hour period, floored us. He will now make this the default for all DICE machines on all wires. We'll tell people that they can override this using the three headers we provided last year.
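One way to spread machines across the midnight to 6am window is sketched below. It is not necessarily how Stephen's change works: instead of a fresh random draw it derives the time from a hash of the hostname, so each machine keeps the same slot from day to day while the fleet as a whole is spread out. The function names and the hostnames are hypothetical.

```python
import hashlib

WINDOW_MINUTES = 6 * 60  # midnight up to, but not including, 06:00


def boot_run_time(hostname):
    """Map a hostname to a stable (hour, minute) within the window."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big") % WINDOW_MINUTES
    return divmod(offset, 60)


def cron_fields(hostname):
    """Return the five cron time fields for a daily run at the host's slot."""
    hour, minute = boot_run_time(hostname)
    return f"{minute} {hour} * * *"


# Made-up hostnames, for illustration only.
for host in ("alpha", "beta", "gamma"):
    print(host, cron_fields(host))
```

A stable per-host slot makes it easier to predict when a given machine will update, at the cost of not reshuffling the load between runs.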

SL5.4
SL5.4 is now in shape; it's what Stephen has been spending most of his time on. The package lists are done and, just as significantly, we now have tools which make the construction of package lists far easier, meaning that we can maintain updates to distros far more easily than we could before. The SL5.4 packages are in dice/options/test_updates.h and Stephen will add that to develop machines soon. Thanks to the new tools we've now caught up with some updates which we were previously missing, so our distro is a more complete and accurate SL than it was before. With this release we're changing from having a package list for the major version (SL5) plus a load of updates, to having separate package lists for each minor version (SL5.4, SL5.3) plus updates. With the new tools this should be an easier and cleaner way to do it. However it's not yet clear how best to configure the choice of minor update in LCFG terms. MPU should think about this. We should also encourage all units to test their services on SL5.4. We'll perhaps raise this for discussion at the next Operational meeting. Time is of the essence, as the earlier we can deploy 5.4 the less the final year undergrads will have latched on to specific 5.3 package versions for their project work.

Meeting Abandoned

The meeting was interrupted by another spate of collapsing servers.

Next Meeting

The next meeting is scheduled for Tuesday 2 February at 10am.

This Week

Alastair will:

  • consider personal development topics
  • finish off the repository restructuring
  • chase George about routing problem
  • move refreshpkgs
  • look at latest VMware Server
  • consider how to configure the choice of minor SL version

Stephen will:

  • consider personal development topics
  • finish om changes
  • submit a unique namespace bug
  • randomise boot.run time for all wires
  • pursue SL5.4
  • carry on with LCFG Server Refactoring
  • consider how to configure the choice of minor SL version

Chris will:

  • consider personal development topics
  • outline the follow-up TiBS project
  • move to close the first TiBS project
  • consider how to configure the choice of minor SL version
  • add temperature monitoring to project proposal
  • propose server hardware project for February development meeting
  • propose Fedora 12 project for February development meeting
  • work on the Server hardware project
  • make the bugs for the F12 port

-- ChrisCooke - 22 Jan 2010

Topic revision: r6 - 01 Feb 2010 - 14:18:42 - ChrisCooke
 