MPU Meeting Tuesday 1st September 2009

Power Management Project

Tests on the HP DC7900 machines are going well. The next step is to roll out the sleep component to all HP DC7900 machines in the student labs.

rpmsubmit

Alastair has agreed a date with Kenny for making the change to the new RPM master: Tuesday 15th September. Backwards compatibility with the old URL scheme will be maintained for a while to avoid problems.

AFS Component

Nothing happened.

TIBs

Not a lot happened but Chris has now got the tarball of the latest version of the tibs software from Alison.

Chris has been blogging about his progress.

LCFG Server Refactoring

Stephen has started work on merging the LCFG Perl module dependencies. This would have been completed, but he got stuck fixing more problems uncovered by the test suite. In particular, he has discovered that the last_modified_file node is not predictable when several files share the same modification time and are all considered the most recent. Another code change for the server has been prepared and is awaiting review from Simon. Several scripts have also been written to aid the testing process; these include scripts to bundle up all the necessary input files, run mkxprof as a normal user and do the output comparison.
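The unpredictability of last_modified_file comes from having no rule for choosing between files with identical modification times. The following is a minimal sketch of one way to make such a choice deterministic, by breaking ties on the file name. It is written in Python purely for illustration; the function name is borrowed from the node name, and the tie-break rule is an assumption, not a description of what the LCFG server does or will do.

```python
import os
from typing import Iterable, Optional


def last_modified_file(paths: Iterable[str]) -> Optional[str]:
    """Return the most recently modified path.

    Ties on modification time are broken by comparing the path strings,
    so the result does not depend on the order of the input.
    The tie-break rule is an assumption made for this sketch.
    """
    return max(
        paths,
        key=lambda p: (os.stat(p).st_mtime_ns, p),
        default=None,
    )
```

With a rule like this, two test runs over the same set of input files give the same answer regardless of the order in which the files were read, which makes output comparison meaningful.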

Miscellaneous Development

The update to the LCFG cron component has gone out and a message detailing the changes has been sent to COs. Stephen has also added the same support for an AUTO time within a range to the LCFG autoreboot component.

Alastair has been tidying the inf level headers. He started by removing all the old fc5, fc6 and solaris stuff. He has now introduced 3 flavours: inf which is LCFG with packages from the world bucket; ed which also includes the uoe and ed buckets; and dice which also includes the inf and dice buckets. The inf flavour will be used for regular testing to ensure the exported LCFG headers and package lists are always in good working order. The dice flavour would be used for developing new platforms. Alastair has already discovered one missing package dependency through this work.

Pandemic Planning

We need to work on making our servers more robust.

One step is to use both network devices and add ethernet bonding. Some of our servers already have this but the following need doing: dresden, figgy, mousa, prague, split, telford, tobermory, trondra and tummy. Alastair will do dresden, figgy and split. Chris will do mousa, prague and trondra. Stephen will do telford, tobermory and tummy. All those in the Forum have the necessary cabling; they just need the bonding configured and a reboot. Currently the configuration is done in the source profile. Alastair will add a header and then we will merge the changes into the current testing release so that it becomes available for wider usage as quickly as possible.

We also need to check that our service configurations are held in header files and not in the source profiles. This will make it easier to move services to other hosts if hardware problems occur.

We need to work on documenting the management of our services, and we should each list on the PandemicPlanning wiki page the topics we are interested in learning more about.

Chris has documented the LCFG release process. He also noticed that some of the LCFG documentation is out of date; Stephen will get this fixed.

Stephen suggested that he could work on getting the openafs component into service on the AFS file and DB servers as that would vastly improve our ability to manage those servers.

Operational

boot.run times
Chris will come up with a plan for altering the boot.run times on the MPU servers and email it out for comments.

This Week

Alastair will:

  • Identify areas of interest DONE
  • Server ethernet bonding DONE
  • Ethernet bonding header DONE
  • Chase George about routing problems
  • Continue work on inf level DONE
  • Pandemic howtos

Chris will:

  • Identify areas of interest DONE
  • Server ethernet bonding DONE
  • TIBS package
  • Organise Perl tutorial trip
  • Email boot.run times DONE

Stephen will:

  • Identify areas of interest
  • Server ethernet bonding
  • Document disaster recovery for LCFG master
  • Update LCFG server documentation
  • LCFG server refactoring

-- StephenQuinney - 02 Sep 2009

Topic revision: r5 - 07 Sep 2009 - 21:44:30 - AlastairScobie
 