MPU Meeting Tuesday 1st October 2013

Inventory

Alastair will be meeting with the CSOs tomorrow to discuss how the new inventory will be used.

Alastair tried adding indexes to the foreign-key columns but it didn't make any noticeable difference. Maybe the tables are just not big enough for it to be necessary.

Virtual DICE

Chris announced the Virtual DICE image to students and asked for testers. One student had problems with VirtualBox and then tried VMware without any success. Eventually they got VirtualBox working, but they didn't give enough detail about the issues encountered. So far the image has had 27 unique downloads according to the Apache logs, which shows there is at least some interest.
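For reference, a count of unique downloads like the one above can be pulled from an Apache access log along these lines. This is only a minimal sketch, not the method Chris actually used: it assumes the standard common/combined log format, treats each distinct client address as one download, and the `.ova` suffix and the `unique_downloaders` helper name are hypothetical.

```python
import re
from typing import Iterable, Set

# Assumes Apache common/combined log format:
# host ident user [time] "METHOD path PROTO" status bytes ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def unique_downloaders(lines: Iterable[str], path_suffix: str) -> Set[str]:
    """Return the distinct client hosts that fetched a matching path.

    Counts GET requests answered with 200 (full) or 206 (partial, e.g. a
    resumed download); repeat requests from one host count once.
    """
    hosts = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if (
            match.group("method") == "GET"
            and match.group("path").endswith(path_suffix)
            and match.group("status") in ("200", "206")
        ):
            hosts.add(match.group("host"))
    return hosts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_downloads.py < access_log
    print(len(unique_downloaders(sys.stdin, ".ova")))
```

Counting by client address will undercount users behind a shared NAT and overcount users who download from several machines, so the figure is only a rough indication of interest.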

Chris proposed that we use a new host name for each release to make releases easy to identify. We won't need to make new releases very often, only when big changes are made to the RAT software. We're not particularly concerned about minor security updates, since most applications such as web browsers should be used in the normal environment (i.e. not the VM). Maybe we should make the next release in early January, just before the start of semester 2?

LCFG Client Refactoring

Stephen wrote some tests for the LCFG::Client::Build module, which completes the test suite for now. Adding further tests would be very difficult and it's not clear that the additional benefit would be worth the effort. He will now write up some instructions on how to manually test the running daemon.

The next step is to put together an LCFG header and then ask for testers. There is a bit of work required to finish off the "generic profile" support in the various client libraries.

Miscellaneous Development

critical-shutdown
The script for doing a shutdown of machines in a particular server room at a specific criticality level has been finished. It now asks for confirmation of the action before going ahead. We need to remind COs to check the criticality settings of their servers.

perl-Sub-Name
The version conflicts for this package have been fixed.

NX
The work on the NX service is now almost complete. We need to add some resource limits (e.g. on CPU and memory) to prevent any single user hogging the whole service. We also need to look at monitoring the resource usage with sar. The computing.help pages need to be updated to recommend the OpenNX client on all platforms.

Operational

KVM reboots
Chris will add the reboot policy to the wiki.

refreshpkgs
There is still a problem with refreshpkgs. It's very unclear what is going on: there is a similar number of files in the inf and world buckets, but inf is 4 times larger. Stephen wondered if it might be quicker to generate new files every time rather than attempting to do an update.

pandemic docs
These have been updated. Stephen noted that we need to update the ReleaseManagementProcedures page to mention using KVM in place of booting a real machine from a CD/DVD.

Spending plan
We should move everything on sauce onto juice to avoid needing to replace the hardware next year. We will move the NX service to northern and piccadilly to give us separate staff and student servers.

MPU backups
Stephen will work through our rmirror settings and update them to use the new headers and macros.

dhcpd changes
We all need to look through the proposed dhcpd changes.

This Week

  • Alastair
    • Start Inventory project diary
    • Inventory project
      • Talk with CSOs - principally to ensure have covered every possible state and transition. Also to ensure not overly complicated to use. Possible issue wrt hostnames for dynamic IP self managed machines.
      • Add "fault" type to the history/changelog table - and rename the history/changelog table as "logbook"
      • Submit bug/enh to App::Cmd author wrt option to die on unspecified options
      • Pester George about location API
    • Order a spare 600GB disk for waterloo (hot spare)
    • Ask George - what does the TXretransmit value mean for switch connections?
    • Consider how to make metropolitan usable by users
      • ISOs
      • minimal docs (mostly manual)
      • they'll use virt-manager, but not create machines or change config
    • circulate table of LCFG bugs
    • refreshpkgs - investigate. Take local snapshot and repeatedly run createrepo against.
    • Consider dhcpd component changes. Just network component and install system for me. Propose adding a new resource to network component which understands the new tuple - this can be used in place of hwaddr_eth0 and, perhaps, ipaddr_eth0. Not sure about hostname_eth0.
    • Consider activities list

  • Chris
    • Add KVM testing to release testing documentation
    • Continue fleshing out spending plan
    • Speak to Paul about DIYDICE upgrade to SL 6.4
    • Add KVM policy to KVM pages
    • Consider dhcpd component changes
    • Consider activities list
    • Start looking at LCFG -> git project - learn git (under PDP time)

  • Stephen
    • NX
      • Finish tidying up NX config
      • Will look at resource monitoring
    • Make a list of servers using old mirror macros
    • LCFG client refactor
      • manual test for daemon behaviour
      • announce to others for testing
      • report
    • Consider dhcpd component changes
    • Consider activities list
    • Start on python PDP
    • Think about LCFG client - XML modules

  • Carol
-- AlastairScobie - 01 Oct 2013
Topic revision: r12 - 29 Oct 2013 - 09:47:12 - StephenQuinney