MPU Meeting Tuesday 19th August 2014

Virtual DICE

Chris has finished the final report and it has been reviewed. The project will be submitted for completion at the Development Meeting on 3rd September.


There is now support for handling reboot requests from LCFG components when they are started at boot time. This is done using the lcfg-checkreboot tool, which is called from the lcfg-rebootcheck.service unit. It still needs to log to the console so that it is obvious to the user what is happening.

Stephen commented that we need support for starting components at other times when they are newly added (similar to the way that the current boot component can do this when the configure method is called).

Alastair has decided to ignore the systemd preset stuff for now. If we start the systemd component after updaterpms then this should not be an issue anyway.

SL7 LCFG port

The installfixups script has been done.

The updaterpms tool has been modified to look for the newer hdrs/_n-v-r.a.rpm form of the header file before falling back to the .n-v-r.a.rpm form. This should give a small performance boost since nearly all our repositories now use the newer form.
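
As a rough illustration of the lookup order described above (this is not the updaterpms code itself, and the function name and the idea of a single repository directory are assumptions made for the example), the "newer form first, older form as fallback" logic might look like:

```python
from pathlib import Path
from typing import Optional


def find_header(repo_dir: Path, nvra: str) -> Optional[Path]:
    """Return the header file for a package, preferring the newer layout.

    nvra is the "name-version-release.arch" string. The two file layouts
    come from the minutes; this helper is hypothetical.
    """
    candidates = [
        repo_dir / "hdrs" / f"_{nvra}.rpm",  # newer form, tried first
        repo_dir / f".{nvra}.rpm",           # older form, used as fallback
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # neither form of the header file exists
```

With most repositories already using the newer form, the first candidate normally succeeds and the second check is never made, which is where the small saving comes from.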

gdm needs PAM configuration files. Stephen suggested starting with the standard files provided in the gdm package and replicating them using LCFG resources. Chris will take a look.

Stephen is still working on the disk usage survey. The script to collect the data and send it via email is now running on all develop machines. He is working on the data analysis and taking the chance to play with the new JSON support in PostgreSQL 9.3 to see if it is useful.
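
The survey script itself is not reproduced in these minutes. Purely as a sketch of the kind of per-machine record that could be collected and later loaded into a PostgreSQL JSON column, assuming a made-up record layout and function name, something like the following would do:

```python
import json
import shutil
import socket


def collect_usage(mount_points):
    """Return a JSON string describing usage of each given mount point.

    The record layout here is an assumption for illustration only; it is
    not the format used by the real survey script.
    """
    report = {"host": socket.gethostname(), "partitions": []}
    for mount in mount_points:
        usage = shutil.disk_usage(mount)  # named tuple: total, used, free
        report["partitions"].append({
            "mount": mount,
            "total_bytes": usage.total,
            "used_bytes": usage.used,
            "free_bytes": usage.free,
        })
    return json.dumps(report, sort_keys=True)
```

Calling collect_usage(["/var", "/tmp"]) on a machine that has both file systems mounted gives one JSON document per host, which keeps the analysis side free to pick out individual fields later.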

Alastair is looking at the systemd targets and working out the dependencies between LCFG components.

Miscellaneous Development

The SL6.5 upgrade is ready. The update will go out to desktops after the stable release on 27th August. We need to agree a timetable with the other units for the server upgrades. Stephen needs to check that we have the correct version of libvirt pinned in the dice/options/kvm-server.h header. We agreed that we should upgrade the kernel at the same time.


MPU reboots
bakerloo is done; hare got firmware updates (the disk update failed to apply) as well as a kernel update; jubilee has been rebooted and had a BIOS update.

SSH servers
Stephen suggested that we deploy the "new" SSH servers (brendel and schiff) as SL6.5 rather than bothering with rebooting hogwood and kubelik. He will do the staff SSH server next week.

KVM hot migration
Chris will update the documentation regarding extra disks.

epel6 repository
The split of the epel6 repository has been done; there was a small amount of fall-out.

piccadilly and northern
We need to finish the migrations, in particular we need to prioritise clearing the SAN space so that it can be reclaimed by the Services Unit.

This Week

  • Alastair
    • Order a spare 600GB disk for waterloo (hot spare)
    • systemd project
      • start writing in blog
        • document the debugging including stuff about disabling graphical boot
      • Modify lcfg components/rc scripts list as a result of COs talk.
      • Consider how components will work with systemd
      • Start designing a systemd target structure for LCFG components
      • Add support for configuring presets - I think running systemd after updaterpms is sufficient - consider thinking about this. Perhaps link all existing preset files to /dev/null and override mechanism?
      • Investigate whether systemd will start a new service if you've added the new service file and the systemd component has therefore been daemon-reloaded - no, it doesn't
      • Consider whether systemd component should restart the machine if there are config changes? - yes (configurable behaviour). Now only rebuilds config if resources have changed.
    • EL7 project
      • consider dependencies between components and ordering
    • Add more memory to Forum KVM servers? - 700 per server to upgrade 64GB -> 128GB (which ones?)
    • Look at iplimit for
    • RT tidy
    • Reboot central - scheduled Sunday 24th 9am
    • Investigate DL180 bonding issue - bug in bnx2 driver has reappeared - suggested fix is to disable msi (and update BIOS and bnx2 firmware)
    • Think about how to take DICE EL7 work forwards - and which bug tracking software to do this? What did we do in previous platform upgrades? Discussed at CEG
    • migrate piccadilly (starting with SAN based guests)

  • Chris
    • Virtual DICE
      • take for signoff
    • EL7
      • look at gdm, including pam config
    • url shortener
    • Reboot hammersmith (and waterloo and oyster next week)
    • update KVM server docs wrt dual disked guest migration (if appropriate)
    • look at scanner reports
    • Make start on 3yr equipment spend plan
    • migrate northern guests (starting with SAN based guests)

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • EL7
      • Survey what disk sizes we have and what the current partition usage is (eg /var, /tmp) - continue
      • PXE install
      • openssh patches
      • Continue thinking about functionality - following week
    • Schedule SL6.5 release
    • Install schiff and brendel as replacement ssh servers.
    • Reboot telford and budapest
    • Write up daily security checks
    • Think about PD
    • Look at scanner reports
    • RT tidy

-- AlastairScobie - 19 Aug 2014

Topic revision: r10 - 27 Aug 2014 - 10:11:57 - AlastairScobie