MPU Meeting Tuesday 24th January 2017

Inventory

Nothing happened.

MPU SL7

computing.help
We have discovered a problem with journald on lagun: for some reason the /var/log/journal directory is missing. Alastair tried reinstalling, which didn't fix the issue; for unknown reasons the directory reappeared after the recent KB power outage. We need to investigate further and also check whether other machines are affected.
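
For checking other machines, a minimal sketch of the test is given here. With journald's default Storage=auto setting, logs are only kept persistently when /var/log/journal exists. The function name has_persistent_journal and its root parameter are assumptions made for illustration; this is not an existing script or clientreport module.

```python
import os


def has_persistent_journal(root="/"):
    """Return True if <root>/var/log/journal exists and is a directory.

    The root parameter is hypothetical; it only exists so the check can be
    pointed at a mounted image or a test directory instead of the live system.
    """
    return os.path.isdir(os.path.join(root, "var", "log", "journal"))


if __name__ == "__main__":
    # Report the state of the local machine.
    state = "present" if has_persistent_journal() else "MISSING"
    print("/var/log/journal is " + state)
```

Run on each host, this would flag machines in the same state as lagun.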

ordershost
This has been moved off the LCFG master server onto nerano.

LCFG master
Stephen proposes upgrading the LCFG master next Tuesday (31st January). He will email COs and raise the issue at the Operational Meeting.

inf-level login VM
There is a new SL7 inf-level login VM - mouse - which is available for COs to test software.

Miscellaneous development

SL7.3
The lcfg-level headers and package lists are pretty much finished; the dice-level package lists are partially done.

Moose
The Moose packages have been upgraded on SL7.

Build tools
Stephen fixed an issue with the build tools having hard-wired paths for the svn and cvs commands. This does not work on MacOSX, where such applications are installed in /usr/local/bin. We now rely on the PATH to locate them.
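
As an illustration of the PATH-based approach (not the actual build tools code, which is not shown in these minutes), a lookup could be sketched as follows. The helper name find_tool is an assumption; shutil.which is the standard Python call for searching the PATH.

```python
import shutil


def find_tool(name, path=None):
    """Locate an executable by searching PATH rather than a hard-wired path.

    find_tool is a hypothetical helper. The optional path argument overrides
    the search path (os.pathsep-separated); by default the PATH environment
    variable is used.
    """
    location = shutil.which(name, path=path)
    if location is None:
        raise FileNotFoundError(name + " not found on PATH")
    return location


if __name__ == "__main__":
    for tool in ("svn", "cvs"):
        try:
            print(tool, "->", find_tool(tool))
        except FileNotFoundError as err:
            print(err)
```

The same code then works whether the commands live in /usr/bin or /usr/local/bin.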

Operational

  • NX service : The new NX servers have been installed and are ready to go into service. The memory limits have been raised, which should be appreciated by the users.

  • Bonding issue : The file names used for the network bonding slaves in /sys/class/net/bond0/ have changed from slave_ to lower_. This doesn't actually break bonding but it does break our check_network script for nagios passive monitoring. Stephen has modified the script to check for both variants.

  • dracut bug : We have been affected by a dracut bug which caused a couple of machines to fail to boot. Some important modules are not being added to the initramfs when it is rebuilt; this appears to be rhbz#1405025. We now have a locally patched version of dracut, but existing initramfs files will need to be rebuilt if we want to guarantee that machines can be safely rebooted.

  • libvirt : The version of libvirt on SL7.2 has gone from 1.2.17 to 2.0.0. Our KVM service is not affected since we pin the version. We need to look at the changes and see if any of them will cause us problems.

  • VM migration : Chris has had some very slow VM migrations from gaivota to azul. He will test the write speed to the SAN pool on azul with a simple dd.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - TartarusRESTAPI
      • Document REST API
      • Further encourage people to use API and ii commands
      • Write more of the ii commands and document them as they are written.
      • Speak to George about macaddr/space feed
      • Start work on final report!
      • Chase Tim about theon access credentials for feed
      • Convert from mod-auth_kerb to mod-auth_gssapi (See Stephen for details)
      • How to represent VMs
    • Deploy encrypted /tmp and swap conversion script
      • Do during Festival of Creative Learning week (w.b. 20th Feb)
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • submit polkit bug to redhat - with Stephen (check with 7.3)
    • Write a clientreport script to report on existence of /var/log/journal
      • built and shipped to develop
      • ship to all (via live header) if no probs on Wednesday eve
      • 27/01/17 - Two machines - mole and leonardo - have no /var/log/journal (out of 700)
      • 31/01/17 (still at 07/02/17) - One machine - badger - has no /var/log/journal (out of 822). (mole and leonardo were rebooted as part of the KB power problem)
    • MPU SL7
      • Upgrade computing.help servers
        • Kill off hjaelpe and brent (now powered off)
        • reinstall lagun and make live
          • reinstalled (with NOSUID removed).
          • Remember proper certs for computing.help master
      • Remove ordershost config from steen
    • Check sysmans (et al) have 'nograce'.
    • Take a look at RT #78875
    • Look at RT and SL7RT
    • Look at differences for new 7.3 libvirt (2.0.0)

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • MPU SL7
      • Continue with bugzilla
      • Look at wake backend (running on Inf servers)
    • DICE encryption
      • Continue thinking and researching
    • Roll out fixed sleep code
    • Reschedule MPU futures meeting
    • Update PackagesSiteMirror
    • Schedule gaivota downtime to investigate LVM/IBM VG issue
    • Look at differences for new 7.3 libvirt (2.0.0)
    • Look at RT and SL7RT

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • submit polkit bug to redhat - with Alastair (check under 7.3)
    • SL7 MPU
      • Schedule LCFG master server
      • Create inf level login host
    • Continue with SL7.3
    • Investigate George's multiple network interfaces SL7 issue (eg consoles server)
      • waiting on George breaking metropolitan
    • LCFG annual review - produce minutes
    • Make both new NX servers live
    • Look at differences for new 7.3 libvirt (2.0.0)
    • Look at RT and SL7RT

-- AlastairScobie - 24 Jan 2017

Topic revision: r10 - 24 Sep 2019 - 13:50:24 - AlastairScobie
 