MPU Meeting Thursday 22nd December 2016

Inventory

Carol and Lindsey have been testing the ii tools and have provided some feedback. Chris has tested the API and spotted some missing functionality.

MPU SL7

computing.help
The staging and backup servers are now on SL7; we still need to kill off the old VMs. Will upgrade drupal and then upgrade the master server to SL7.

wake
Chris asked for some help with the apache config.

bugzilla
Version 5.0 has some dependency issues related to the Email::Sender module; Stephen will take a look. For 4.4 there are some minor dependency issues which can probably be fixed by using a filter in the specfile.

LCFG master
The rfe daemon is now working properly; it needs local versions of the Authen::Krb5 and GSSAPI modules. Quite a few ancient defaults packages have been dropped. Others have been moved to the dice level to reduce the maintenance effort involved. Similarly, a few common requirements have moved from dice to lcfg level to help Kenny.

INF-level LCFG test server
There is a new INF-level test LCFG server (master/slave) on polecat. This will run alongside the old SL6 server on barents so we can spot problems on either platform. This was a useful stepping stone to getting the LCFG master upgraded since it allows us to easily check for missing defaults packages.

LCFG DR server
Upgraded to SL7. Another useful stepping stone on the way to upgrading the LCFG master. A few problems had to be ironed out with the apache config due to the complex mix of services (LCFG master and slave, package slave) it provides.

Log cabin
The django packages have been updated for SL7, various dependencies have been built, and a test service is now running on circlevm0.

Miscellaneous Development

dsu
The support for dsu is now finished and it is included in all appropriate hardware headers.

fail2ban
There were some dependency problems with the lcfg-fail2ban package and the latest version of fail2ban; we eventually got them fixed for both SL6 and SL7.

lcfg-defetc-el7
Updated to add ceph user and group with ID 167, closes: bug #980

portreserve
Enabled this on SL7 to help prevent some services stealing ports from the LCFG client.

python 3.5
There is now a software collection header which provides python 3.5.

Operational

IBM array
There are now volumes for the Forum KVM servers. girassol is OK. We cannot successfully pvcreate on gaivota. azul is not using the correct LVM UUID scheme and needs recreating; we will add a note about this to the LCFG profile.

package mirror
We need some more disk space for the epel mirror on juice: either a new server or new disks.

NX
We need to get jubilee and hammersmith installed as NX servers; one of them should go to the AT server room first.

AT power down
We need to ask about the VMs on waterloo, to find out whether we can just leave the machine off until the following morning. We also need to remember to remove wildcat from the cache.pkgs DNS entry so that the lack of the server doesn't cause lots of updaterpms timeouts at boot time.

LCFG slave move
There is a new LCFG slave server, named leonardo, which is on better KVM server hardware.

rootmail
We have checked rootmail; there does not appear to be any excessive noise from MPU machines.

nvidia
All nvidia graphics drivers have been updated.

virtualbox
The latest version of virtualbox will be installed early next year before the start of semester 2.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - TartarusRESTAPI
      • Document REST API
      • Further encourage people to use API and ii commands
      • Write more of the ii commands and document as writing.
      • Speak to George about macaddr/space feed
      • Start work on final report!
      • Chase Tim about theon access credential for feed
      • Convert from mod-auth_kerb to mod-auth_gssapi (See Stephen for details)
      • How to represent VMs
    • Deploy encrypted /tmp and swap conversion script
      • Deploy as soon as possible
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • submit polkit bug to redhat - with Stephen
    • MPU SL7
      • Chase Toby again about testing latest perl-Moose under prometheus (and then make live) after October 1
        • Toby reckons now fine - will update immediately after Xmas
      • Upgrade computing.help servers
        • Kill off hjaelpe and brent (now powered off)
        • Upgrade drupal on hilfe
        • Replace hilfe with new SL7 master 'lagun' ready to go any time
        • Can't do any more until stable of 11th Jan
        • Remember proper certs for computing.help master
      • Consider whether ordershost could move to bandama
        • waiting on Stephen finishing rfe support for SL7
        • Create own VM for this - 'bandama' is still just a development server
        • Pretty well ready to go on 'nerano'. Will just do the switch one evening.
    • Check sysmans (et al) have 'nograce'.
    • Take a look at RT #78875
    • Finish setting up IBM array volumes for Forum based KVM servers
      • gaivota refusing to 'pvcreate' the new volume - claims not found or filtered, but can't see obvious reason why - investigate.
        • can't see any obvious reason why this isn't working. Best investigate next time the machine is rebooted
      • azul - PV/VG wasn't re-created for SL7, so not using the new UUID scheme. As a result, the component isn't running properly as it can't find the PV (see blog item). Because the component isn't running, we can't create a VG on the new FC volume (using the component). -- Record all this in azul's profile. Create the new volume group manually.
        • Fixed by temporarily removing ap1 from lvm.vgs - in fact, perhaps we should remove this from lvm.vgs permanently?

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • pkgsearch for SL7 -> activities list
      • reimplement as a yum web front end (a yum search for a keyword produces an HTML file of links to a CGI script that runs yum info)
      • Need to support multiple platforms
    • MPU SL7
      • wake.inf.ed.ac.uk (with Stephen's help re x509/cosign)
      • Continue with bugzilla (preferably v5 based)
    • Roll out fixed sleep code
    • Reschedule MPU futures meeting
    • With Alastair, set up IBM array volumes for Forum based KVM servers
    • Replace waterloo lcfg slave with one on one of the KB KVM servers
    • Ask other units if any of their waterloo guests need restarted on AT power out evening
    • Update PackagesSiteMirror

  • Stephen
    • Inventory project
      • Try REST api
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
    • Investigate kernel component pipe moan by using shell commands instead of RPM module => waiting on 7.2 => activities list
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • Circulate dmesg proposal -> activities list
    • submit polkit bug to redhat - with Alastair (check under 7.3)
    • SL7 MPU
      • Schedule LCFG master server
      • Schedule juice upgrade (first week in new year)
    • Investigate George's multiple network interfaces SL7 issue (eg consoles server)
      • waiting on George breaking metropolitan
    • LCFG annual review - produce minutes
    • Replace piccadilly and northern with hammersmith and jubilee (NX service)
      • Physically move one to AT
    • Delete DICEMpuServersAutoReboot
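
The pkgsearch item in Chris's list describes turning yum search results into an HTML page of links to a CGI script that runs yum info. The following is only a rough sketch of that idea, not the planned implementation. It assumes that result lines in yum search output look like "name.arch : summary", and the CGI path /cgi-bin/pkginfo and both function names are hypothetical.

```python
import html
from urllib.parse import quote


def parse_search_output(text):
    """Return (package, summary) pairs from 'name.arch : summary' lines.

    Assumption: banner and plugin lines do not contain ' : ' after a
    package-like token, so they are skipped.
    """
    results = []
    for line in text.splitlines():
        name, sep, summary = line.partition(" : ")
        name = name.strip()
        # Skip anything that does not look like 'name.arch : summary'
        if not sep or " " in name or "." not in name:
            continue
        results.append((name, summary.strip()))
    return results


def render_html(keyword, results, cgi_url="/cgi-bin/pkginfo"):
    """Build an HTML page linking each package to a (hypothetical) info CGI."""
    items = []
    for name, summary in results:
        link = "{}?pkg={}".format(cgi_url, quote(name))
        items.append('<li><a href="{}">{}</a>: {}</li>'.format(
            html.escape(link), html.escape(name), html.escape(summary)))
    return ("<html><body><h1>Results for {}</h1>\n<ul>\n{}\n</ul></body></html>"
            .format(html.escape(keyword), "\n".join(items)))


# Sample text standing in for the output of 'yum search python'
SAMPLE = """Loaded plugins: fastestmirror
==================== N/S matched: python ====================
python.x86_64 : An interpreted programming language
python-libs.x86_64 : Runtime libraries for Python
"""

if __name__ == "__main__":
    print(render_html("python", parse_search_output(SAMPLE)))
```

Supporting multiple platforms could be handled by passing a platform name through to the CGI link as an extra query parameter, though that is not shown here.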

-- AlastairScobie - 22 Dec 2016

Topic revision: r13 - 24 Sep 2019 - 13:50:24 - AlastairScobie
 