MPU Meeting Thursday 18th August 2016

Inventory

Alastair has improved the documentation for the REST API. There are now examples. Please try it out and let him know how you get on.

LCFG Client Refactoring

Nothing this week.

MPU SL7

SSH
Both ssh servers have been upgraded. There were only minor issues.
autofs on SL7
Stephen spent some time investigating lcfg-autofs on SL7 and found the following issues:
  • There was no explicit systemd start ordering between the daemon and the component, which led to an unpredictable race in which either systemd or the component could start the daemon. It appears that everything only functioned correctly when the component "won". Fixed in revision 51434.
  • The autofs daemon needs to be able to make DNS lookups for the LDAP server. This normally works fine but on servers we may only have the localhost nameserver listed in resolv.conf. That can lead to boot time issues when autofs starts before bind is ready to handle requests. Fixed in revision 51435.
  • The component uses the ngeneric IsStarted method to decide whether or not to restart the daemon. At boot time, when the component is not yet started but the daemon has already been started by systemd, the component consequently does not do the restart needed to have the configuration reloaded. There's a patch pending which should fix this. See Bug:967.
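As a rough illustration of the DNS issue above, the following Python sketch reports whether a resolv.conf lists only loopback nameservers, which is the situation in which autofs can start before bind is ready. The function names are hypothetical and this is not part of lcfg-autofs; it is only a way of spotting servers that might be affected.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses given on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def _is_loopback(address):
    """True if the address parses as a loopback IP (e.g. 127.0.0.1 or ::1)."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Anything we cannot parse is treated as not loopback.
        return False


def only_loopback_nameservers(resolv_conf_text):
    """True if at least one nameserver is listed and all of them are loopback."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and all(_is_loopback(s) for s in servers)
```

On a real machine the text would come from reading /etc/resolv.conf; a True result only suggests that the host depends on its local nameserver at boot.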
Staff NX
staff.nx was upgraded to SL7. The first install attempts failed at the creation of the hostclient principal due to incorrect nameserver addresses being returned by dhclient. When this was fixed, a restarted install went to completion. Some time later a few odd symptoms were noticed: KDE sessions could not start; systemctl could not run. These were fixed when the polkit service was started. It has not yet been established why it was not running. While debugging this we found that any restart of the dbus service has to be followed by a restart of systemd-logind (Red Hat 1271394).
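The dbus observation could be captured in a small helper along the following lines. This is a minimal sketch, not an existing tool: restart_plan is a hypothetical name, and it only builds the list of systemctl commands rather than running them.

```python
def restart_plan(services):
    """Build the systemctl commands needed to restart the given services.

    A restart of dbus is always followed by a restart of systemd-logind,
    as noted in the minutes (Red Hat 1271394).
    """
    plan = []
    for name in services:
        plan.append(["systemctl", "restart", name])
        if name in ("dbus", "dbus.service"):
            plan.append(["systemctl", "restart", "systemd-logind"])
    return plan
```

Each entry in the returned list could be passed to subprocess.run by a caller with suitable privileges. The sketch does not remove duplicates if systemd-logind is also requested explicitly.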
Package cache and PXE servers
These have several aspects:
  • TFTP now works on SL7. It now uses systemd rather than xinetd.
    • This brought to light a bug in TFTP (Red Hat 1023645) - it cannot listen for IPv4 and IPv6 on the same port. Unless instructed to listen for just one or the other, it defaults to listening for only IPv6. We got round this by restricting it to IPv4.
  • PXE is working too.
  • Squid eventually worked once the unhelpful changelogs had been negotiated - see the comments in SL7RT#11.
PowerEdge R720
This model had been incorrectly passed as working when it lacked support for the new network interface naming scheme. Support has now been added, which fixed the bonding problems on one machine.
KVM guest migration on SL7
We wondered last time whether it was still the case that KVM guests, having been migrated from an SL6 host to an SL7 host and then power-cycled, cannot be migrated back to SL6. It is still the case: Chris checked this on an SL7 KVM server running the latest qemu-kvm-rhev packages.
SL7 KVM guest on SL7 KVM server
Creating an SL7 KVM guest on an SL7 KVM server (currently only metropolitan, gaivota and girassol, all of which are currently still test machines) should be done with kvmtool --flavour sl7-sl7.

Miscellaneous Development

SL 6.8
This is now looking solid and ready to use, so it will be made the default on the develop release on Monday.
Dell System Update
We've been trying out new ways of finding and applying firmware updates to Dell servers. The most promising so far seems to be dsu. For the story so far see RT:78685 and live/dsu.h.

Operational

VirtualBox 5.1 broken update
The latest VirtualBox update left some machines unable to run VirtualBox. Stephen is rolling out a fix for lab machines, which were rebooted before the problem was spotted. Any other machines suffering from the problem can be fixed with:
  • /usr/lib/lcfg/conf/kernel/scripts/vbox install
  • systemctl start vboxdrv
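To find machines that may need the fix, one option is to check whether the vboxdrv kernel module is loaded. The Python sketch below parses /proc/modules for this. It assumes that an affected machine is one where vboxdrv is not loaded, which the minutes do not state explicitly, and the function names are illustrative.

```python
def module_loaded(proc_modules_text, module):
    """True if 'module' appears as a loaded module in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, so only the
    first field is compared; names in the dependency column are ignored.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == module:
            return True
    return False


def vboxdrv_loaded(path="/proc/modules"):
    """Check the running kernel for the vboxdrv module."""
    with open(path) as handle:
        return module_loaded(handle.read(), "vboxdrv")
```

If vboxdrv_loaded() returns False on a machine that should be running VirtualBox, the two commands above are the fix to try.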
Updated qemu-kvm-rhev packages
These packages were updated this week. They're used on the SL7 KVM servers in place of the qemu-kvm equivalents. As hoped, Red Hat's rhev-watch-list mailing list advised of the update.
journald and fastbugs
The SL7 journald problem affected rabbit recently. It's due to a bug which has been fixed in a more recent version of journald which can be found in the fastbugs repository. Until now we haven't really used fastbugs, but there's no reason why we shouldn't. We intend to add it to the default updaterpms.rpmpath. This would be far less silly than occasionally having to copy desired packages from fastbugs to the world bucket.
SATA on Lenovo P310
We had wondered what decided which of the hard disk and the SSD became sda and which sdb. Alastair has discovered that whatever device is connected to the first SATA connector gets sda.
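To confirm which of sda and sdb is the SSD on a given machine, the kernel's rotational flag in sysfs can be read. The following is a minimal sketch; disk_kinds is a hypothetical helper, and it assumes the disks appear as sd* devices.

```python
import os


def disk_kinds(sys_block="/sys/block"):
    """Map each sd* device to 'hard disk' or 'SSD' using queue/rotational.

    The kernel reports 1 for rotational media and 0 for non-rotational.
    """
    kinds = {}
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        path = os.path.join(sys_block, name, "queue", "rotational")
        try:
            with open(path) as handle:
                flag = handle.read().strip()
        except OSError:
            # Device without a readable rotational flag: leave it out.
            continue
        kinds[name] = "hard disk" if flag == "1" else "SSD"
    return kinds
```

Calling disk_kinds() with no arguments on a P310 should show which device name each drive received, which can then be compared with the SATA connector it is plugged into.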
Multi-headed HP G2
Barry O'Rourke in Physics has finally got an HP G2 working with two monitors - by using an NVidia card and disabling the onboard graphics.

This Week

  • Alastair
    • Inventory project
      • continue working through InvProjectWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - InvProjectRESTapi
      • Continue populating new inventory so other folk can play with API
      • Start work on final report!
    • Remove default pool if ops meeting agrees
    • Dump 'atom'
    • Deploy encrypted /tmp and swap conversion script
      • Deploy on office desktops MONDAY 15th AUGUST
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • Reschedule MPU futures meeting
    • Continue building computing.help honeypot
    • package up ILW stuff and document process
    • Submit polkit bug to Red Hat - with Stephen
    • After next kernel update - Run named existence report on bandama
    • Continue researching whether 'discard' or fstrim is appropriate/possible for crypttab partitions
    • Once Stephen updated DNS part, submit SL7 server base project to August devel meeting for closing
    • Look at MPUActivitiesList
    • MPU SL7
      • Try bringing up an SL7 test server akin to 'otter' - package slave export
      • Chase Toby again about testing latest perl-Moose under prometheus (and then make live) after August 15th.
        • Toby has tried. Prometheus isn't happy. Toby reckons traits behaviour has changed.
        • Ask Toby if the above problem is on the server or the client (if just the server we can upgrade the base version and pin the version on the server)
    • Add the need for LCFG compiler analysis / benchmark to MPUActivitiesList
    • Reinstall 'muro' with hard disk as root and then Chris try benchmark again (wait for next stable release)
    • Add documentation on ssd-disk.h on LCFG wiki
    • Add looking at cgroups for NX service to MPUActivitiesList
    • Try 'dsu' on metropolitan
    • Upgrade some machines to SL6.8 (eg zip)
    • Join rhev-watch list
    • Test encryption script (need to reinstall a non encrypted desktop first)
    • Check sysmans (et al) have 'nograce'.

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • pkgsearch for SL7
      • Reimplement as a yum web front end (a yum search for a keyword produces an HTML file of links to a CGI script which runs yum info)
      • Needs to support multiple platforms
    • MPU SL7
      • Investigate KDE problems on staff.nx (SL7)
      • Schedule some migrations to new SL7 kvm server
      • Schedule upgrade of student.nx server
    • Investigate R730 iDRAC with Ian D
      • firmware upgrade - play more with 'dsu'
    • Look at MPUActivitiesList
    • Add SL6<->SL7 KVM migration info to MPU wiki docs on virtualisation
    • Figures

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
    • Investigate kernel component pipe moan by using shell commands instead of RPM module => waiting on 7.2 => activities list
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • Circulate dmesg proposal
    • Apply firmware patches - circle
    • Submit polkit bug to Red Hat - with Alastair
    • SL7 MPU
      • Continue work on package caches (PXE server and NFS to go)
    • Work on RT tickets
    • Add something about DNS to FinalProjectReport356
    • Look at MPUActivitiesList
    • Document BIOS settings for Lenovo box
    • Update circle to SL6.8 and play with 'dsu'
    • Check hardware model headers to make sure all models support new network naming scheme for SL7
    • Add fastbugs into updaterpms package path
    • Figures

-- AlastairScobie - 18 Aug 2016

Topic revision: r11 - 23 Aug 2016 - 13:08:06 - ChrisCooke
 