MPU Meeting Tuesday 28th March 2017

Inventory

Alastair is still working on the test suite, now concentrating on SQL comparisons. He's been developing a procedure for making a repeatable set of test results. He went to a talk on a Perl test framework at the spring conference. It's work in progress but interesting. He'll also be using Devel::Cover to improve test coverage.

MPU SL7

The test SL7 version of LCFG bugzilla is ready apart from:

  • LCFG branding. Chris will fiddle with CSS.
  • iFriend. Chris will test it, especially Stephen's address translation module, and will give LCFG users an opportunity to test it too.
  • Every now and then it stops responding. More data needs to be gathered. Chris has enabled Nagios monitoring in hopes that this will help pinpoint the problem.

Chris has tested wake-backend.h and thinks it should work on SL7.

LCFG Client refactoring

Stephen's new Bison grammar is now working.

He has finished documenting the code and the API. He has mapped the C library stuff into Perl as needed. He still needs to rewrite setctx which sets pending contexts.

Most aspects of the project are nearly finished.

Additional disk encryption

No activity.

Miscellaneous development

There's some systemd news:

  1. Alastair's new development version of the systemd component makes symlinks instead of empty files. It hasn't yet shipped, but it has been tested with the installroot and works there. As a bonus, the new version seems to fix the problem whereby getty didn't work on tty1.
  2. Stephen needs a target for after everything has started, so that he can use it to ensure that blocking kernel module loading is the very last bit of the boot process.
  3. Alastair has compared our systemd configuration with that on maipo, our native RHEL 7.3 box, and has found some extra services that we should probably be starting. He'll circulate a list.
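A comparison like the one in item 3 can be sketched as a set difference over the enabled services on each host. This is only an illustration, not the method Alastair used: it assumes each host's listing was captured as text from `systemctl list-unit-files --type=service --no-legend`, and the function names and sample service names are hypothetical.

```python
from __future__ import annotations


def enabled_services(listing: str) -> set[str]:
    """Return the enabled service names found in a unit-file listing.

    Each useful line is expected to look like "<unit> <state>".
    Header, footer and blank lines are skipped.
    """
    enabled = set()
    for line in listing.splitlines():
        fields = line.split()
        if (len(fields) >= 2
                and fields[0].endswith(".service")
                and fields[1] == "enabled"):
            enabled.add(fields[0])
    return enabled


def missing_services(reference: str, ours: str) -> list[str]:
    """Services enabled on the reference host but not on ours, sorted."""
    return sorted(enabled_services(reference) - enabled_services(ours))


if __name__ == "__main__":
    # Illustrative data only; these are not the services actually found.
    reference_listing = (
        "chronyd.service enabled\n"
        "sshd.service enabled\n"
        "kdump.service disabled\n"
    )
    our_listing = "sshd.service enabled\n"
    for name in missing_services(reference_listing, our_listing):
        print(name)
```

Filtering on the state column in Python, rather than relying on a command-line filter, keeps the sketch independent of the systemd version on either host.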

Operational

Stephen has finished the MPU's last few computing.help page reviews, so that task is complete.

We have some new servers.

  1. deneb is the new packages server, taking over both sites.pkgs.inf.ed.ac.uk (our mirrors of other sites) and http.pkgs.inf.ed.ac.uk (master server for local packages). The sites.pkgs.inf.ed.ac.uk function transferred without fuss from juice. The http.pkgs.inf.ed.ac.uk function was more awkward to transfer (from bruegel) because squid on the package cache servers was confused by having both sites.pkgs.inf.ed.ac.uk and http.pkgs.inf.ed.ac.uk resolving to the same address, and periodically returned status 403 instead of the expected packages. This was solved by giving deneb a virtual second interface. (Thankfully Apache 2.4 can easily be configured to listen on two interfaces.) So juice and bruegel have been freed up. The amount of disk space on juice makes it ideally suited to replace the ancient budapest as the OpenAFS build host, ridding us of another dependency on the SAN. bruegel will replace the elderly schiff as our main SSH server.
  2. altair and vega are the two new LCFG slaves. altair is up and running in the Forum. vega will be installed in AT. Neither has yet replaced the current VMs rembrandt and leonardo. So far altair is about 20% faster than rembrandt but it should be faster still, judging by our tests on a Lenovo P310; we'll investigate some more. Stephen tried taking out the RAID controller but that made virtually no difference to the profile compilation speed.
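The condition that confused squid in item 1, two service names resolving to the same address, can be checked with a short script. This is a minimal sketch under assumptions: the function names are hypothetical, and the address in the test data is from a documentation range, not a real one.

```python
from __future__ import annotations

import socket
from collections import defaultdict


def resolve(hostname: str) -> set[str]:
    """Look up every address (IPv4 and IPv6) that a hostname resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def shared_addresses(resolved: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each address used by more than one hostname to those hostnames."""
    by_address = defaultdict(list)
    for name, addresses in resolved.items():
        for address in addresses:
            by_address[address].append(name)
    return {
        address: sorted(names)
        for address, names in by_address.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    names = ["sites.pkgs.inf.ed.ac.uk", "http.pkgs.inf.ed.ac.uk"]
    clashes = shared_addresses({name: resolve(name) for name in names})
    for address, hosts in sorted(clashes.items()):
        print(address, "is shared by", ", ".join(hosts))
```

An empty result means each name has its own address, which is the state the virtual second interface was meant to achieve.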

The former KVM and NX server northern has been junked, and its sibling piccadilly is about to be.

oyster is next in the KVM server shuffle. Its VMs will be moved to the remaining three Forum-based KVM servers, after which it will be moved to AT and reinstalled. It'll then act as decant space for waterloo so that waterloo can be emptied and reinstalled. Both will then provide space for new VMs to replace MPU VMs currently in JCMB, hopefully creating enough space there to allow for the emptying and reinstall of both JCMB KVM servers.

Stephen found that it was easy to upgrade the iDRAC firmware using the machine's Lifecycle Controller. Just disconnect from the serial console, connect up a physical console, then boot into the Lifecycle Controller, follow the prompts to configure the network interface (to use DHCP on IPv4), then tell it to do a firmware update via ftp, accepting whatever defaults are offered. The update process is similar to using dsu - it finds out for you which updates are needed; you choose them, then click Apply; it downloads and applies them. The iDRAC was fine after its firmware update, presumably since it wasn't in use while the update was applying.

It's about time we had some new root passwords. We talked briefly about how we might make new ones. Chris will write up some possibilities and put them to COs.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - TartarusRESTAPI
      • Document REST API
      • Further encourage people to use API and ii commands
      • Write more of the ii commands and document them as they are written.
      • Speak to George about macaddr/space feed
      • Start work on final report!
      • Convert from mod-auth_kerb to mod-auth_gssapi (See Stephen for details)
      • How to represent VMs
      • Continue with REST API testing framework
    • Deploy encrypted /tmp and swap conversion script
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • submit polkit bug to redhat - with Stephen (check with 7.3)
    • Think how to regularly report on machines with no /var/log/journal
    • Check sysmans (et al) have 'nograce'.
    • Take a look at RT #78875
    • Look at RT and SL7RT
    • Look at /etc/hosts - dns issue
      • work out what we need to fix current problem
    • Project blog about inventory
    • Circulate info on RH7.3 systemd changes we may wish to consider
    • Produce a report on machines with wrong time (using clientreport)
      • module written - will hit stable on Wed 29th March
    • Contribute to MPU SL7 final project report
    • Fire up 'muro' for LCFG performance testing (to compare with altair)
    • MPU Review percentages on PD/OPS/Projects for T1 so far

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • MPU SL7
      • Continue with bugzilla
        • check iFriend access
        • Add nagios monitoring to spot the hanging
      • Look at wake backend (running on Inf servers)
    • DICE encryption
      • Continue thinking and researching
    • Roll out fixed sleep code
    • Reschedule MPU futures meeting
    • Look at RT and SL7RT
    • Think about whether we can use NX service for staff.login/student.login
    • Produce PXE boot image so that we can update the BIOS of HP 800 G2s
    • Contribute to MPU SL7 final project report
    • Machine moves (piccadilly, new LCFG slave)
    • Next KVM shuffle
      • empty oyster onto remaining Forum KVM servers
    • Consider new 'root' password(s)

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
      • complete package support
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • submit polkit bug to redhat - with Alastair (check under 7.3)
    • Investigate George's multiple network interfaces SL7 issue (eg consoles server)
      • waiting on George breaking metropolitan
    • Look at RT and SL7RT
    • Think about whether we can use NX service for staff.login/student.login
    • Draft a position note on shell components under SL8 and possible ways forward
    • Produce some text for systemd mount bug (to submit to RH)
    • Start SL7 final project report
    • Contribute to MPU SL7 final project report
    • Machine moves (piccadilly, new LCFG slave)

-- AlastairScobie - 28 Mar 2017

Topic revision: r4 - 24 Sep 2019 - 13:50:24 - AlastairScobie
 