MPU Meeting Tuesday 25th October 2016


Alastair has still not had any responses to his request for testing of the REST API. Support for getting and setting attributes has now been properly fleshed out, and there is a table with details of what is supported. The new command-line tools (ii query/edit) now use the REST API. Alastair is working on adding validation to the supported parameters; this needs to handle the differences between setting and getting a parameter (where extra meta-characters must be supported). It would be nice to extract the business logic from the controller part of the REST framework into a separate module so that it can be tested directly without going via REST, which would improve maintainability and reusability. Alastair needs to talk to George about getting a supported data feed.
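As a rough illustration of that separation, the sketch below keeps validation and attribute handling in a plain Python class that a REST controller could call and that tests can exercise directly. Everything in it is hypothetical: the class and function names, the character rules, and the assumption that it is the "get" side which needs the extra meta-characters (shown here as the wildcards * and ?). It is not the inventory project's actual code.

```python
import fnmatch
import re

# Hypothetical rules: values being set are plain tokens, while values
# used in a query may also contain the wildcards '*' and '?'.
SET_RULE = re.compile(r"[A-Za-z0-9_.-]+")
GET_RULE = re.compile(r"[A-Za-z0-9_.*?-]+")


class ValidationError(ValueError):
    """Raised when a parameter value fails validation."""


def validate(value, mode):
    """Check a value against the rule for 'get' or 'set' and return it."""
    rules = {"set": SET_RULE, "get": GET_RULE}
    if mode not in rules:
        raise ValueError(f"unknown mode: {mode!r}")
    if not rules[mode].fullmatch(value):
        raise ValidationError(f"invalid value for {mode}: {value!r}")
    return value


class AttributeStore:
    """Business logic with no knowledge of HTTP (hypothetical class)."""

    def __init__(self):
        self._hosts = {}  # host name -> {attribute name: value}

    def set_attribute(self, host, name, value):
        # Setting never accepts wildcards in any field.
        validate(host, "set")
        validate(name, "set")
        validate(value, "set")
        self._hosts.setdefault(host, {})[name] = value

    def get_attribute(self, host_pattern, name):
        # Only the host field of a query may carry wildcards.
        validate(host_pattern, "get")
        validate(name, "set")
        return {
            host: attrs[name]
            for host, attrs in self._hosts.items()
            if name in attrs and fnmatch.fnmatchcase(host, host_pattern)
        }
```

A controller would then only translate HTTP requests into calls such as store.get_attribute("v*", "location"), and unit tests could call the same methods without starting a web server.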

LCFG Client Refactoring

The new functions added to the lcfg-utils library have been documented. The implementations of some utility functions were improved to make them more robust. The CMakeLists.txt was also improved so that binaries are always installed correctly.

The new packages library now supports reading a package list from a cpp-style rpmcfg file.

Testing revealed a weird issue with loading Berkeley DB files for which the user does not have file-system write permission. It appears they must be loaded with the RDONLY option explicitly specified; otherwise nothing is loaded, since write access cannot be acquired.
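The same idea can be sketched in Python, with the standard dbm module standing in for the Berkeley DB API used by the client (an assumption made purely for illustration; the function name is hypothetical). The point is that a reader which only needs the data asks for read-only access explicitly rather than relying on the default open mode.

```python
import dbm


def load_readonly(path):
    """Return the contents of a DBM file as a dict, opening it read-only."""
    # Flag "r" requests read-only access, so the library does not try
    # to acquire write access to the file.
    with dbm.open(path, "r") as db:
        return {key: db[key] for key in db.keys()}
```

Keys and values come back as bytes, so callers would decode them as needed.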


KVM servers
azul has now been upgraded and Chris will upgrade oyster fairly soon. Stephen noted a problem with a halt of the VM mole apparently never finishing; we need to check that the XML is correct. He also noted that the "purge" option for the kvmtool delete command appears not to work. There is an oddity with the volume paths for some migrated VMs.

Export packages server
This has been moved to jornets and is now live.
Alastair has begun work but has encountered problems with cosign which seem to also affect SL6 installs. He will discuss the issue with Toby and Neil.

The pkgforge builders and master server have been upgraded. Stephen took the opportunity to upgrade to PostgreSQL 9.6 and use some of the new features (e.g. jsonb support). He has also made some minor improvements to the web interface to improve the timestamp handling.

There is a test buzzsaw installation running on the new loghost copernicus. It is currently importing data but not generating reports. The next step is to start generating reports and compare them with those from tycho.

lcfg slaves
The test slave server vole has been upgraded. Stephen is now working on the diydice server, which doesn't need much other than apache 2.4 support. During this work it was discovered that when the LCFG server imports data using rsync, the files can end up with the wrong owner and/or group. The options for the rsync command (in the server.ropts resource) have been fixed.

packages server
The main packages server juice is ready to be upgraded. This has to be done using the DR server for packages. Stephen will test that this all works before attempting to upgrade the real thing. The upgrade will need to be done at a time which is convenient for other COs.

Miscellaneous Development

virtual dice
The new version is almost ready for deployment.

There is a problem with dsu mounting a volume twice and not doing any unmounts. Chris has reported this to Dell and apparently it will be fixed in the release for November or December. When the double mount happens, the hwmon nagios check will complain about read-only volumes.

swap options
There is now an LCFG_DISK_SWAP_MOPTS macro which can be used to set the mount options for the swap partition. This is for bug#974.

rhel extras
There is a new header which adds this "external product" repository to the rpmpath. We should look at whether this provides newer versions of anything useful.

lcfg-yum problem
The MDP users of the yum component found that a /vars directory was being created on their machines. This was caused by them using the wrong schema version. To avoid this happening again, the schema is now included in the lcfg-defaults.rpms package list.

Toby needed to build a 32bit version of the kerberos package for SL7 to satisfy dependencies. To make this easier, a mock config has been created which allows the building of 32bit packages using the i686 "alternate arch" provided by CentOS. This needed some enhancements for the mock components to allow the setting of rpmbuild macros. Stephen will investigate making this available via pkgforge as a non-default option.

php 5.5 and 5.6
Graham and Stephen looked at supporting newer versions of php on SL7 using the standard apache httpd. The modules provided by the software collections appear to work, although they depend on a slightly newer version of apache 2.4 which is installed in a non-standard location. This might be useful for other sites, e.g. wordpress installs. The downside is that we don't know how responsive Red Hat are to security holes in software collections. We will also have to handle any updates manually. There are new headers - dice/options/apacheconf-php55.h and dice/options/apacheconf-php56.h. Only one version of php5 can be used at a time; the last header included wins.

There is a new dice/options/debuginfo.h header which includes the debuginfo repository into the rpmpath. This also provides a DICE_OPTIONS_DEBUGINFO_KERNEL macro which can be used to add the correct debuginfo packages for the current DICE kernel into the packages list for a profile.

The runtime for systemtap will now be installed on all SL7 machines so that systemtap modules can be loaded into the kernel. Machines with devel packages will also have the client and devel packages for systemtap, which allow the building of modules. This makes SL7 the same as SL6, which was set up that way a few years ago when we had a bad kernel security issue.


Dirty Cow
We have been dealing with this kernel security hole.

Some students found a problem with the behaviour of dm-tool which is provided as part of lightdm. See RT#79906 and RT#79870 for details. Upgrading to the latest version of lightdm from epel has fixed the problem.

There are no longer any SL7 machines with named problems so we can consider this problem resolved.

IBM array
Alastair has tidied the volumes and ACLs on the IBM array. It looks like we only have one machine still using this for storage so maybe it can be retired next year?

lcfg defaults
There are lots of lcfg defaults packages missing from SL7. Probably many of them just need an update to the version in the lcfg-defaults.rpms package list.

We should look at the latest version.

SL6 install problems
There were a couple of problems with installing SL6 machines. A mistake was made when the PXE service was upgraded which left the symlink for the x86_64 installroot pointing to the 32bit version. There were also many diceinstallbase profiles missing for SL6.8 testing and stable profiles. Maybe that part of the minor-platform update process should be scripted to avoid a repeat?
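One way such a check might be scripted is sketched below: after a PXE service upgrade, confirm that each architecture's installroot symlink points at a target for the right architecture. The function name, the directory layout and the substring test are all hypothetical; the real installroot paths are not given in these minutes.

```python
import os


def check_installroot_links(root, expected):
    """Return a list of problems found with installroot symlinks.

    root     -- directory holding one symlink per architecture
    expected -- mapping of link name to a substring its target must contain
    """
    problems = []
    for name, marker in sorted(expected.items()):
        link = os.path.join(root, name)
        if not os.path.islink(link):
            problems.append(f"{name}: not a symlink")
            continue
        target = os.readlink(link)
        if marker not in target:
            problems.append(
                f"{name}: points to {target!r}, expected {marker!r}"
            )
    return problems
```

An empty list means every link looks right; anything else could be reported before the service goes live. A similar loop over the expected diceinstallbase profile names could flag missing profiles.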

SL6 32bit
We still need it for RAT licence serving. However at some point they'll have to move that to SL7 and another solution will be found then.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - InvProjectRESTapi
      • Start work on final report!
    • Remove default pool if ops meeting agrees
    • Deploy encrypted /tmp and swap conversion script
      • Deploy on office desktops September 7th/8th
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • package up ILW stuff and document process
    • submit polkit bug to redhat - with Stephen
    • Once Stephen updated DNS part, submit SL7 server base project to August devel meeting for closing. Awaiting Tim checking all boxes have been ticked
    • Look at MPUActivitiesList
    • MPU SL7
      • Chase Toby again about testing latest perl-Moose under prometheus (and then make live) after October 1
      • Continue with (discuss cosign with Toby and Neil)
        • Now ready for upgrade
      • Look at IBM DS3524 monitoring
        • Is there any point in doing this if we're going to decommission the array before SL6 finally disappears?
    • Check sysmans (et al) have 'nograce'.
    • Take a look at RT #78875
    • Produce list of missing defaults files (for SL7)
      • first pass - make sure that lcfg-defaults.rpms references latest versions
    • Add systemtap mechanism to activities list
    • Propose a live header for office desktop

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
      • Try REST API
    • pkgsearch for SL7
      • reimplement as a yum web front end (a yum search for a keyword produces an html file of links to a cgi which does yum info)
      • Need to support multiple platforms
    • MPU SL7
      • Emphasise architectural differences between the new kvm servers in the documentation
      • Move guests from jubilee and hammersmith onto new kvm servers - we can use the disks in the other KVM servers to assist in upgrades
    • Look at MPUActivitiesList
    • Check with RAT whether we still need SL6 32bit - Add reasons to minutes
    • Look to see if there's a Dell R series server which has the same CPU as 'muro'
      • Iain has an R330 with the same CPU. Check result of running LCFG slave on this
    • Roll out fixed sleep code
    • Any remaining work with deploying 'dsu'
    • Release new Virtual DICE image
    • Update spending plan
    • Consider spending plan
    • Reschedule MPU futures meeting

  • Stephen
    • Inventory project
      • Try REST api
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
    • Investigate kernel component pipe moan by using shell commands instead of RPM module => waiting on 7.2 => activities list
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • Circulate dmesg proposal
    • submit polkit bug to redhat - with Alastair
    • SL7 MPU
      • continue work with buzzsaw
      • continue looking at LCFG slaves (including DIY DICE)
    • Work on RT tickets
    • Look at MPUActivitiesList
    • Check hardware model headers to make sure all models support new network naming scheme for SL7
    • Consider spending plan
    • Look at whether chkrootkit is still maintained, and if so update

