MPU Meeting Wednesday 9th January 2019

Inventory

No activity.

Virtual desktop

No activity.

LCFG profile security

The security work on the logserver will become a separate project.

The permissions on LCFG data files (e.g. status, db) have been made more secure. This change will need a completely fresh install in order to test it safely. Once that's done, this project will be stalled until deployment.

User security training materials

This project will concentrate on the managers of self-managed servers. The aim will be to make them aware of basic good practice in security. One possibility will be to do this with a Learn course, one which could perhaps be used by people in other parts of the University. Stephen suggested that the project also cover services run from self-managed machines, especially services available outside the Informatics firewall, and it was agreed that this was a good idea.

Misc development

Stephen has been updating the LCFG build tools to add Debian package support. This comes in several parts:
  1. lcfg-reltool gendeb will create a debian directory for a project, containing metadata to make a Debian package.
  2. Once a project with a debian directory has been checked in, buildtools will automatically make a Debian source package at future builds. This can then be fed to dbuild on a Debian box to build the package.
  3. More commands will be added to lcfg-reltool to make it possible to drive the build process from a Debian box natively.
Autoconf-style macros will not be supported in Debian packaging files.

Operational

  • The security backports to 7.5 are all done. The lab machines have now got them, so all the DICE machines are now up to date.
  • The BIOS updates for our servers are mostly done.
  • Stephen and Alastair found and patched a bug in perl-Parse-DMIDecode which was making clientreport crash on certain models, including the Dell PowerEdge R440, so those machines weren't appearing in the inventory. The clientreport script was also improved to behave more gracefully in such error situations. It now runs correctly on the affected machines, and they are now in the inventory.
  • Prompted by this, Stephen and Graham have made a web report which lists clientreport errors. We'd like to thank Graham for his magical guru-level SQL!
  • Chris has been making the next Virtual DICE. (There's a test image in the new directory if you want to try it.) Guest logins are slow: on a test Windows 10 box they take a couple of minutes when the VM has the default 2GB of memory, and 1 min 20 s with 4GB of memory. Graham has obligingly put a short timeout on the ldapsearch commands in the bash defenv scripts, but this hasn't fixed the problem. Text-based guest logins on pseudoterminals are far quicker, at about a second.
  • We each have several pandemic documents to review and bring up to date, listed in our To Do lists below.
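One way to narrow down where the guest login time goes is to run each suspect command under a hard timeout and record how long it took. The following is a minimal sketch of that idea only, not the defenv scripts themselves: the function name run_with_timeout is hypothetical, and a stand-in slow command is used rather than a real ldapsearch invocation.

```python
import subprocess
import sys
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd, giving up after timeout_s seconds.

    Returns (timed_out, elapsed_seconds). The child process is killed
    by subprocess.run if the timeout expires.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    return timed_out, time.monotonic() - start


if __name__ == "__main__":
    # Stand-in for a slow lookup; substitute the real command under test.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    timed_out, elapsed = run_with_timeout(slow_cmd, timeout_s=1.0)
    print(f"timed_out={timed_out} elapsed={elapsed:.2f}s")
```

If every lookup returns well inside its timeout but logins remain slow, the delay is probably elsewhere in the login sequence.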

Next meeting

The next meeting, on 16 January, will be dedicated to the MPUActivitiesList and the This Week lists below.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (e.g. how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Start work on final report!
      • Give details on how Tartarus tables are accessed to Ian D for inclusion in his privileged access discussion paper
      • Look at postgresql replication (do after shipping)
      • Add tartarus info to SwitchToSelfManaged
      • Need tests for API /orders and need new tests to check for correct authorisation
      • Make lcfg header generation live (need to check what will be deleted when we do this - big discrepancy between old inventory and new)
      • Look at user support form - how does that look up the hostname?
      • Look at whether there is an easy library way for Chris to grab the macaddr of a machine given the hostname
    • Schedule MPU meeting to discuss systemd ordering
    • Take a look at RT #78875
    • Look at /etc/hosts - DNS issue (IPv6?)
      • work out what we need to fix current problem
    • Implement change to kvmtool to allow KVMs to be marked as disabled
      • looked at this - looks like the metadata tag isn't passed through libvirt (prior to 4.0.0), so can't be read/written by kvmtool
      • put on activities list to do once upgrade to libvirt-4.0.0
    • Look at Stephen's 'Thoughts on shell components'
    • Start looking at https and computing.help (remove assumption that https means want cosign login)
      • wait on Neil's efforts with EdWeb
    • Investigate systemd reboot bug on gaivota and add some more debugging (store tree diff somewhere)
    • drupal username collection re GDPR
      • configure live server to run the user expiry script
      • Fixup email domains for existing accounts and check fix for domain setting to inf.ed.ac.uk is in place on live service
      • need to ship fixed cosign module on live service
    • Inventory stuff re GDPR
    • Check with Tim / George about capability for login to student machines - where are we
      • Tim says that we should create a capability that is given to the base cohort and set that capability to no-grace
    • Useful? - a script which checks how fast a machine's console log is growing (e.g. huge number of dbus problems on hammersmith)
      • suggest to Ian D
    • Blog on projects
    • KVM pcid
      • Investigate spectre / meltdown wrt VMs
      • Which CPU is needed for each group?
The following config worked on 'brent' (hosted on vermelha). We might need to consider whether we want "match='exact'" wrt migrations.
<cpu mode='host-model' match='exact'>
  <model fallback='allow'>IvyBridge</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='pcid' />
</cpu>
      • Update: looked at this. We should be safe to set the CPU model to host-model on clusters where the CPU is identical across the cluster (KB and AT). However, we can't do this where the CPUs aren't identical (IF); there we should be able to set a base minimum model (SandyBridge?). We'd need to check that migration works. Recent versions of virsh allow you to specify the hosts in the cluster and ask for a CPU model description which will work across the whole cluster. Setting the base minimum to SandyBridge on 'oyster' fixed one of the Spectre flaws, but not all. It looks like we need a more up-to-date qemu-kvm to fix the remaining flaws.
      • Wait until 7.6ish is settled re KVM software versions, then try the above again
    • Remove IBM disk array from stack
    • Produce some notes from OSS
    • Read George's mail of 8th November wrt DPIA
    • Update inventory project milestones
    • Try latest VDICE on Windows 10 machine at home (research guest login delays)
    • Update Pandemic pages - computing.help
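
The console-log growth check suggested in the list above could be sketched roughly as follows. This is an illustration only, not an existing MPU script: the function names are hypothetical, the log path is supplied by the caller, and a file that shrinks between samples is assumed to have been rotated or truncated.

```python
import os
import sys
import time


def growth_rate(size_before, size_after, seconds):
    """Return growth in bytes per second between two size samples.

    If the file shrank, assume it was rotated or truncated and count
    only the bytes present in the new file.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if size_after < size_before:
        grown = size_after
    else:
        grown = size_after - size_before
    return grown / seconds


def sample_growth(path, interval_s=60.0):
    """Sample the size of path twice, interval_s apart, and return bytes/s."""
    before = os.path.getsize(path)
    start = time.monotonic()
    time.sleep(interval_s)
    after = os.path.getsize(path)
    return growth_rate(before, after, time.monotonic() - start)


if __name__ == "__main__":
    # Usage: python loggrowth.py /path/to/console.log
    # (script name and path are placeholders)
    rate = sample_growth(sys.argv[1], interval_s=10.0)
    print(f"{sys.argv[1]}: {rate:.1f} bytes/s")
```

A report built on this would flag machines whose rate stays above some agreed threshold over several samples, rather than reacting to a single burst.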

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • Look at MPUActivitiesList
    • Look at RT
    • Continue work on SL7 coordination final project report (currently pending other units completing)
    • User training materials project #403
    • Continue with RT ticket clearout as discussed in October
    • Add two more disks to hammersmith to create a RAID pair for /tmp
    • Read George's mail of 8th November wrt DPIA
    • Update Pandemic pages - KVM (done: PandemicKVM), XRDP, Releases

  • Stephen
    • Submit polkit bug to Red Hat - with Alastair (still exists under 7.3)
    • Produce some text for systemd mount bug (to submit to RH)
    • Take the issue of disabling per-user journald logs on certain servers to OPS
    • Consider PD work for after LCFG client ...
      • looking at Ceph
    • Look at where we're using ALL in access.conf
    • Finish off NX replacement project (#389)
    • Continue with RT ticket clearout as discussed in October
    • Review https://computing.help.inf.ed.ac.uk/self-managed-security
    • Read George's mail of 8th November wrt DPIA
    • Update project milestones (additional ones to cover deployment)
    • Discuss CUDA 10 / NVIDIA driver issues with Iain
    • Produce a project proposal for replacement LCFG logserver
    • Test secure LCFG profile storage wrt. fresh install
    • Firmware update - deneb and steen
    • Reboot staff.ssh (hare)
    • Complete tartarus clientreport module errors report
    • Update Pandemic pages - PXE, Security, LCFG, Package site mirror, ssh service
    • Produce draft spec for XRDP servers

-- AlastairScobie - 09 Jan 2019

Topic revision: r13 - 23 Sep 2019 - 13:33:41 - AlastairScobie
 