MPU Meeting Wednesday 21st November 2018

Inventory

No activity.

Virtual desktop

No activity.

LCFG profile security

Stephen has been looking at file permissions, for example those of the LCFG status files and the Berkeley DB files. The aim is for them to be owned by root and readable by members of the lcfg group. At the moment various bits of code (the client component, the ngeneric component, perl-ngeneric) all create directories for these files, and the permissions they set don't quite match up. The solution will be to have the client create these directories: it'll have a subroutine which creates all the standard directories on demand. Shell and Perl components will assume that the directories have already been created. The lcfginit script will call for the directories' creation. It runs early in the boot process, so it'll ensure that there's a sane set of directories. This will be tested for a while before being rolled out.
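As a rough illustration of the "create all the standard directories on demand" idea, here is a minimal Python sketch. It is not the client's actual subroutine: the directory names, the function name and the 0750 mode are assumptions chosen to match the stated aim (owned by root, readable by the lcfg group). Changing ownership needs root, so the uid and gid are optional.

```python
import os

# Hypothetical directory names, for illustration only.
STANDARD_DIRS = ("status", "db")


def ensure_standard_dirs(root, names=STANDARD_DIRS, mode=0o750, uid=None, gid=None):
    """Create each directory under root if missing and enforce its mode.

    Safe to call repeatedly: existing directories are left in place and
    only have their permissions (and optionally ownership) reset.
    """
    paths = []
    for name in names:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)
        # chmod explicitly, because the mode passed to makedirs is
        # filtered through the process umask.
        os.chmod(path, mode)
        if uid is not None or gid is not None:
            # -1 leaves that id unchanged; changing owner requires root.
            os.chown(path, -1 if uid is None else uid, -1 if gid is None else gid)
        paths.append(path)
    return paths
```

Having one routine own both creation and permissions is what removes the mismatch: every caller ends up with the same mode regardless of which component ran first.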

The logserver is another obvious focus for LCFG profile security. It needs to be rewritten. This is probably enough effort to make a project in its own right, so it won't happen right away. To tide us over until then we may disable parts of the current logserver.

To run qxprof as a normal user you'll need to be in the lcfg group. (However, this is just policy - it could easily be varied from site to site or from machine to machine.)
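A quick way to check whether an account would satisfy this policy is to look up its group membership. The sketch below uses Python's standard pwd and grp modules; the helper name is made up for illustration, and it only reflects what the system's account databases report.

```python
import grp
import pwd


def user_in_group(username, groupname):
    """Return True if the user has the group as primary or supplementary group."""
    try:
        group = grp.getgrnam(groupname)
        user = pwd.getpwnam(username)
    except KeyError:
        # Unknown user or unknown group.
        return False
    return user.pw_gid == group.gr_gid or username in group.gr_mem
```

For example, user_in_group("someuser", "lcfg") would indicate whether that (hypothetical) user could run qxprof under the policy described above.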

Misc development

Spectre news:
  • The Spectre report now excludes VMs.
  • We have some hardware which isn't supported by Intel: the Dell Optiplex 780 and older (e.g. 755, 745); the HP 7900 and older; the Dell PowerEdge R200 and other 10th generation models; the Dell PowerEdge 1950 and other 9th generation models.
  • User Support will locate and replace all affected desktops - although hopefully some of this work will simply consist of updating the inventory to record that these machines were disposed of years ago!
  • azul will be the last MPU hardware to receive its Spectre fixes.

Operational

  • Stephen has tweaked the config for the ssh daemon. We've never explicitly disabled ssh keys on all machines, but from today's stable release disabling them will be the default setting. SSH keys are bad for security, and using them doesn't get you an AFS token.
  • The latest desktop model, the HP G4 SFF, has arrived.
    • Stephen has created a hardware header for it (dice/options/hp_elitedesk800g4.h) which will be in next week's release.
    • The good news is that sleep seems to be working, although Stephen will continue testing this for the next week or so to make sure.
    • It uses UEFI. Stephen will clone the UEFI settings for the use of Support staff. These settings will be saved in the MPU AFS area.
    • The details are on the LCFG wiki (HP EliteDesk 800 G4). This is linked from the new Supported Hardware page. Details of future models will also be linked from this page, and this will be the new standard place to record this information.
  • Stephen has upgraded gaivota to 7.5. Lots of firmware upgrades were applied at the same time, and this time all of them went on cleanly. He'll upgrade azul tomorrow.
  • The staff.xrdp.inf.ed.ac.uk server waterloo was rebooted to get its Spectre fix.
  • The general xrdp.inf.ed.ac.uk server hammersmith suffered a disk failure. Chris replaced the disk with a spare. He'll go back and swap this for a proper RAID pair to increase reliability.
  • We need to produce a Data Privacy Impact Assessment (DPIA) for new projects or services handling personal data. For details and guidance see:
    • George's mail from 8 November.
    • The DPIA guidance from the University's Records Management section.

Next meeting

Tuesday 4th December at 15:15.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (e.g. how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Start work on final report!
      • Give details on how Tartarus tables are accessed to Ian D for inclusion in his privileged access discussion paper
      • Look at postgresql replication (do after shipping)
      • Add tartarus info to SwitchToSelfManaged
      • Need tests for API /orders and need new tests to check for correct authorisation
      • Make lcfg header generation live (need to check what will be deleted when we do this - big discrepancy between old inventory and new)
      • Look at user support form - how does that look up the hostname?
      • Look at whether there is an easy library way for Chris to grab the MAC address of a machine given the hostname
    • Schedule MPU meeting to discuss systemd ordering
    • Take a look at RT #78875
    • Look at /etc/hosts - DNS issue (IPv6?)
      • work out what we need to fix the current problem
    • Implement change to kvmtool to allow KVMs to be marked as disabled
      • looked at this - looks like the metadata tag isn't passed through libvirt (prior to 4.0.0), so can't be read/written by kvmtool
      • put on activities list to do once we upgrade to libvirt-4.0.0
    • Look at Stephen's 'Thoughts on shell components'
    • Start looking at https and computing.help (remove assumption that https means want cosign login)
      • wait on Neil's efforts with EdWeb
    • Investigate systemd reboot bug on gaivota and add some more debugging (store tree diff somewhere)
    • drupal username collection re GDPR
      • configure live server to run the user expiry script
      • Fix up email domains for existing accounts and check that the fix for setting the domain to inf.ed.ac.uk is in place on the live service
      • need to ship fixed cosign module on live service
    • Inventory stuff re GDPR
    • Check with Tim / George about capability for login to student machines - where are we?
      • Tim says that we should create a capability that is given to the base cohort and set that capability to no-grace
    • Useful? - a script which checks how fast a machine's console log is growing (e.g. huge number of dbus problems on hammersmith)
      • suggest to Ian D
    • Blog on projects
    • KVM pcid
      • Investigate Spectre / Meltdown wrt VMs
      • Which CPU is needed for each group?
        Following config worked on 'brent' (hosted on vermelha). We might need to consider whether we want "match='exact'" wrt migrations.
          <cpu mode='host-model' match='exact'>
            <model fallback='allow'>IvyBridge</model>
            <vendor>Intel</vendor>
            <feature policy='require' name='pcid' />
          </cpu>
      • Update: looked at this. We should be safe to set CPU model to host-model on clusters where the CPU is identical across the cluster (KB and AT). However we can't where the CPUs aren't identical (IF) - here we should be able to set a base minimum machine (SandyBridge?). We'd need to check that migration works. Recent versions of virsh allow you to specify the hosts in the cluster and ask for a CPU model description which will work across all the cluster. Setting the base minimum to SandyBridge on 'oyster' fixed one of the Spectre flaws, but not all. It looks like we need a more up-to-date qemu-kvm to fix all the remaining flaws.
    • Remove IBM disk array from stack
    • Review https://computing.help.inf.ed.ac.uk/using-ssh-windows
    • Produce some notes from OSS
    • Ask Alison to decommission the dc7900 student cluster
    • Read George's mail of 8th November wrt DPIA
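The KVM pcid update above mentions asking for a CPU description that works across every host in a cluster. The underlying idea is roughly a set intersection: a guest can only rely on features that every host provides. The Python sketch below illustrates that idea only; it is not how virsh computes its answer, and the host names and feature lists are invented for the example.

```python
def baseline_features(host_features):
    """Return the CPU features common to every host in the mapping."""
    feature_sets = [set(features) for features in host_features.values()]
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


# Hypothetical hosts and feature lists, for illustration only.
cluster = {
    "host-a": ["sse4.2", "avx", "pcid"],
    "host-b": ["sse4.2", "avx", "pcid", "f16c", "rdrand"],
}

common = baseline_features(cluster)
print(sorted(common))
```

A guest restricted to the common set can migrate between these hosts without losing a feature it was started with, which is why a mixed cluster needs a lower base model than a uniform one.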

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • Look at MPUActivitiesList
    • Look at RT
    • Continue work on SL7 coordination final project report (currently pending other units completing)
    • User training materials project #403
    • Continue with RT ticket clearout as discussed in October
    • Add two more disks to hammersmith to create a RAID pair for /tmp
    • Read George's mail of 8th November wrt DPIA

  • Stephen
    • Submit polkit bug to Red Hat, with Alastair (still exists under 7.3)
    • Produce some text for systemd mount bug (to submit to RH)
    • Take the issue of disabling per-user journald logs on certain servers to OPS
    • Consider PD work for after LCFG client ...
      • looking at Ceph
    • Look at MPUActivitiesList
    • Look at where we're using ALL in access.conf
    • Finish off NX replacement project (#389)
      • produce final report
    • Continue with RT ticket clearout as discussed in October
    • Produce plan for upgrading Forum KVM servers to SL7.5 (Stephen and Alastair to do)
    • Review https://computing.help.inf.ed.ac.uk/self-managed-security
    • Read George's mail of 8th November wrt DPIA

-- AlastairScobie - 21 Nov 2018

Topic revision: r9 - 23 Sep 2019 - 13:33:41 - AlastairScobie
 