MPU Meeting Wednesday 1st August 2018

Inventory

  • The hosts' inventory headers are now syncing correctly to the LCFG master, thanks to Stephen.
  • On the problem of machines with multiple serial numbers: permitting multiple serial numbers to be set for a machine seems to lead to a lot of implementation complication. It seems far simpler to add an extra column named "altserial_no" and alter the serial number search to check both that column and the usual serial_no column, so that's what's been done.
    • The serial number which a machine returns on a DMI query will be the one used in the orders file; the alternative "supplier" serial number then becomes the "altserial_no".
    • Alastair will add a new procurement condition that servers should be able to return their serial number via a DMI query.
  • There was a bug in the command line options. The ii command was claiming success and exiting immediately when options with two dashes (--) were mixed with options with one dash (-), since the latter were interpreted as a single letter option with an argument. The code now checks its arguments for possible mistakes of this form.
  • We still want to be able to permanently override the network autolocation data with the manually set location.
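
A minimal sketch of the two-column serial number search described above. The column names serial_no and altserial_no come from these notes; the table name "hosts", the "hostname" column, the sample rows and the use of an in-memory SQLite database are assumptions made purely for illustration and do not reflect the real inventory schema or code.

```python
import sqlite3


def find_by_serial(conn, serial):
    """Return hostnames whose serial_no or altserial_no matches `serial`."""
    rows = conn.execute(
        "SELECT hostname FROM hosts "
        "WHERE serial_no = ? OR altserial_no = ? "
        "ORDER BY hostname",
        (serial, serial),
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build a throwaway in-memory table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hosts (hostname TEXT, serial_no TEXT, altserial_no TEXT)"
    )
    conn.executemany(
        "INSERT INTO hosts VALUES (?, ?, ?)",
        [
            # DMI-reported serial in serial_no, supplier serial in altserial_no
            ("alpha", "DMI-0001", "SUP-9001"),
            # a machine with no alternative serial number
            ("beta", "DMI-0002", None),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_by_serial(conn, "DMI-0001"))
    print(find_by_serial(conn, "SUP-9001"))
```

With this shape a machine is found whichever of its two serial numbers is supplied, and machines with no alternative serial number simply leave altserial_no empty.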

Virtual Desktop

  • The staff server will be upgraded from NX to RDP early next week.
  • Alastair has noticed that sessions on metropolitan seem noticeably slower than on the test VMs - but then, metropolitan is nine years old. He has also noticed that slow performance seems to be related to latency rather than throughput. We agreed that software which still works with the ancient NX libraries runs faster under NX than under RDP - but we hope to substantially speed up the user experience with the introduction of new hardware.

LCFG Profile Security

A few headers have been tidied, and Stephen fixed an ordering issue related to kdcregister.

UEFI Boot

A new project should be created for UEFI bootable media. The current install ISOs are only bootable in legacy mode, but then they're only ever used on virtual machines, which can only boot in legacy mode anyway. This project is otherwise ready to be written up. We're likely to have machines soon which have to be booted in UEFI mode - machines for the AV stands with NVMe SSDs.

SL 7.5

Stephen has updated the software collections which are on a standard DICE desktop. The main news is the removal of devtoolset6 (so version 7 is now the only version of gcc) and of python35 (we have python36 and the standard DICE version will be 3.4). The upgrade to 7.5 on stable machines will happen on Monday 20th August.

Misc Development

There's been a major reorganisation of python packages on DICE. Two versions are available:
  • Python 3.6 is available via the python36 software collection. It has the core of Python 3.6 but no additional modules are yet available. We hope that they will become available in time.
  • The standard version of Python on DICE is 3.4. This has a full ecosystem of additional modules. We note that Python 3.4 now has a limited life, and that for instance some core scientific Python modules are expected to drop support for 3.4 some time in the next year. But for the moment, our Python 3.4 environment is our first full Python environment.

Operational

  • We have dropped support for SL6. This means that:
    • no more security updates will be added via LCFG.
    • no more SL6 install ISOs will be built.
    • there is no longer an sl6_64 build host.
    • the virtual machines used to test sl6_64 will be binned.
    • the sl6_64 option on Package Forge will be retained for the time being.
    • sl6_64 will be dropped from the LCFG web site.
  • VirtualBox has been updated to 5.2.16, and VirtualBox guest additions has also been updated.
  • We note that a new Virtual DICE will be needed soon.
  • The KB power down went well.

Next meeting

The next meeting (on Wednesday 8th August) will be devoted to completing the 2018-21 plan for MPU server purchases.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Start work on final report!
      • Consider what else needs done other than docs and tidying and backups
      • Blog something....take dev meeting talks
      • and give details on how Tartarus tables are accessed to Ian D for inclusion in his privileged access discussion paper
      • Look at postgresql replication (do after shipping)
      • Add tartarus info to SwitchToSelfManaged
      • Complete removal of non authenticated access to API and web
      • Need tests for API /orders and need new tests to check for correct authorisation
      • Make lcfg header generation live (need to check what will be deleted when we do this - big discrepancy between old inventory and new)
      • Look at user support form - how does that lookup hostname?
      • Produce a python library to provide people with a programmatic equivalent of ii query
      • Look at whether there is an easy library way for Chris to grab the macaddr of a machine given the hostname
    • Schedule MPU meeting to discuss systemd ordering
    • Take a look at RT #78875
    • Look at /etc/hosts - dns issue (IPV6?)
      • work out what we need to fix current problem
    • Circulate info on RH7.3 systemd changes we may wish to consider
    • RT actions (as agreed)
    • Implement change to kvmtool to allow KVMs to be marked as disabled
      • looked at this - looks like the metadata tag isn't passed through libvirt (prior to 4.0.0), so can't be read/written by kvmtool
      • put on activities list to do once upgrade to libvirt-4.0.0
    • Look at Stephen's 'Thoughts on shell components'
    • Look at MPUActivitiesList
    • Start looking at https and computing.help (remove assumption that https means want cosign login)
      • wait on Neil's efforts with EdWeb
    • Chase Alison about LCFG check monitoring ( start doing again )
    • Investigate systemd reboot bug on gaivota and add some more debugging (store tree diff somewhere)
    • Report at next ops meeting that we have changed the journald configuration (MPU report)
    • Discuss with Neil - drupal username collection re GDPR
      • write a script to remove users who haven't used computing.help in, say, 30 days (except COs) - and fix the email address issue (currently defaults to umich.edu)
      • Ask George whether this is covered by legitimate business interest.
    • Inventory stuff re GDPR
    • Check with Tim / George about capability for login to student machines - where are we
    • Add %slaac to hulp and lagun after 21/02/18
    • Useful? - a script which checks how fast a machine's console log is growing (eg huge number of dbus problems on hammersmith)
      • suggest to Ian D
    • Blog on projects
    • KVM pcid
      • Created MPUSpectreMeltdown
      • Put detection script somewhere for people to use
      • Which CPU is needed for each group..
The following config worked on 'brent' (hosted on vermelha). We might need to consider whether we want "match='exact'" with regard to migrations.
<cpu mode='host-model' match='exact'>
<model fallback='allow'>IvyBridge</model>
<vendor>Intel</vendor>
<feature policy='require' name='pcid' />
</cpu>
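One way to confirm from inside a guest that a CPU definition like the one above has taken effect is to look for pcid among the CPU flags that Linux reports in /proc/cpuinfo. The sketch below is illustrative only and is not the detection script mentioned above; the function name has_cpu_flag is an assumption.

```python
def has_cpu_flag(cpuinfo_text, flag):
    """Return True if every processor listed in cpuinfo_text reports `flag`.

    cpuinfo_text is expected to look like the contents of /proc/cpuinfo on
    x86 Linux, where each processor has a line starting with "flags".
    """
    flag_sets = [
        # Split into whole words so that e.g. "invpcid" does not match "pcid".
        set(line.split(":", 1)[1].split())
        for line in cpuinfo_text.splitlines()
        if line.startswith("flags") and ":" in line
    ]
    # With no "flags" lines at all, nothing can be confirmed.
    return bool(flag_sets) and all(flag in flags for flags in flag_sets)
```

On a guest this could be called as has_cpu_flag(open("/proc/cpuinfo").read(), "pcid").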
    • Look at why kvmtool doesn't work on circle (running libvirt 4.0.0)
    • Read and comment on Stephen's notes on the LCFG security project
    • Remove IBM disk array from stack
    • Read Chris's blog on ThoughtsOn403
    • Look at moving stuff from the immediate todo back to the main Todo list and then we can prioritise that list
    • Think about spending
    • Look through the entitlements / no grace period issue
      • look through access.conf and work out how the entitlements are constructed

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • Look at MPUActivitiesList
    • Look at RT
    • Continue work on SL7 coordination final project report (currently pending other units completing)
    • User training materials project #403
    • Think about spending

  • Stephen
    • RT actions (as agreed)
    • submit polkit bug to redhat - with Alastair (still exists under 7.3)
    • Produce some text for systemd mount bug (to submit to RH)
    • Take the issue of disabling per-user journald logs on certain servers to OPS
    • Consider PD work for after LCFG client ...
      • looking at Ceph
    • Look at MPUActivitiesList
    • On metropolitan, find the fastest baud rate at which we can drive the real physical consoles (so that we can decide whether to use physical consoles for KVM servers) - 115200 seems fine.
    • Look at where we're using ALL in access.conf
    • Agree with RAT how software package requests are handled - waiting on Graham documenting
    • Finish off NX replacement project (#389)
      • Fix the keyboard mapping issue
      • Roll out
    • Close off UEFI project - MPU410FinalReport
    • Think about spending

-- AlastairScobie - 01 Aug 2018

Topic revision: r6 - 23 Sep 2019 - 13:33:41 - AlastairScobie
 