MPU meeting Wednesday 7th February 2018

Inventory

It now works with IPv6. See the Operational section for more detail.

LCFG Client Refactoring

Nothing this week.

User Security Training

Chris has been putting some thoughts together which he'll circulate.

Virtual Desktop

Since SEE has already put together a remote desktop solution which works well and is freely available, there seemed little point in trying other possible solutions. A small test version of SEE's RDP-based remote desktop service has therefore been set up here in Informatics. It uses HAProxy as a load-balancing front end. In the trial service at xrdp.inf.ed.ac.uk, the front end routes each connection request to the less loaded of the two backend machines, or to the user's existing session if there is one. The test service will be open to computing staff.

It's envisaged that the full service will use real hardware rather than virtual machines, and that there will be separate staff and student services. The next work on this project will be to add some Informatics identity to the login screen; to change access controls in the PAM stack for the backend machines, to permit differences between the staff and student services; and to document the services on computing.help.
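
The routing behaviour described for the trial service can be illustrated with a small Python sketch. This is not the HAProxy configuration in use; the backend names are hypothetical, and measuring load as the number of sessions on a backend is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """A backend machine and the users who currently have a session on it."""
    name: str                                   # hypothetical backend name
    sessions: set = field(default_factory=set)

    @property
    def load(self) -> int:
        # Assumption: load is approximated by the number of sessions.
        return len(self.sessions)


def choose_backend(user: str, backends: list) -> Backend:
    """Return the backend holding the user's session, else the least loaded one."""
    # An existing session always wins, so the user reconnects to it.
    for backend in backends:
        if user in backend.sessions:
            return backend
    # Otherwise pick the least loaded backend and record the new session.
    target = min(backends, key=lambda b: b.load)
    target.sessions.add(user)
    return target


backends = [
    Backend("backend-a", {"alice", "bob"}),
    Backend("backend-b", {"carol"}),
]
```

With this starting state, a reconnecting user such as alice is sent back to backend-a, while a new user is sent to backend-b because it holds fewer sessions.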

See XRDPService for more details.

Miscellaneous Development

We can now distribute Virtual DICE images via rsync. This can be substantially faster than copying them from AFS, at least once the image has been cached in the rsync server's AFS cache. The rsync source for the current image is virtualdice.inf.ed.ac.uk::virtualdice/vdice.*
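
As a rough illustration, a fetch from that rsync source could be wrapped in Python as below. The destination directory and the choice of rsync options are assumptions, not a documented procedure; only the source string comes from the note above.

```python
import subprocess

# rsync daemon source quoted in the minutes (host::module/path).
RSYNC_SOURCE = "virtualdice.inf.ed.ac.uk::virtualdice/vdice.*"


def build_rsync_command(dest_dir: str, source: str = RSYNC_SOURCE) -> list:
    """Build the rsync argument list for copying the image files to dest_dir."""
    # -a keeps file attributes, -v is verbose, and --partial keeps partially
    # transferred files so that an interrupted copy of a large image can resume.
    return ["rsync", "-av", "--partial", source, dest_dir]


def fetch_image(dest_dir: str) -> None:
    """Run rsync and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_rsync_command(dest_dir), check=True)
```

Calling fetch_image("/tmp/vdice/") would copy the files into that (hypothetical) directory. The command is passed as a list with no shell involved, so the wildcard is expanded by the rsync server rather than locally.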

Both Geosciences and Physics have had corruption problems on their LCFG slaves. This may have been connected to the LCFG server not starting up at the best place in the boot sequence. It now starts after the stable target, so it won't start until any multi-stage boot sequence has completed successfully.

Alastair has applied Kenny's patch to make lcfg-fstab support NVMe SSD devices (see Bug:1025). It appears that when mini desktops with an SSD of >= 256GB are ordered through SelectPC, they come with NVMe SSDs rather than SATA ones. This is now being raised for us with the SelectPC group.

Stephen found a handy Perl module which reads Apache configurations, and has fashioned a script which parses our Apache configurations. This script's output is then fed into a second new script (courtesy of Neil) which searches for potentially troublesome obsolete 2.2 configuration directives. If we add this to an apacheconf header we'll be able to run it on all of our web servers and find most of the troublesome directives in one go.
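
The second stage of that pipeline, searching for obsolete 2.2 directives, might look something like the Python sketch below. It is not Neil's script: it scans raw configuration text line by line instead of using the parsed output of the Perl module, and the directive list is a small illustrative sample of 2.2-era directives, not the list actually checked.

```python
import re

# Illustrative sample of directives that were superseded in Apache 2.4.
OBSOLETE_DIRECTIVES = {"order", "allow", "deny", "satisfy", "namevirtualhost"}

# The directive name is the first word on a configuration line.
DIRECTIVE_RE = re.compile(r"^\s*([A-Za-z]+)\b")


def find_obsolete(config_text: str) -> list:
    """Return (line number, line) pairs for lines using an obsolete directive."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        match = DIRECTIVE_RE.match(line)
        if match and match.group(1).lower() in OBSOLETE_DIRECTIVES:
            hits.append((lineno, stripped))
    return hits


SAMPLE = """\
NameVirtualHost *:80
<Directory "/var/www/html">
    # Order is mentioned here only in a comment
    Order allow,deny
    Allow from all
    Require all granted
</Directory>
"""
```

Running find_obsolete(SAMPLE) reports lines 1, 4 and 5. The commented-out mention of Order and the 2.4-style Require line are not reported.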

Operational

We should sort out the pcid issue before upgrading any more of our KVM servers to 7.4.

Also, extra disks have been ordered for gaivota and girassol so the upgrade of the Forum-based KVM servers should take place after the installation of these disks.

KVM server oyster has been upgraded to SL7.4 and its firmware updated.

PXE installs are slow at KB. We've seen this before, but the issue went away when we tried to debug it. Our plan is to add a PXE server at KB and then configure DHCP at KB to use it.

Alastair has made tartarus suitable for use with IPv6. It turned out that using a specific address with a virtual host wouldn't work with IPv6. Instead, the virtual host address should be * and the virtual host should then be matched on server name.
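
To illustrate the kind of change involved, the Python sketch below rewrites address-specific VirtualHost lines to the wildcard form, leaving ServerName to select the virtual host. It is a hypothetical helper, not the change made on tartarus; the addresses and server name in the sample are documentation placeholders, and VirtualHost lines listing several addresses are not handled.

```python
import re

# Matches <VirtualHost ADDR> or <VirtualHost ADDR:PORT>, where ADDR is either a
# bracketed IPv6 address or any token without whitespace or a colon.
VHOST_RE = re.compile(
    r"<VirtualHost\s+(?:\[[^\]]+\]|[^\s>:]+)(?::(\d+))?\s*>",
    re.IGNORECASE,
)


def wildcard_vhosts(config_text: str) -> str:
    """Replace the address in each VirtualHost line with *, keeping any port."""
    def repl(match):
        port = match.group(1)
        return "<VirtualHost *:%s>" % port if port else "<VirtualHost *>"
    return VHOST_RE.sub(repl, config_text)


SAMPLE = """\
<VirtualHost 192.0.2.10:443>
    ServerName www.example.org
</VirtualHost>
"""
```

Applied to SAMPLE, the first line becomes <VirtualHost *:443> and the ServerName line is left untouched, so requests arriving over IPv4 or IPv6 are matched by name.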

Machines with NVMe need UEFI boot support.

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Start work on final report!
      • Consider what else needs done other than docs and tidying and backups
      • Blog something....take dev meeting talks
      • and give details on how Tartarus tables are accessed to Ian D for inclusion in his privileged access discussion paper
      • Look at postgresql replication (do after shipping)
      • Make IPv6 changes permanent
      • Add tartarus info to SwitchToSelfManaged
    • Schedule MPU meeting to discuss systemd ordering
    • Check sysmans (et al) have 'nograce'.
    • Take a look at RT #78875
    • Look at /etc/hosts - DNS issue (IPv6?)
      • work out what we need to fix current problem
    • Circulate info on RH7.3 systemd changes we may wish to consider
    • RT actions (as agreed)
    • Implement change to kvmtool to allow KVMs to be marked as disabled
    • Look at Stephen's 'Thoughts on shell components'
    • Look at MPUActivitiesList
    • Start looking at https and computing.help (remove assumption that https means want cosign login)
      • wait on Neil's efforts with EdWeb
    • Chase Alison about LCFG check monitoring ( start doing again )
    • Investigate systemd reboot bug on gaivota and add some more debugging (store tree diff somewhere)
    • If in Forum server room, review MPU rack usage
    • Start upgrading MPU servers to 7.4
      • upgrade salamanca - remember to update firmware (Check whether this is needed)
    • Upgrading MPU servers to 7.4
      • NX servers - hammersmith
    • Check that our journald configuration correctly implements our retention policy
      • It doesn't. journalctl shows entries from last year (eg May 17 for jubilee).
      • Possible solution is to set MaxRetentionSec=1month in /etc/systemd/journald.conf - but not convinced that setting this on existing machines clears up old per-user journals for inactive users
    • Discuss with Neil - Drupal username collection re GDPR
    • Inventory stuff re GDPR
    • Look at allowing host based access control to unauthenticated Tartarus API
    • Check with Tim / George about capability for login to student machines - where are we
    • Read Chris's ThoughtsOn403
    • Play with XRDP
      • problem with missing cursor with xterm ?
    • IPv6 remaining computing.help servers

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
    • Look at MPUActivitiesList
    • Look at RT
    • Continue work on SL7 coordination final project report (currently pending other units completing)
    • If in Forum server room, review MPU rack usage
    • libvirt - test for memory leaks (wrt console servers). Ian will test it for memory leaks after the 17 January stable release.
    • User training materials project #403
    • Add %slaac to vermelha and amarela and jornets and aegean and nuthatch

  • Stephen
    • LCFG client refactor stage 2
    • RT actions (as agreed)
    • Submit polkit bug to Red Hat - with Alastair (still exists under 7.3)
    • Produce some text for systemd mount bug (to submit to RH)
    • Take the issue of disabling per-user journald logs on certain servers to OPS
    • Schedule jubilee downtime to move to SOL
    • Consider PD work for after LCFG client ...
      • looking at Ceph
    • Look at MPUActivitiesList
    • On metropolitan, find the fastest baud rate at which we can drive the real physical consoles. (This is so we can decide whether to use physical consoles for KVM servers.)
    • Look at where we're using ALL in access.conf
    • If in Forum server room, review MPU rack usage
    • Agree with RAT how software package requests are handled - waiting on Graham documenting
    • Start off NX replacement project (#389)
      • Work on producing a DICE specific login screen
      • Documentation
    • Upgrading MPU servers to 7.4
      • NX servers - jubilee
    • Decommission DL180s in AT previously used for Ceph testing
    • Read Chris's ThoughtsOn403
    • Add %slaac to beaver

-- AlastairScobie - 07 Feb 2018

Topic revision: r9 - 23 Sep 2019 - 13:33:40 - AlastairScobie
 