MPU Meeting Tuesday 27th September 2016

Inventory

No recent activity. Nobody has yet reported back on the REST API. Please try it out and let Alastair know how you get on.

LCFG Client Refactoring

Stephen has automated the generation of documentation (using Doxygen). He's been testing the conversion of profile data from XML to Berkeley DB format. As ever this has (reassuringly) exposed one or two bugs, which have been fixed.

MPU SL7

  • The two new KVM servers gaivota and girassol have now been hosting production VMs for several weeks. We've been using them ourselves for test VMs. We haven't noticed any problems. We're therefore going to go ahead and upgrade the rest of the KVM servers to SL7. We'll start by emptying jubilee and hammersmith of VMs (these two servers are moving on to other duties). We'll then upgrade the other two Forum-based servers (oyster and azul) then tackle the AT-based and JCMB-based KVM servers. All VMs will be migrated off each server before it's upgraded, so there should be little or no break in service.
  • The two PXE and package cache squid servers have been upgraded to SL7.
  • The two PackageForge builders are now SL7. At the moment they're both virtual machines. This has improved performance because the former builders were superannuated desktops. Stephen has been putting effort into updating the software to allow the PackageForge master builder to run SL7. This has touched on DBIx::Class and Catalyst, amongst other things. The package data is now in JSON format in the database, rather than in YAML in the filesystem. This opens the way for a future upgrade to the PackageForge web interface to provide all the data related to each build more conveniently and intelligently. He's also updating it for PostgreSQL 9.6. Meanwhile the web interface should be noticeably more responsive after the SL7 upgrade.

Miscellaneous Development

  • Stephen put a lot of effort into getting ethernet over USB working with Lego Mindstorms. As a result he now knows a lot more about udev.
  • DSU seems to be a success - it works and it's easy to use. Thanks to everyone who tried it out. There's some further work to do on its LCFG.
  • We wondered whether we needed to keep 32bit SL6 going. Yes, we do. RAT needs it for as long as SL6 is available.

Operational

  • Our new desktops are having trouble sleeping, with the three problems listed below. We expect to be able to fix the first two by modifying the sleep component, but for the third it seems that we'll just have to wait for a more functional version of the kernel.
    • The new CDT Lenovo desktops are randomly kept awake by their mice;
    • They also exposed a potentially serious bug in the sleep component (calling IPC::Run without specifying a timeout can make the component hang indefinitely if the command it's calling hangs - and since the component runs every three minutes, copies of the component can quickly mount up and overwhelm a machine's resources);
    • The HP G2s seem incapable of properly resuming their DisplayPort displays after sleep. Rumour has it that code in the Linux kernel related to their Intel SkyLake CPUs is still buggy and not really fit for use yet.
  • There was a problem with the way the Support Form consulted the inventory - this had to be brought up to date. It could be further improved using a REST API query.
  • We urgently need to work on the 2016-19 MPU spending plan.
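
The sleep component hang noted above comes from running an external command with no timeout. The component itself is Perl and uses IPC::Run, which has its own timeout support; the Python below is only a minimal sketch of the same idea, not the actual fix. The function name run_with_timeout and the 30 second default are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30):
    """Run cmd and return (exit_code, stdout).

    exit_code is None if the command did not finish within timeout_s
    seconds. subprocess.run kills the child before raising
    TimeoutExpired, so a hung command cannot pile up across runs.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # Hypothetical policy: report the timeout and let the caller decide.
        return None, ""
    return result.returncode, result.stdout


if __name__ == "__main__":
    # A command that would otherwise block for 5 seconds is cut off early.
    hung = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(hung, timeout_s=0.5))
```

Because the component runs every three minutes, any timeout comfortably shorter than that interval should stop copies from accumulating.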

This Week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • Document clientreport (eg how to add modules)
      • Document order sync code
      • Document hpreport processing script
      • Continue work on RESTful API - TartarusRESTAPI
      • Start work on final report!
    • Remove default pool if ops meeting agrees
    • Deploy encrypted /tmp and swap conversion script
      • Deploy on office desktops September 7th/8th
      • Need to warn users that Gnome3 may pop up a window about /tmp being full (when script is run)
    • Schedule MPU meeting to discuss systemd ordering
    • Continue building computing.help honeypot - no point until we have a replacement scanning service
    • package up ILW stuff and document process
    • submit polkit bug to redhat - with Stephen
    • *After* next kernel update - run named existence report on bandama - no machines with duff named reported.
    • Once Stephen has updated the DNS part, submit SL7 server base project to August devel meeting for closing - awaiting Tim checking all boxes have been ticked
    • Look at MPUActivitiesList
    • MPU SL7
      • New package slave export server (jornets) - go live. Do first thing in the morning (Kenny does a sync every hour).
      • Decommission otter (leave until > October 3rd in case of problems)
      • Chase Toby again about testing latest perl-Moose under prometheus (and then make live) after October 1
    • Check sysmans (et al) have 'nograce'.
    • Review 'ssh on windows' documentation page
    • Cost out an el-cheapo R230 server with same CPU as in 'muro' - an R230 with 16GB, 1 PSU, 1TB hot-plug disk (no RAID) would cost 1000 inc VAT (list price)
    • Record somewhere that the support form makes queries of the inventory (to determine any DICE host allocated to the user) - added as a project deliverable.
    • Take a look at RT #78875
    • Produce list of missing defaults files (for SL7)
    • Tidy up volumes and volume access on IBM array

  • Chris
    • Inventory project
      • Continue work on clientreport modules for replacing firmwarereport
      • Try REST API
    • pkgsearch for SL7
      • reimplement as a yum web front end (a yum search for a keyword produces an HTML file of links to a CGI script that runs yum info)
      • Need to support multiple platforms
    • MPU SL7
      • Emphasise architectural differences between the new kvm servers in the documentation
      • Move guests from jubilee and hammersmith onto new kvm servers - we can use the disks in the other KVM servers to assist in upgrades
    • Look at MPUActivitiesList
    • Check with RAT whether we still need SL6 32bit - add reasons to minutes
    • Look to see if there's a Dell R series server which has the same CPU as 'muro'
      • Iain has an R330 with the same CPU. Check result of running LCFG slave on this
    • Roll out fixed sleep code
    • Any remaining work with deploying 'dsu'
    • Continue work on updating virtualDICE image
    • Update spending plan
    • Consider spending plan
    • Reschedule MPU futures meeting

  • Stephen
    • Inventory project
      • Try REST api
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • testing and documentation
      • blog article (once documentation complete)
    • Investigate kernel component pipe moan by using shell commands instead of RPM module => waiting on 7.2 => activities list
    • LCFG server symlink to exam branches - produce reporting script and discuss with Graham
    • Circulate dmesg proposal
    • submit polkit bug to redhat - with Alastair
    • SL7 MPU
      • Continue work on pkgforge master
    • Work on RT tickets
    • Add something about DNS to FinalProjectReport356
    • Look at MPUActivitiesList
    • Check hardware model headers to make sure all models support new network naming scheme for SL7
    • Look at deploying dice/options/postgresql-backup-check.h
    • Consider spending plan

-- AlastairScobie - 27 Sep 2016

Topic revision: r14 - 24 Sep 2019 - 13:50:23 - AlastairScobie
 