MPU Meeting Tuesday 10th November 2015

Inventory

  • The testing suite has now been brought together and tidied up.
  • Alastair has been looking at regenerating the order files from the inventory data. This is inevitably a lossy process, so it won't be run automatically: it isn't run when code is changed, but it will be run when the test suite is changed.
  • Alastair's blog has a November 2015 status report on the inventory project.

LCFG client refactoring

Stephen has been looking at context handling.

SL7 on servers

There's a problem with FibreChannel. /dev/mapper should contain a fixed, managed symlink for each disk device and for each partition on it. (This is useful since physical volume IDs change on reboot.) On DICE SL7 the partition links don't appear when they should, although a flush after boot does create them. Alastair has tried removing LVM, but that makes no difference to the problem. He suspects a timing issue or an interaction between multipath and udev.

Miscellaneous development

Encrypted partitions
Extra testing showed that the new fstab and hackparts code didn't work as expected. The problem has been fixed and the new code is in this week's release. The encryption of swap and /tmp will come in next week's release for new SL7 installs. Alastair is pondering how best to introduce it to existing installs.
Bug:904 Macros for inifile component
Stephen has moved some macros and other useful material from the dice- and mdp-level headers to the lcfg level.
Bug:898 Changes needed for freenx on sl7
Stephen applied a patch from Matthew Richardson which makes FreeNX work on SL7 in SEE.
Bug:892 lcfg-auditd fails to start properly on sl7
Barry O'Rourke discovered and reported the issue and Stephen has fixed it.
Bug:910 Can't stop a service daemon with stop_on_remove_$
Chris reported this bug in lcfg-systemd.
nautilus-open-terminal removed
Stephen removed this as it was causing problems.
Build Tools support for OS X El Capitan
Stephen has been looking at Build Tools (and more general LCFG) support for El Capitan. El Capitan's new "System Integrity Protection" feature makes /usr read-only even for root, so LCFG must switch to using /usr/local. It's worth noting that the OS X linker doesn't look in /usr/local/lib, making life that bit more difficult for open source software developers.
CMake and Tcl on SL7
Tcl's CMake macros wrongly expand @TCL_TCLSH@ to /bin/tclsh on SL7. The value should be /usr/bin/tclsh. This means that RPMs with Tcl scripts have an unfulfillable requirement. You can get round this by defining the right value in CMakeLists.txt as described in Chris's blog post.
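
As a rough illustration of the CMake and Tcl workaround (a sketch only; the exact approach is the one in Chris's blog post and may differ), the value can be set explicitly before the Tcl scripts are configured. This assumes the scripts are generated with CMake's configure_file, which substitutes @TCL_TCLSH@ from the CMake variable of the same name. The file name myscript.tcl.in is hypothetical.

```cmake
# Assumption: override whatever path the Tcl lookup produced on SL7.
# Place this after any find_package(Tclsh) or find_package(TCL) call and
# before the configure_file() calls that expand @TCL_TCLSH@.
set(TCL_TCLSH "/usr/bin/tclsh")

# Hypothetical script template. @ONLY restricts substitution to the
# @VAR@ form, so Tcl's own ${...} syntax in the script is left alone.
configure_file(
  "${CMAKE_CURRENT_SOURCE_DIR}/myscript.tcl.in"
  "${CMAKE_CURRENT_BINARY_DIR}/myscript.tcl"
  @ONLY)
```

With this in place the generated script's interpreter line should name /usr/bin/tclsh, which is the path the RPM requirement can satisfy.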

Operational

/var/nx on nx.inf.ed.ac.uk filled up
This partition tends to fill up when the nx server gets busy and users leave old abandoned sessions lying about. Free up space by identifying and killing abandoned sessions. Chris has documented how to do this at NxAdmin#When_var_nx_is_100_full.
xscreensaver freezing
xscreensaver has been freezing on a few SL7 machines (e.g. RT:74709, RT:74687). Some ideas for diagnosing the problem: it would be good to have an affected machine left untouched so that we could log in to it remotely and find out what was going on. Also, who does it happen to? Is it just people who have been here for years and have old .xscreensaver files? (At the next day's Operational meeting it was suggested that this was almost certainly an autofs problem, and that the autofs problems should be resolved soon.)
Software Collection headers
Chris asked how best to make/discover package lists when making new scl- headers. It's best done using yum, and Stephen will make his Software Collections yum repo configuration more widely available.
SL7 accessibility warnings
These should disappear this week from SL7 machines. There are more details in Chris's blog post.
An almost failed disk
One of the disks on KVM server vermelha almost failed. This was flagged up by the hwmon check, and the disk was replaced by Dell. Chris has added the "nearly failed" case to our howto doc on replacing a failed or failing disk.

This week

  • Alastair
    • Inventory project
      • continue working through TartarusWorkFlow
      • consider what next can be integrated into existing system, if anything
      • Write something to parse the JSON clientreport blobs and look for any major problems.
      • Document clientreport
      • Document order sync code
    • @home - look at using rsync from site.pkgs instead of mirroring from upstream
    • Remove default pool if ops meeting agrees
    • Experiment with different window managers under VNC (making the assumption that performance under NX will be similar)
    • Think of a use for 'atom'
    • Understand how NetworkManager works wrt init scripts
    • Deploy encrypted /tmp and swap
      • Add config to encrypt /tmp and swap on new desktops
      • Continue work on script to modify existing machines
        • modify to be an installroot script
        • modify to wipe swap and /tmp
    • Look at RT tickets to close
    • SL7 base server
      • localhome - mark as deferred until we have /home as symlink again (and then perhaps use pam_mkhomedir)
      • check metropolitan USB and CD
      • Continue work with FC and LVM
        • investigate interaction between multipath and UDEV
        • check nagios notices if FC cable removed
    • Look at Reminders

  • Chris
    • Inventory project
      • continue working through TartarusWorkFlow
      • Look at clientreport modules for replacing firmwarereport
    • pkgsearch for SL7
      • reimplement as a yum web front end (a yum search for a keyword produces an HTML file of links to a CGI script that runs yum info)
      • Needs to support multiple platforms
    • Liaise with George over iDRAC documentation
    • SL7 -
      • hwmon (HP raid and H200 raid)
      • Continue testing DL180 (try otaka)
      • Finish off looking at R620 (belter)
    • Look at RT tickets to close
    • Create an MPU blog
      • create a couple of SL7 server blog articles

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • LCFG client refactor stage 2
      • document API
      • complete combining packages
      • blog article
    • Think about PD - Interested in ZeroMQ
    • Write up how WM switchdesk mechanism works
    • Look at RT tickets to close
    • Investigate kernel component pipe moan by using shell commands instead of RPM module
    • Look at George's lcfg-dns proposal
    • Discuss reworking of wire headers with George
    • Look at Reminders
    • Look at yum config for software collections

-- AlastairScobie - 10 Nov 2015

Topic revision: r8 - 23 Sep 2019 - 13:33:38 - AlastairScobie
 