MPU Meeting Tuesday 13th January

lcfg-systemd

Alastair noticed that config which delays the start of a service was proving to be quite popular, so he's created the LCFG_SYSTEMD_WAIT macro to do this. It's documented in the LCFG systemd docs and in the LCFG systemd cookbook.

In December he also wrote several blog entries on systemd in the SL7 port diary.

LCFG on SL7

The sleep component
The development code is now working! It still needs to be packaged up and installed.
autofs
Stephen and Craig have it configured and installed on SL7. Kenny MacDonald's mdp headers form the basis of the lcfg level headers, and there's a new dice level header.
Package forge
Autobuild is now turned on for EL7, so unless you specify otherwise, builds will be attempted for EL7, SL6 32bit and SL6 64bit.
IS package buckets
IS now has el7 package buckets, and we're now mirroring them.
DICE level MPU config
Stephen has been moving MPU configuration into the dice level. The PAM config was the usual fiendish pain but is probably now OK at both dice and lcfg levels.
DICE install
Stephen is working on a DICE level install and hopes to have it finished soon.
dicehacks.h
The contents have been merged into the dice level. (If you spot a bit of config that's now in the wrong place, feel free to move it.)
wallet
Stephen has realised that (for usability) SL7 will need wallet working before letting users on to it.
Display manager and screen lock
Alastair has blogged (Which display manager to use for EL7?) about our display manager problem. The latest LightDM contains a screenlock solution but it doesn't build on our SL7 machines, so Alastair then tried the venerable xscreensaver, adding PAM configuration for it. This works, but it seems awkward to set reasonable defaults for its behaviour - we would need either a component or to edit our default configuration choices into our own version of the RPM. (We favour a blank screen and enabling DPMS, to save power.)
DICE progress meeting
We held a meeting to assess the state of readiness of the DICE SL7 desktop, identify work to do and set target deadlines. Here are the notes from the meeting.

Miscellaneous Development

Software collections
Software collections are now installed by default on SL6. To our surprise we've found that they seem poorly supported on 32bit, with some packages missing or conflicting. We're patching up the problems as we come to them, and it's worth noting that we have very few 32bit machines left.
Replacing boot.run
Stephen has been thinking and blogging about how we might replace boot.run in the world of systemd (Replacing "boot.run"). He points out that we have the opportunity to provide something better. Please read the blog article and give him your ideas and thoughts!

Operational

IBM storage array glitch
Alastair described this problem in a blog post (Disruption to some services 6th January). IBM has suggested upgrading the firmware, and we plan to do this soon. The IBM engineer tells us that controller updates can be done while the array is running. (The array is also meant to carry on running when a disk fails, rather than locking up.) Finding out how to report the problem to IBM took more time than getting the fix from them. Alastair has added advice on fault reporting to our IBM DS3524 wiki page.
staff.nx.inf.ed.ac.uk
It's now on northern. central has been turned off and will be scrapped. We now have the two ex-northern disks but it's no longer clear which is the failed one; we'll test them on metropolitan.
oyster
The KVM server oyster has been in an odd state since last week's IBM array glitch. It hasn't syslogged anything since the glitch, and more recently we've stopped receiving mail from it. We suspect some temporary kernel memory corruption, so we'll reboot it on Wednesday (14 January) evening to clear the problem.
LVM configuration
The LVM configuration on the KVM server jubilee got messed up, though not to the detriment of its VMs. Alastair rescued the situation and has updated our notes on creating a storage pool on a KVM server.

This Week

  • Alastair
    • systemd project
      • Consider how components will work with systemd
      • Continue work on documentation - guidance for other COs on how to use
      • Look at getting Stop method to rebuild /etc/systemd regardless of whether there have been resource changes (remember Stop doesn't call Configure)
      • convert to module
    • EL7 project
      • how much space is required by systemd journald logging (for desktop /var sizing)
        • (By default journald logs to /run/log. Have to mkdir /var/log/journal to keep data). Have enabled on one machine
        • identify default retention policy. Default retention is to use up to 10% of the partition. Can use either space or time as a constraint on retention. Logs are per user + system, so users can read their own data. Each log file starts at 8MB, so a popular machine will have lots of log data.
        • Blog about journald retention policy - and document how to set...
        • Blog about decision to keep journald and /var/lcfg/log/syslog duplication - and resulting configuration change.
      • check installroot stuff same version across SL6 and EL7
        • and pull out old SL5 stuff
      • Look at whether we need anything better than existing network component for desktops. Don't think so. Virtualbox works fine with current config.
      • Look at lightdm issues
        • power management
      • Look at LCFG bug #799 (systemd buffered output). See bug entry for details.
      • convert lcfg-dconf and lcfg-lightdm to module
      • Take to CEG - DICE EL7 by 1st Feb, COs desktops in February, guinea pigs by 1st March.
      • apply sensible defaults to xscreensaver rpm - blankscreen and DPMS enabled
    • RT 65774 - try two identical monitors on my machine
    • Need to remove default bridge from kvmtool create
    • Think about disk partition policy
    • Review last reviewed date for documentation
    • Consider more cores as default for KVM guests
    • Is there a way of disabling debugging information being displayed by drupal when there are problems? Can't see how to do safely (needs disable backtrace in /etc/php.ini?) - Ask David Marsh in Physics?
    • Read LISA notes
    • Look at KVM server loading
    • At some point - look at installroot kdcregister solution
    • Schedule firmware upgrade for DS3524
    • Read Stephen's blog article on boot.run functionality replacement
    • Check scans
    • Spec up new DR server and circulate

  • Chris
    • EL7
      • finish off Sleep component
      • investigate Gnome power-management and document
    • url shortener (once gdm solved)
    • Create Project entries - for KVM refinement project
    • Experiment with renaming br0 to br33 on metropolitan
    • Think about disk partition policy
    • Review last reviewed date for documentation
    • Identify VMs to move to waterloo and move to balance load
    • RT 69276, 61762
    • Reboot oyster, out of hours

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • EL7
      • Minutes of SL7 meeting (Alastair to forward his notes first)
      • Continue thinking about boot.run functionality
      • Complete porting MPU managed resources to the DICE level
        • wallet
        • DNS
      • Finish working towards a stable EL7 release
    • Test northern's SAS disks in metropolitan
    • Junk central
    • Think about PD - Interested in ZeroMQ
    • Think about disk partition policy
    • Review last reviewed date for documentation
    • Add extra memory to waterloo (and if those work, order up more memory for hammersmith)
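
The journald sizing notes in Alastair's EL7 items above can be turned into rough numbers for desktop /var sizing. The following is only a sketch under stated assumptions: it takes the 10% default cap and the 8MB initial file size from the notes, assumes one journal file for the system plus one per user who has logged in, and assumes journald's default storage behaviour (persistent only when /var/log/journal exists). The function names and the example partition size are hypothetical, not part of any LCFG or systemd tooling.

```python
import os

MIB = 1024 * 1024


def journal_space_estimate(partition_bytes, active_users,
                           percent=10, initial_file_bytes=8 * MIB):
    """Return (cap_bytes, initial_bytes) for a journal on the given partition.

    percent and initial_file_bytes default to the figures in the notes
    (10% of the partition, 8MB per new log file); both are assumptions.
    """
    # Upper bound: the share of the partition the journal may grow to.
    cap_bytes = partition_bytes * percent // 100
    # Lower bound: one file for the system plus one per user seen so far.
    initial_bytes = (active_users + 1) * initial_file_bytes
    return cap_bytes, initial_bytes


def journal_is_persistent(journal_dir="/var/log/journal"):
    """True if the directory journald needs for on-disk logs exists."""
    return os.path.isdir(journal_dir)


if __name__ == "__main__":
    # Hypothetical desktop: 20 GiB /var, 30 distinct users.
    cap, initial = journal_space_estimate(20 * 1024 * MIB, active_users=30)
    print("cap: %d MiB, initial files: %d MiB" % (cap // MIB, initial // MIB))
    print("persistent journal:", journal_is_persistent())
```

For the hypothetical 20 GiB /var with 30 users this gives a cap of 2048 MiB and about 248 MiB of journal files before any of them has grown, which may help when deciding whether a time-based limit is also needed.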

-- AlastairScobie - 13 Jan 2015

Topic revision: r9 - 19 Jan 2015 - 10:15:25 - StephenQuinney
 