MPU Meeting Tuesday 5th August 2014

Virtual DICE

Stephen has read the draft final report and suggests that it should explain the difference between the initial effort estimate and the final total, so that we can get better at making initial estimates.

LCFG Client Refactoring

Stalled.

systemd

No work on the systemd component this week.

SL7 LCFG port

  • Alastair has introduced an updaterpms cron job so we're now getting regular package updates.
  • There's been a performance problem with updaterpms on EL7: it has been very slow in the flagging / conflict resolution phase at install time. Alastair has done some debugging. He added timestamps to the updaterpms debug output and found that each package header file request made through libcurl was taking 170-180ms on EL7, compared with 20ms on SL6. It seems that a process is being forked to do the DNS lookup, and the delay falls between the completion of that lookup and the point at which the process is polled for the result: the polling isn't happening soon enough. He mailed the curl mailing list. It turns out that we're seeing an unfortunate interaction between the new (post-SL6) asynchronous DNS code in curl and Red Hat's decision to turn on multi-threaded DNS resolution in curl (that is, to link curl against the multi-threaded version of the resolver). Recompiling curl with single-threaded DNS resolution solves the updaterpms performance problem. It would be interesting to know what else on EL7 uses the multi-threaded DNS resolver, but we're not sure how to find this out. Red Hat is now taking an interest.
  • We could speed up updaterpms by changing the order in which we check header file locations. Currently we check for dot files first then for a separate hdrs directory, but these days almost all of our package buckets use a hdrs directory, so we should check for that first.
  • There's a new (post-SL6) curl option to get all of our files through one TCP connection rather than making a separate connection for each one. Would it work through our squid cache for instance? We should try it out.
  • Chris got the package conflicts out of the desktop package list so we can now use the desktop.h header. We've tried this and it seems to work.
  • We've updated the LCFG on SL7 project plan. We don't expect the differences between EL7 and SL7 to amount to more than some package version changes so we're using the same plan for both.
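
The header lookup reordering mentioned above can be sketched as follows. This is only an illustration and not the updaterpms code: the file naming for the dot-file and hdrs layouts, the function names and the injected fetch callable are all assumptions made for the example. It counts how many requests are needed per package and times them, in the same spirit as the timestamps Alastair added to the debug output.

```python
import time
from typing import Callable, List, Optional, Tuple


def candidate_urls(bucket_url: str, rpm_name: str, hdrs_first: bool = True) -> List[str]:
    """Return the header locations to try, in order.

    Both path layouts below are hypothetical stand-ins for the real
    dot-file and hdrs-directory conventions.
    """
    base = bucket_url.rstrip("/")
    dot_file = f"{base}/.{rpm_name}.hdr"      # assumed dot-file layout
    hdrs_dir = f"{base}/hdrs/{rpm_name}.hdr"  # assumed hdrs-directory layout
    return [hdrs_dir, dot_file] if hdrs_first else [dot_file, hdrs_dir]


def fetch_header(
    bucket_url: str,
    rpm_name: str,
    fetch: Callable[[str], Optional[bytes]],
    hdrs_first: bool = True,
) -> Tuple[Optional[bytes], int, float]:
    """Try each candidate location until one returns data.

    `fetch` is any callable that returns the body, or None when the
    file is missing. Returns (body, requests made, elapsed seconds).
    """
    attempts = 0
    start = time.perf_counter()
    for url in candidate_urls(bucket_url, rpm_name, hdrs_first):
        attempts += 1
        body = fetch(url)
        if body is not None:
            return body, attempts, time.perf_counter() - start
    return None, attempts, time.perf_counter() - start
```

For a bucket that only has a hdrs directory, checking hdrs first needs one request per package instead of two, which matters when each request carries a fixed delay.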

Miscellaneous Development

Stephen has got SL6.5 up and running.

  • He used yummy to generate the package lists. Doing this removes most of the manual work involved in generating the lists. Handily, it pulls in any new dependencies which have appeared since last time.
  • Installs have not yet been tested, although the install package lists have been tested and they're free of conflicts.
  • The installroot uses the very latest SL6.5 kernel so it should be good for a while.
  • Stephen will make the usual table of version changes. Notable changes are LibreOffice going from 3.4 to 4.0 and a big ipmitool update. Notable non-changes are the kernel, X, glibc, and prominent applications such as Firefox, so we're hoping for little negative impact on our RAT package lists.
  • We could usefully start using 6.5 on our develop desktops now to help test it.
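
A table of version changes like the one mentioned above could be produced along these lines. This is a minimal sketch, not the method Stephen uses: it assumes the old and new package lists have already been reduced to simple name-to-version mappings, and the function names are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# (package name, old version or None, new version or None)
Change = Tuple[str, Optional[str], Optional[str]]


def version_changes(old: Dict[str, str], new: Dict[str, str]) -> List[Change]:
    """List packages whose version differs between two releases.

    Packages present on only one side are reported with None for the
    missing version. Unchanged packages are omitted.
    """
    changes: List[Change] = []
    for name in sorted(set(old) | set(new)):
        before = old.get(name)
        after = new.get(name)
        if before != after:
            changes.append((name, before, after))
    return changes


def format_table(changes: List[Change]) -> str:
    """Render the changes as a plain-text table, one package per row."""
    lines = [f"{'package':<20} {'old':<10} {'new':<10}"]
    for name, before, after in changes:
        lines.append(f"{name:<20} {before or '-':<10} {after or '-':<10}")
    return "\n".join(lines)
```

Calling `format_table(version_changes(old, new))` on two such mappings gives one row per changed, added or removed package.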

Chris fixed whererpms to work with the current LCFG client.

Operational

Server reboots
We're still doing them. The KVM servers also need firmware updates, which Chris will install at the same time.
KB SAN
Alastair and Chris will migrate VMs off the piccadilly and northern SAN partitions so that we'll no longer be holding up the Services Unit.
Troublesome VM migrations
We're not confident about migrating a VM with multiple disks or ethernet connections. Chris will try a test migration of the former to see what happens.
Drupal
There's a new Drupal version. Alastair will install it.
EPEL 6 mirror
We're still not mirroring EPEL 6 as we've run out of space (as we keep old versions, unlike EPEL themselves). Stephen will find somewhere else to put older versions of packages so we can begin mirroring EPEL 6 again.

This Week

  • Alastair
    • Order a spare 600GB disk for waterloo (hot spare)
    • Double check latest web security reports
    • systemd project
      • start writing in blog
        • document the debugging including stuff about disabling graphical boot
      • Modify the LCFG components/rc scripts list as a result of the COs' talk.
      • Consider how components will work with systemd
      • Look at how component triggered reboots will work
      • Start designing a systemd target structure for LCFG components
      • Add support for configuring presets. I think running systemd after updaterpms is sufficient
    • EL7 project
      • Continue looking at installer
        • lcfg-installfixups
      • Think about boot.run
        • attempt to group common cron and boot.run jobs together via cron/anacron
        • ? dostuff component that will do stuff daily, weekly, monthly etc ?
      • move patched curl to correct header
      • Investigate what multithreaded resolver is... It's part of curl.
      • Updaterpms - Invert . / hdr order in httpget
    • Add more memory to Forum KVM servers? 700 per server to upgrade 64GB -> 128GB (which ones?)
    • Look at iplimit for computing.help
    • RT tidy
    • Read and comment on Virtual DICE report
    • Remove ordershost stuff from schiff
    • Reboot circle and bakerloo
    • Try SL6.5

  • Chris
    • Virtual DICE - complete final report
    • EL7
      • look at lcfg-kdm and dice kdm theme
    • url shortener
    • Reboot hare and jubilee to test firmware/BIOS updates and install kernel update
    • Try SL6.5
    • Try hot-migrating a KVM guest which has two attached disks

  • Stephen
    • LCFG client refactor stage 1
      • schedule debrief meeting
    • EL7
      • Survey what disk sizes we have and what the current partition usage is (e.g. /var, /tmp)
      • PXE install
      • openssh patches
      • Continue thinking about boot.run functionality - following week
    • Continue work on SL6.5
    • Reboot both ssh servers (hogwood/kubelik) and telford and budapest
    • Write up daily security checks
    • Think about PD
    • RT tidy
    • Separate out EPEL6 mirror to current and legacy

-- AlastairScobie - 12 Aug 2014

Topic revision: r9 - 19 Aug 2014 - 14:51:48 - AlastairScobie
 