MPU Meeting Thursday 29th May 2014

Virtual DICE

Toby has now successfully made his LDAP change, so the people tree is visible from EdLAN.

Chris will test Virtual DICE logins and rewrite the documentation to match.

LCFG Client Refactoring

Nothing happened.

LCFG systemd

Alastair has discovered a boot-time problem which appeared when he added systemd configuration for LCFG components. With the LCFG client (dependent on completion of the network and basic targets) and up to four additional LCFG components (each dependent on completion of the LCFG client unit startup), the total boot time was about 15 seconds, about the same as for a machine with no LCFG. However, if a fifth component was added, also dependent on the LCFG client, the total boot time increased to a minute or more. That was the case no matter which LCFG component was added, even though all such components were given exactly the same dependency on the LCFG client. Alastair found timing information in the systemd journal and graphed it. It seems that adding the fifth LCFG component causes systemd to pause for a long while before starting anything: all of the additional time passes before any unit is started. Once units do start, the boot is as swift as before.
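As a rough illustration of this kind of journal timing analysis, here is a minimal Python sketch that looks for the longest pause between consecutive journal entries. It is not Alastair's method, which the minutes do not describe. It assumes the journal has been exported as JSON lines, each carrying the `__MONOTONIC_TIMESTAMP` (microseconds) and `MESSAGE` fields; the function name `largest_gap` and the sample entries are made up for the example.

```python
import json


def largest_gap(journal_lines):
    """Return (gap_seconds, message_before, message_after) for the longest pause.

    journal_lines is an iterable of JSON strings, one journal entry per line.
    """
    entries = []
    for line in journal_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # The monotonic timestamp is exported as a string of microseconds.
        stamp = int(record["__MONOTONIC_TIMESTAMP"])
        entries.append((stamp, record.get("MESSAGE", "")))

    entries.sort(key=lambda entry: entry[0])

    best = (0.0, None, None)
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = (t1 - t0) / 1_000_000  # microseconds to seconds
        if gap > best[0]:
            best = (gap, m0, m1)
    return best


# Made-up sample data, for illustration only.
sample = [
    '{"__MONOTONIC_TIMESTAMP": "2000000", "MESSAGE": "systemd running in system mode"}',
    '{"__MONOTONIC_TIMESTAMP": "47000000", "MESSAGE": "Starting LCFG client"}',
    '{"__MONOTONIC_TIMESTAMP": "48500000", "MESSAGE": "Started LCFG client"}',
]

gap, before, after = largest_gap(sample)
print(f"{gap:.1f}s pause between {before!r} and {after!r}")
```

On a real machine the input could come from something like `journalctl -b -o json`, read line by line. If the longest gap sits before the first unit start message, that would match the behaviour described above.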

Is the delay caused by systemd working out dependencies? Is it a bug? Alastair does not yet know, but rather than find an easy way round the problem he thinks we should persist with this configuration until we understand what's going wrong: we're going to need systemd debugging skills in the years to come.

We will discuss systemd at the next LCFG Deployers monthly meeting.

LCFG Port to RHEL7 or compatible

  • Stephen has finished his new Service method. Components can use it to start, stop or otherwise control daemons. It works with SysVinit/upstart (SL6) and systemd (EL7) and can safely be used with launchd (MacOS X). The documentation is on the ControllingDaemons page on the LCFG wiki. The cron component now uses the Service method. Other components will be converted to use it as necessary.

  • Our RPM build dependencies on EL7 are different to those on SL6. To simplify the changeover Stephen has provided a virtual lcfg-build-deps package. It pulls in all the required dependencies for most LCFG components, so your RPM doesn't have to. Find out more at BuildDependencies on the LCFG wiki.

  • The inf layer has been changed somewhat. The documentation will be updated to match.

  • A while ago Stephen wrote the accessconf component. It allows us to manage access.conf files more flexibly than the auth component can. As part of the move towards managing all such files using accessconf Stephen has changed the auth component. If the auth.accessconf resource does not have a value, auth will now back off and not attempt to manage access.conf. He has also put some effort into making sure that the traditional auth.users resource can nevertheless still be used to give people login permission.

  • Chris will make a new wiki page showing the EL7 porting progress. Talking of which, these still need work:
    • lcfg-auth
    • lcfg-fstab - we'll add XFS support. We expect that Informatics will default to using ext4.
    • lcfg-lcfginit
    • lcfg-mail
    • lcfg-network - we think we can survive without this for a while so it's not our top priority.
    • lcfg-defetc-el7
    • lcfg-etcservices
    • pkgsubmit
    • the installroot / installbase package lists.

Miscellaneous Development

  • Stephen found that om would not accept an underscore in a method name, even though it accepted one in a component name. The component name and method name should be treated the same way: it should be possible to map either to a shell or Perl function name. om has been changed so that the allowable component and method names are now consistent.

  • Alastair has tweaked the LCFG Updaterpms warning script to fix a problem with its date calculations. It should now more accurately list only those machines where updaterpms has not managed to run to completion. This includes machines on which a package can only be installed at boot time.
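
The om naming point above can be illustrated with a small Python sketch that applies one rule to both component and method names. The minutes do not give om's actual rule, so the pattern here (letters, digits and underscores, not starting with a digit, which is what a shell or Perl function name allows) is an assumption, and the names `is_valid_name` and `names_acceptable` are hypothetical.

```python
import re

# Assumed rule: a name that is usable as a shell or Perl function name.
# This is NOT taken from the om source; it is an illustrative guess.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name):
    """True if the whole name matches the assumed identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


def names_acceptable(component, method):
    """Apply the same check to both names, so they stay consistent."""
    return is_valid_name(component) and is_valid_name(method)


print(names_acceptable("my_component", "run_now"))
print(names_acceptable("my_component", "run-now"))
```

Using a single helper for both names is the design point: any later change to the rule automatically applies to component and method names alike.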

Operational

  • OpenAFS 1.6.8 is now out. The latest RHEL kernel is free of the bug which broke OpenAFS. We will aim to deploy both together in the 18th of June stable release.
  • The package master functionality will soon be moving from brendel to bruegel.
  • After that the LCFG master function will move from schiff to steen.
  • staff.nx.inf.ed.ac.uk is almost ready.

Next Meeting

Monday 2 June at 2pm, to talk solely about the EL7 port and systemd.

This Week

  • Alastair
    • Order a spare 600GB disk for waterloo (hot spare)
    • Double check latest web security reports
    • systemd project
      • start writing in blog
      • Modify lcfg components/rc scripts list as a result of COs talk.
      • Complete lcfg-systemd component - install method
      • Consider how components will work with systemd
      • Consider journald
      • Look at how component triggered reboots will work
    • EL7 project
      • continue process of managing components using systemd component
      • consider relocating /var/lcfg/status and /var/lcfg/lock -> /run
      • put systemd config into el7 level
    • Add more memory to Forum KVM servers? - 700 per server to upgrade 64GB -> 128GB

  • Chris
    • EL7
      • Continue looking at systemd
      • Resubmit failed auto build packages
      • Create and populate EL7 components status page
      • lcfg-mail component
      • lcfg-fstab - add xfs support
      • pkgsubmit
      • installbase package list
    • open up staff.nx and announce (check identical to existing nx service)
    • Continue work on new LCFG master and package masters
    • Update Virtual DICE documentation as a result of Toby's changes

  • Stephen
    • LCFG client refactor stage 1 -> activity page
      • schedule debrief meeting
    • LCFG client refactor stage 2 -> activity page
      • continue development and docs
    • Check with SEE what they did to improve NX performance -> activity page
      • make any easy changes
    • EL7
      • ed/dice flavours of inf level
      • Complete lcfg-auth / lcfg-accessconf transition
      • consider relocating /var/lcfg/status and /var/lcfg/lock -> /run
      • lcfg-defetc and lcfg-etcservices, lcfg-release-el7
    • Roll out latest kernel and openafs - for stable release on 18th June.
    • Reboot hare to test firmware update
    • Pandemic stuff
      • discuss school db with Graham/Tim
    • Write up daily security checks

-- AlastairScobie - 29 May 2014

Topic revision: r6 - 02 Jun 2014 - 15:25:25 - ChrisCooke
 