MPU Meeting Tuesday 20th November 2012

Server Upgrades

Following the bugs.lcfg.org upgrade, Chris has documented some aspects of managing and upgrading a Bugzilla server in ManagingBugzilla.

Next: Stephen will upgrade the LCFG website and Chris will upgrade brendel. Some tips for the brendel upgrade:

  • Use AFS 1.4, not 1.6
  • Use Toby's newer version of the Perl AFS module package. Make sure it's in the world bucket rather than in devel.
  • Test everything on an SL6 box first.
  • Use a copy of the keytab from brendel rather than generating a new one, as generating a new one would automatically invalidate the one currently in use!

Craig is very soon going to move our git not-a-service off nelson. The only thing remaining on it will be gerrit. This is only being used by the Prometheus project at the moment. We'll suggest that Infrastructure take over gerrit.

Server Hardware

The BMC upgrades have appeared correctly on the firmware report. Chris will add a query to expose the contents of the subunit table. He's halfway through writing up the project.

Security Enhancements

Stephen has been working on the documentation for the various components produced as part of this project. He has documented the auditd, auremote and rkhunter components and is working on the aide and buzzsaw components. After that he'll work on higher level documentation which describes how these components are being used and how to manage them.

While working on the docs, Stephen had some insights into how to improve the remote logging setup, and as a result he's added two new methods, resumelogging and logstatus. When the network connection is lost the machine will start backing up its logs safely, and once the network comes back the resumelogging method will make it resume its usual remote logging. The logstatus method sends a status message to syslog.

There was some discussion of the error messages we've been seeing from tycho:

Parse failure: Failed to parse RFC3339 timestamp in line:  [try http://www.rsyslog.com/e/2144 ]

These are caused by the misplaced helpfulness of rsyslog. It's very difficult to weed out these messages and they're not an indication of anything badly wrong, so we'll put up with them for now.

Inventory

Nothing happened this week. Alastair will rework the proposal in time for T1.

Absence Reporting

Chris has explored the possibilities and reported back to Liz. The option of an intelligent email address which could forward mail to the line manager has had to be rejected, as we don't keep enough email addresses to do it reliably and, in any case, it would be open to spoofing. By contrast, the WebMark-based web page option looks secure and easy to implement. There is one obstacle to achieving this, though: the availability of line manager data in the database. This is not going to be available for some months and Liz has influence over its prioritisation, so she's happy to draw a line under this project for now and revisit the idea once the line manager data has been made available. Chris will make a written report available and then suspend the project.

Miscellaneous Development

There was none this week.

We talked about Chris's notion of easily tying X idleness detection to the sleep component by means of a small user-run C program which creates or destroys a temporary file in order to indicate idleness or its lack. Stephen suggested that it would be wise to provide some visual control over this, specifically to stop sleep temporarily during the current session (rather than stopping it totally). Menu items can be provided fairly simply by dropping a .desktop file into /usr/share/applications and RAT has done this already so has some experience to call on.
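The flag-file idea above can be sketched as follows. This is only an illustration of the logic, written in Python rather than the C that was proposed. The function name, the threshold and the flag path are all hypothetical, and the idle time is passed in as a plain number because the minutes do not say how the program would query X for it.

```python
import os


def update_idle_flag(idle_seconds: float, threshold: float, flag_path: str) -> bool:
    """Create flag_path when the session is idle and remove it otherwise.

    Returns True if the session counts as idle. All names here are
    hypothetical; the real program would obtain idle_seconds from X.
    """
    idle = idle_seconds >= threshold
    if idle:
        # "Touch" the flag file: its existence signals idleness.
        with open(flag_path, "a"):
            pass
    else:
        # Remove the flag; a missing file simply means "already active".
        try:
            os.remove(flag_path)
        except FileNotFoundError:
            pass
    return idle
```

Something on the sleep side would then only need to test whether the flag file exists. A "stop sleep for this session" menu item could work the same way with a second file, though that is a guess about the design rather than anything agreed at the meeting.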

Stephen will take a look at adjusting the remctl access to wake so that computing staff can wake anything, and at providing a command-line equivalent to the web page.

Operational

  • Upgrade of circle: Following an accidental AT server room power shutdown, Alastair upgraded circle to SL 6.3.
  • PkgForge backups: Chris spotted a problem with them which Stephen fixed. The postgresql component hadn't been doing the backups as we had been calling it in an obsolete way from cron. The errors from this had been logged at INFO level so hadn't made it to rootmail.
  • Backup Check and Data Audit: Chris is going to broaden his backup checks to a complete MPU data audit (where it all lives and whether or not it is or needs to be backed up). At the same time he will convert resources to use the proper rmirrorclient macros, which are documented here.
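The PkgForge backup problem went unnoticed because the errors were logged at INFO level. The sketch below shows the general effect with Python's standard logging module. It assumes that rootmail only receives messages at WARNING or above, which the minutes do not state, and the logger name and messages are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so the effect is easy to inspect."""

    def __init__(self, level: int) -> None:
        super().__init__(level)
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def demo():
    logger = logging.getLogger("backup-demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()

    full_log = ListHandler(logging.DEBUG)    # stands in for the local log file
    mail = ListHandler(logging.WARNING)      # stands in for rootmail (assumed threshold)
    logger.addHandler(full_log)
    logger.addHandler(mail)

    # A failure reported at INFO reaches the log file but not the mail handler.
    logger.info("backup method not recognised")
    logger.error("backup failed")
    return full_log.messages, mail.messages
```

Calling demo() returns both message lists. The INFO message appears only in the first, which is the kind of gap that let the failed backups slip past.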

This Week

  • Alastair
    • Rework and firm-up inventory project proposal
    • Create project for T1 for the wee project bundle
    • Try subscribing to more Dell models to see whether we do get mail re firmware
    • Ship new lcfg-lvm component
    • Systems blog article on the KVM Service
    • Report libvirt empty LVM group issue to Redhat, unless fixed in 6.3
    • Create an MPU KVM server header
    • Document ssh keys mgmt - windows
    • Stargaze
    • What jobs can we give to our CSO?
    • Reboot metropolitan
    • T3 figures so far
    • Speak with George/Toby re gerrit (on nelson). Toby is happy to take it on as a not-a-service

  • Chris
    • Upgrade brendel -> SL6
    • Finish off server hardware final report
    • Convert MPU profiles to use RMIRROR spanning map macros
    • MPU Data audit

  • Stephen
    • T3 figures so far
    • LCFG site -> SL6
    • Continue documentation for Security project
    • Think of top 6 weeproj projects for wee proj project
    • Redirect openafs2012.inf to the conference archive location
    • remctl access to the wake service for computing staff

-- AlastairScobie, ChrisCooke - 20 Nov 2012

Topic revision: r8 - 26 Nov 2012 - 10:23:08 - AlastairScobie
 