MPU Meeting Tuesday 11th January 2011

LCFG Server Refactoring

Stalled.

Software Build Farm

Stephen has written a lot more documentation for the Package Forge API, that is, documentation from a programmer's point of view. That part of the documentation is now more or less complete. The exercise was useful in that it highlighted less-than-ideal aspects of the API, which were then refined as the documentation progressed.

The user documentation has been started. There isn't really a need for much of this, as most of the usage should be obvious from the web interface.

Documentation for administrators is still needed.

The component now configures everything that the submission tool needs. It also does most of the configuration for the daemon. The component still needs its logging sorted out, and it still needs documentation.

You can now include the pkgforge-builder.h header, then just start the daemon. There are details to tidy up but it's basically functional.

Killing hung jobs and stopping the server still need some careful thought - when to take what action; how to account for all child processes and their child processes; and so on.
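
One common way to account for a job's children and grandchildren is to start the job in its own process group and signal the whole group when it hangs. The following is a minimal Python sketch of that general technique, not a description of how Package Forge does or will do it; the function name run_with_timeout and its timeout and grace parameters are made up for illustration.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout, grace=2.0):
    """Run cmd in its own session; signal the whole group if it overruns.

    Returns (returncode, timed_out). Hypothetical helper for illustration.
    """
    # start_new_session=True makes the child call setsid(), so it leads a
    # new process group that its own children inherit by default.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout), False
    except subprocess.TimeoutExpired:
        pass

    pgid = proc.pid  # a session leader's process group id equals its pid

    # First ask the whole group to stop.
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the group has already gone
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        pass

    # Then force anything still left in the group.
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    return proc.wait(), True
```

A descendant that calls setsid() itself leaves the group and would not be caught by this approach, so this covers only part of the problem described above.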

Additionally, some thought has gone into possible extensions which might be added later.

Package Forge should be up and running - in a test mode for computing staff - by the end of the month.

Replacement for VMware Server

In no particular order...

Alastair tried suspending the IS guest trailblazer: suspend works.

KVM can also suspend to disk if you use the correct virsh command.

The current version of the RHEV tools still uses Windows. A Linux version is promised for "2011".

Chris made a list of some things that will need doing for COs to use KVM.

Alastair has been writing the report and will send it to Chris for comments.

Miscellaneous Development

Stephen brought back some useful issues from the Users Day and from the Deployers Meeting:

lcfg-authorise
The "allow from console" facility is broken in F13 (bug 362). Michael Gordon contributed a Perl script showing how to query D-Bus and ConsoleKit, so we can alter the component to use this method instead.
om
Support for running om methods doesn't work on Mac OS as there's no suidperl. You can get round the problem by installing a C wrapper, also called om, which executes the real om, and setting suid on the wrapper. We could do something similar for Linux so we wouldn't need suidperl. (After the meeting, Stephen discovered that suidperl has been deprecated and that we didn't need it anyway.)
Package lists
We have a problem with package lists. If you have a base list, then add other lists on top, then apply updates, you get conflicts if packages are in two places. It would be good to be able to "add a package if not in list already". You could do this in two ways:
  1. Put + on the front of all base packages, then prepend additions.
  2. Introduce a new | symbol to be used in package lists to mean "or" - that is, to mean "add the package only if it hasn't already been added". Shane has reported this (bug 367). The bug has a number of suggested patches.
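
The "add only if not already added" behaviour of option 2 can be sketched in Python as follows. This is only an illustration of the proposed semantics, not LCFG code: the function names are made up, and the sketch assumes simplified package strings of the form name-version-release in which neither the version nor the release contains a hyphen.

```python
def split_name(spec):
    """Return the package name from a 'name-version-release' string.

    Assumption for this sketch: version and release contain no hyphens.
    """
    name, _version, _release = spec.rsplit("-", 2)
    return name


def merge_package_list(entries):
    """Process entries in order and return a dict of name -> spec.

    A plain entry raises ValueError if the package is already listed.
    An entry prefixed with '|' is added only if the package is absent.
    """
    packages = {}
    for entry in entries:
        optional = entry.startswith("|")
        spec = entry[1:] if optional else entry
        name = split_name(spec)
        if name in packages:
            if optional:
                continue  # already listed: keep the existing entry
            raise ValueError(f"conflict: {name} listed twice")
        packages[name] = spec
    return packages
```

With this model, a list that names foo-1.0-1 and later |foo-2.0-1 keeps the first entry, whereas two plain entries for foo are reported as a conflict.
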
Canonical package processing
Stephen would like one canonical implementation for package processing, written in C for speed, and usable from a number of languages. We could add this to the wee projects list.
Triple quote
To more easily enter a multi-line value into an LCFG file, Shane proposed (in bug 366) the adoption of the Python concept of a triple quote. Stephen reckons that it would be trivial to add this to the (refactored!) server parser.
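
To show what a triple-quoted value might look like to a parser, here is a small Python sketch. It uses a made-up, simplified "key value" line format rather than real LCFG source syntax, and it is not the server parser; parse_resources is a hypothetical name.

```python
TRIPLE = '"""'


def parse_resources(text):
    """Parse 'key value' lines; a triple-quoted value may span several lines."""
    resources = {}
    lines = iter(text.splitlines())
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between entries
        key, _, value = line.strip().partition(" ")
        value = value.strip()
        if value.startswith(TRIPLE):
            rest = value[len(TRIPLE):]
            parts = []
            # Keep consuming lines until one contains the closing quotes.
            while TRIPLE not in rest:
                parts.append(rest)
                try:
                    rest = next(lines)
                except StopIteration:
                    raise ValueError(f"unterminated triple quote for {key}") from None
            parts.append(rest[:rest.index(TRIPLE)])
            value = "\n".join(parts)
        resources[key] = value
    return resources
```

Ordinary single-line values are unaffected; only a value that opens with three double quotes is read across line breaks.
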

In other news, Chris added the Dell PowerEdge 715 and HP DL180 to toohot and HP RAID monitoring to hwmon.

Chris will transfer the miscellaneous development tasks which we identified as priorities from the Wee Projects list to the Activities list. One of them has already been done ("Add Nagios support for the HP monitoring") - at least for RAID status.

Operational

Bakerloo still has two guests, crivvens and epping: Alastair moved the rest of them over the holiday. Chris will ask RAT and Services to move the remaining two.

Alastair tried KVM over NFS. Stephen suggested a look at Ceph, an open source distributed file system.

Stephen thinks we need to make a new F13 kernel this month. He'll show Chris what that involves.

Shane would like the option of using our DICE kernels, in order to cut down the number of new kernels that GeoSciences has to install. We could do this easily: our DICE kernel stuff could migrate to the ed layer.

It would be desirable to be able to add spare desktops to the Hadoop cluster. Chris reckons that this would currently not be trivial (though it should be). He'll have a think about it.

Also on Hadoop, it would be interesting to persuade (for instance) Miles Osborne to give us an introductory Hadoop talk.

This Week

  • Alastair
    • check nagios monitoring of sauce pkgs (and configure if not already set up)
    • set up a VM to use dr.pkgs + document url in wiki
    • code review pkgforge
    • virtualisation report (pass to Chris for comment)
    • put my desktops on MPU wiki
    • Consider personal development
    • Ask RAT whether Condor is still required (not currently on F13) - done, awaiting Tim's consideration...

  • Chris
    • ask RAT and services to move epping and crivvens off bakerloo
    • check added HP model to "toohot"
    • Consider personal development
    • RAT submit project
    • Move wee projects marked Dec 10 to activities list

  • Gordon
    • Meet with Alastair to go over bugs

  • Stephen
    • pkgforge
    • Consider personal development

-- AlastairScobie - 11 Jan 2011, ChrisCooke - 12 Jan 2011

Topic revision: r4 - 12 Jan 2011 - 17:21:19 - ChrisCooke
 