2014 T3

25th August 2014 --- 28th December 2014

Computing Projects

The unit spent only 14% of its time on specific computing projects. Note that a significant amount of development work is not included in this figure, including: InfHR Theon Migration; mandatory changes for Direct Admissions and Tier4 Engagement; migration of remaining legacy database processes (PAVD and PGT conversion); and GPU and CDT related procurement. Much of the DICE SL7 project effort was also left out in error. There was also significant operational work such as: cluster decommissioning; RT upgrade; complex teaching software installations (e.g. Axiom).

Plans for 2015 T1

The principal aim is to increase the time spent on computing projects - to date this is 28% (not including CDT related projects). We are trying some different approaches to managing time to achieve this. Another BAD-RAT day is planned, as the first one was successful in clearing low-hanging operational and development tasks.

Continuing Professional Development

This is a summary of recent CPD activity carried out by unit members.


(Now in headers; unfinished of course)

After a bit of fiddling about last year, I returned to this and got it working and (mostly) configured using LCFG. It's Yet Another Graphing system, but its selling points include use of tried & tested RRD at the back end and virtually no server infrastructure (just a daemonless, polling "master" system and a super-lightweight CGI frontend). It is also super-easy to get lots of graphs out of it "out of the box", including lots of PostgreSQL-specific plugins that appear to have been written by one of the postgres maintainers at one point, which may or may not say something significant!

Nodes can be added on SL6 and SL7 and, despite a not-totally-functional SL7 apacheconf (there appear to be problems with fastcgi amongst other things), I've set up an SL7 master using an improvised Python-based CGI server. Ask me for a demo...

Limitations? Not too many. If apache is handling the CGI then we're good for viewer authentication. CGI -> node auth is by IP using a telnet-like interface; that could probably be replaced with a tunneled or otherwise Kerberised interface I'm sure.


(Now in LCFG repo)

Another blast from the past, this is me trying to plug the gap in the infrastructure between Chris' sleep component (which uses the standard SessionManager interface to detect activity on an X session) and more lightweight window managers (such as fvwm and wmii) which don't use these mechanisms. I had something partially working on SL6 but never put it together formally, and SL7 gives me an opportunity to formalise this.

Working with git/github

(not really tangible)

Last year (definitely outwith this CPD period) I wrote a piece of software to replace fan speed controller firmware on my laptop (having replaced its hard disk, the Mac refused to read temperatures off the drive and engaged an "emergency mode" with both fans spinning at 6000RPM; the controller chooses an appropriate fan speed based on the CPU temperature).

The relevant part to this CPD is that I decided to refine and publish this script on GitHub as a test project to get a sense of how git, GitHub and the publishing process worked in practice as an author.

What can I say about this? I'm not sure it's been much of a trial, but I succeeded, mangled the code and a few local branches, and had a go at rewriting history, as you're not meant to do when gitting. I don't think this is a process that can be considered "done" given I'm still learning SVN...

GHOST mitigation

(an unused RPM and two homepages)

Definitely not within RAT's remit, but since we're splitting hairs I'll call this CPD: after all I could just have sent a few text messages. I happened to be online and the first to see the GHOST vulnerability, so went through the motions of finding the glibc patch from the libc repository and applying it to our most recent sources. At the same time (glibc is slow to build) I worked on a small chunk of code to test the vulnerability (easy, mostly copy & paste), and also to test that normal lookups still worked post-patch (much harder: some of the gethostbyname functions are really hard to use and minimally documented). Having done all of this and developed a sense of the level of threat (nowhere near Shellshock...) I installed it to prove I could, then abandoned the project.
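As a rough illustration of the "normal lookups still work" half of that testing (not the code written at the time, which is not reproduced on this page), a minimal sketch might look like the following. It assumes Python's socket module is an acceptable stand-in for calling the gethostbyname family directly; the function names and the choice of test hostnames are hypothetical.

```python
import socket


def check_lookup(name):
    """Try one forward lookup; return (ok, detail).

    socket.gethostbyname_ex() wraps the C library's gethostbyname
    family, so a broken patch should show up as an error here.
    """
    try:
        _hostname, _aliases, addresses = socket.gethostbyname_ex(name)
    except OSError as exc:  # socket.herror and socket.gaierror are subclasses
        return False, str(exc)
    return bool(addresses), ", ".join(addresses)


def run_checks(names):
    """Run check_lookup() over several names and collect the results."""
    return {name: check_lookup(name) for name in names}


if __name__ == "__main__":
    # Hypothetical sample names: a numeric address and the local host.
    for name, (ok, detail) in run_checks(["127.0.0.1", "localhost"]).items():
        print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
```

This only exercises the well-behaved path; it does not attempt to trigger the vulnerability itself.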

As a secondary bit of CPD I expanded my long-standing "rat-servers" script which lists all LCFG-defined servers in the RAT unit group, and used it as part of a tool to let us track future glibc and kernel updates across our servers (and other units' should they choose to take advantage).
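The tracking tool itself is not shown here. As a hedged sketch of the general idea, assuming the installed package version for each server has already been collected somehow, something like this would flag the hosts still below a required version. The host names, version strings and function names are all made up for illustration, and the comparison is a deliberately crude numeric one, not rpm's real version-ordering algorithm.

```python
import re


def version_key(version):
    """Reduce a version string to a tuple of its integer chunks.

    Simplification: non-numeric parts are ignored, so this is only
    a rough ordering and not how rpm compares versions.
    """
    return tuple(int(chunk) for chunk in re.findall(r"\d+", version))


def hosts_needing_update(installed, minimum):
    """Return the sorted hosts whose installed version is below `minimum`."""
    floor = version_key(minimum)
    return sorted(
        host for host, version in installed.items()
        if version_key(version) < floor
    )


if __name__ == "__main__":
    # Hypothetical data: host name -> installed glibc version.
    installed = {"alpha": "2.12-1.149", "beta": "2.12-1.132", "gamma": "2.17-55"}
    print(hosts_needing_update(installed, "2.12-1.149"))
```

In practice the host list would come from something like the "rat-servers" output rather than a hard-coded dictionary.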


I've been playing with this on and off for a year or so, partly as a possible replacement for GPFS. Having chatted with Graham, we've got a trial set up to use it to replace NFS for the exam machines, and it could conceivably replace other NFS implementations. Main activities in the Sept-Dec timeframe were upgrading to 3.4.2, doing some benchmarking and helping set up Graham's test implementation.


This is a mad scheme to internally tweet host status information and use data-mining type tools to look at message trends. So far I've got GNU social (an open source Twitter-like thing) up and running as little bird.inf.ed.ac.uk (not even not a service) and have started writing something to tweet LCFG status messages from hosts. This is very long term, very hand-wavy and is a rejected idea from the innovation week some years ago. In Sept I came across a library called atom which provides an API to social media.

This has largely stopped for the moment because of: Fedora 21 (below)

Performance tuning/monitoring

Continuing on from the stuff I wrote up before. This has three threads. Firstly, I've been looking at getting more monitoring information using various tools (the standard iostat/vmstat etc.) and some Linux-specific ones (perf) with a view to generating historical data. This has resulted in a more up-to-date (and virtualised) ganglia test setup, ganglia.inf.ed.ac.uk, which has more lower-level monitoring information through plugins. CPD in November was mainly setting up ganglia.inf.ed.ac.uk.

Secondly, I'm thinking about automating the benchmarks previously generated so that we would have a database of relative performances of kit. This hasn't got much further than shortlisting some fairly standard benchmarks and realising that this has hooks in the inventory project.

Thirdly, I spent a chunk of December looking at various caching strategies: using SSDs as caches, looking at how the AFS cache is used and how the various memory caches are used.

CPD in November was split between setting up ganglia.inf.ed.ac.uk and looking at bcache and the AFS cache.
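For the second thread above (a database of relative benchmark performance), nothing has been built yet, so the following is purely a sketch of what such a store could look like. It assumes a single SQLite table and a "score where higher is better"; the table layout, function names and sample hosts are all hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the results database with one simple table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS result "
        "(host TEXT, benchmark TEXT, score REAL)"
    )
    return db


def record(db, host, benchmark, score):
    """Store one benchmark score for one host."""
    db.execute("INSERT INTO result VALUES (?, ?, ?)", (host, benchmark, score))


def relative_scores(db, benchmark, baseline_host):
    """Return each host's score divided by the baseline host's score."""
    (base,) = db.execute(
        "SELECT score FROM result WHERE host = ? AND benchmark = ?",
        (baseline_host, benchmark),
    ).fetchone()
    rows = db.execute(
        "SELECT host, score FROM result WHERE benchmark = ? ORDER BY host",
        (benchmark,),
    )
    return {host: score / base for host, score in rows}


if __name__ == "__main__":
    db = open_db()
    record(db, "oldbox", "io-test", 100.0)  # hypothetical hosts and scores
    record(db, "newbox", "io-test", 250.0)
    print(relative_scores(db, "io-test", "oldbox"))
```

Tying the host column to the inventory project's records would be the obvious next step, but that link is not sketched here.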

Fedora 21

I use the "odd" releases of Fedora on my laptop and 21 was looming in December. Given that we're now advised to encrypt laptops, and as I wanted to play with bcache and suspending to SSD, I spent some time in December doing test installs. I found it was relatively easy to encrypt filesystems at install time, that bcache was fiddly to set up, that openafs doesn't work with kernel versions after 3.17.4 and that suspending to SSD is not as easy as you would think.

Bootstrap CDN

Investigated using the Bootstrap fonts (CSS only) and CDN as a better way to handle website presentation, specifically for targeting multiple viewing devices. It's very slick, although it requires JavaScript for some of the more fancy stuff. Rolled this out for the new Computing Projects web site and very likely to use it for any future Theon Portal/UI refresh.


Played with the use of SVG files under browsers - rendered as scalable vector graphics directly (following on from a hint from Graham). This works very nicely and DOT files can be converted directly to SVG for this purpose. Intention is to use this for the automatically generated ER graphs for the Theon Model documentation (these currently produce DOT files which are converted to PNGs).
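The DOT-to-SVG step mentioned above can be done with Graphviz's own `dot` tool using its `-Tsvg` output format. As a minimal sketch, assuming Graphviz is installed and that the generated DOT files sit on disk, a wrapper could look like this (the function names and the example file name are hypothetical, and this is not the Theon documentation build itself):

```python
import subprocess
from pathlib import Path


def dot_to_svg_command(dot_file):
    """Build the Graphviz command that renders `dot_file` as SVG.

    The output is written alongside the input, with a .svg suffix.
    """
    src = Path(dot_file)
    return ["dot", "-Tsvg", str(src), "-o", str(src.with_suffix(".svg"))]


def convert(dot_file):
    """Run the conversion; raises CalledProcessError if dot fails."""
    subprocess.run(dot_to_svg_command(dot_file), check=True)


if __name__ == "__main__":
    # Hypothetical file name; only prints the command rather than running it.
    print(" ".join(dot_to_svg_command("er.dot")))
```

The resulting SVG file can then be referenced from a page in the same way as the PNGs are now.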

Presentation Media

Tried out the next best thing for presentations since PowerPoint, following on from an IS investigation - Prezi. Quite nice with integrated smooth animations. Suffers from all the same failings as PowerPoint.

PostgreSQL Extensions

Investigated (principally just reading the documentation at this stage) how to write PostgreSQL extensions and modules. The aim is to replace the Theon eventd process with a standard background worker (pgsql v9.3+) and also to integrate the gurgle report generator as a module, with the advantage that report rebuilding could be triggered automatically by standard database rules and events.

LCFG Local Configuration

Continuing to look at "isolated" LCFG instances for configuration (of Theon specifically, but this would be more general). The idea is that headers and profile could be shipped with sxprof and rxprof for a standalone LCFG-configured service that does not require any LCFG infrastructure (server etc.). This would make services configured by LCFG more portable and easier for others to deploy on individual non-LCFG managed machines as tests. The basics of this work now, i.e. standalone header and profile compilation (with some mods to the compiler by Stephen), but more work needs to be done in constructing the filesystem hierarchy of components etc. to get it up and running standalone.

-- TimColles - 09 Mar 2015

Topic revision: r1 - 09 Mar 2015 - 09:55:39 - TimColles