Increasing Energy Savings from DICE Desktops: final report

Devproj 234.

Hours budgeted and used

  • T3 2012: 9 hours
  • T1 2013: 54 hours
  • T2 2013: 7 hours

The project was allocated 2 weeks of effort - 70 hours - in T1 2013 and used exactly this amount in total.

What the project was expected to produce

  • A data feed of statistics showing the sleep rate of DICE desktops
  • Possible improvements to the LCFG sleep component
  • Possible improvements to the energy-saving guidance given to users of DICE desktops.

What the project produced

  • An automated daily report of sleep & wake events on DICE machines.
  • The daily report is occasionally used to make graphs of machine sleep over time (see figure 2). The graphs are made by David Sterratt.
  • The LCFG sleep component can now find out how recently a machine's keyboard and mouse were used. This is crucial information when deciding whether it is safe to suspend a machine which still has a login session running. Until this change was introduced the component would refrain from suspending machines while a login session was in progress, meaning that many office desktop machines, where a single login session can last for months, would rarely or never sleep. This change produced a noticeable decrease in the Forum's power consumption (see figure 1).
  • Following helpful comments from test users, the sleep documentation was revised to clarify some points of confusion.
  • The sleep component's code is as ever in the LCFG Subversion repository.
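
The idle check described in the list above can be sketched as follows. This is an illustration only, not the component's actual code (which lives in the LCFG repository): the MachineState type, the safe_to_suspend function and the 30-minute threshold are all hypothetical, and other checks the component makes (such as pending cron jobs) are left out.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    """Hypothetical snapshot of what the sleep component knows about a machine."""
    session_active: bool   # is a login session in progress?
    idle_seconds: float    # time since the keyboard or mouse was last used


def safe_to_suspend(state, idle_threshold_seconds=1800.0):
    """Decide whether a machine may be suspended.

    Old behaviour: never suspend while a login session exists.
    New behaviour (sketched here): suspend during a session too, provided
    the keyboard and mouse have been idle for long enough.
    The 30-minute default is an assumption, not the real setting.
    """
    if not state.session_active:
        return True
    return state.idle_seconds >= idle_threshold_seconds
```

With these assumptions, a desktop whose login session has been idle for an hour would be allowed to sleep, while one used a minute ago would not.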

Follow-on work

There are a number of small jobs which flow from this project and which should be tackled as time permits:

Handle 'at' jobs
Since its inception the sleep component has parsed and sorted the times of cron jobs, so that the machine wakes in time to run them where necessary and sleeps safely at other times. This was never done for 'at' jobs because there was no handy Perl module to parse the 'at' command's time specification, which allows a wide variety of colloquial expressions of time and date. Stephen Quinney has now found and installed such a module - DateTime::Format::GnuAt - so the component could now be changed to parse waiting 'at' jobs, find out when they are due to run, then (as it does for cron jobs) sleep the machine safely and wake it in time to run the job. This would slightly increase the amount of sleep of most of our machines. Until this has been done the sleep component will continue with its current behaviour, which is to refrain from sending a machine to sleep whenever the 'at' queue has a job in it. The 'at' command is used by the autoreboot component to trigger automatic reboots.
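
As a rough sketch of the proposed behaviour, the decision could look like the following once the due times of waiting 'at' jobs have been parsed. The function name plan_sleep, the five-minute wake margin and the ten-minute minimum sleep are assumptions made for illustration; the real component is written in Perl and its cron handling may differ in detail.

```python
from datetime import datetime, timedelta


def plan_sleep(now, due_times,
               wake_margin=timedelta(minutes=5),
               min_sleep=timedelta(minutes=10)):
    """Return (should_sleep, wake_at) for a machine with pending jobs.

    due_times: datetimes at which waiting jobs are due to run
               (assumed to have been parsed already).
    wake_margin: how long before the first job the machine should wake.
    min_sleep: shortest sleep considered worthwhile.
    Both defaults are hypothetical values.
    """
    pending = sorted(due_times)
    if not pending:
        # Nothing queued: sleep with no timed wake-up.
        return True, None
    wake_at = pending[0] - wake_margin
    if wake_at - now < min_sleep:
        # A job is due (or overdue) too soon for sleep to be worthwhile.
        return False, None
    return True, wake_at
```

The current behaviour corresponds to returning (False, None) whenever due_times is non-empty; the sketch instead lets the machine sleep until shortly before the first job.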
Session manager shim
The sleep component's detection of keyboard/mouse idleness, introduced as part of this project, depends on the user having used the GNOME environment, or at least the GNOME display manager. Most people use GNOME but some choose other environments such as KDE, fvwm, Ice or Xfce. Graham Dutton has produced a substitute "GNOME display manager shim" for users who do not use GNOME. This gives the sleep component enough information to detect keyboard/mouse activity on these machines too. The shim still needs to be installed and documented, so that sleep-while-logged-in is enabled for these users in as convenient and automated a way as possible.
Try the hybrid suspend-and-hibernate sleep method
With the recent power failure in the George Square area, and with reports of a shortage of electricity in the next few years, it might be wise to make machines more tolerant of power failure. We might achieve this by using the hybrid suspend-and-hibernate sleep method rather than the current "suspend" method. The suspend method keeps the machine's state in memory, which remains powered up while the machine sleeps. Machines take only a few seconds to suspend or resume in this way, but sleep takes more power and the machine's state is lost if power is lost. The alternative "hibernate" method saves the machine's state to disk. This allows the machine to sleep more economically, as fewer components are still powered up. It also means that if power is withdrawn from the machine altogether, then restored some time later, the machine can still recover to its exact pre-hibernate state. The down side of "hibernate" is that it is a lot slower than suspend, typically taking a minute or two in each direction. The hybrid "suspend and hibernate" method - available on some operating systems (e.g. MacOS) but not yet functional in Scientific Linux - suspends the machine in the usual way but at the same time uses hibernation as a backup recovery method should power fail during sleep. Using this would mean that entry to sleep would take longer but waking should still be quick, and sleeping DICE machines would be able to return to their pre-sleep state even if power had been lost during sleep. Once the OS supports it we should try using suspend-and-hibernate.
Tidy old sleep events from BuzzSaw
DICE machines' sleep and wake events are now recorded in the BuzzSaw database. Although the sleep events don't directly identify individual people they do count as personal data, since they could be combined with inventory data to deduce information about the probable computer use patterns of individual people. It's therefore necessary to anonymise, delete or justify keeping these records in the database. It doesn't seem necessary to have several years' worth of sleep events in the database anyway, and having them there could potentially slow queries. For these reasons we'll need to establish a procedure to clear out older events and keep anonymised aggregated data elsewhere.
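
One possible shape for such a procedure is sketched below: events older than a retention period are reduced to per-day counts with no hostnames, and only recent events are kept in full. The event layout (timestamp, hostname, kind), the function name and the one-year retention period are all assumptions; the BuzzSaw schema is not reproduced here.

```python
from collections import Counter
from datetime import datetime, timedelta


def split_and_aggregate(events, now, keep=timedelta(days=365)):
    """Separate recent events from old ones and anonymise the old ones.

    events: iterable of (timestamp, hostname, kind) tuples, where kind is
            "sleep" or "wake" (a hypothetical layout).
    Returns (recent, daily_counts): recent events are kept unchanged;
    old events survive only as counts keyed by (date, kind), so no
    hostname is retained for them.
    """
    cutoff = now - keep
    recent = []
    daily_counts = Counter()
    for timestamp, hostname, kind in events:
        if timestamp >= cutoff:
            recent.append((timestamp, hostname, kind))
        else:
            daily_counts[(timestamp.date(), kind)] += 1
    return recent, daily_counts
```

The old rows could then be deleted from the database and the aggregated counts stored elsewhere.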
Regular automatic graph production
David Sterratt's graphs (see figure 2) could be produced automatically at regular intervals and made available via the web, as the Infrastructure Unit does for much of its data. This should be quick to do (we already have David's code) and would provide a range of graphs which are not just interesting but would also let us monitor changes in sleep patterns easily and promptly.
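
The data behind a graph like figure 2 is a running count of sleeping machines. The sketch below shows one way to derive it from sleep and wake events; it is not David's code, and the event layout (timestamp, hostname, kind) and the function name are assumptions. Plotting is left out.

```python
from datetime import datetime


def sleeping_counts(events):
    """Return a list of (timestamp, number of machines asleep) pairs.

    events: iterable of (timestamp, hostname, kind) tuples, where kind is
            "sleep" or "wake" (a hypothetical layout).
    One pair is produced after each event, in time order.
    """
    asleep = set()
    series = []
    for timestamp, hostname, kind in sorted(events):
        if kind == "sleep":
            asleep.add(hostname)
        elif kind == "wake":
            # discard() tolerates a wake with no recorded sleep before it.
            asleep.discard(hostname)
        series.append((timestamp, len(asleep)))
    return series
```

A scheduled job could run this over the latest events and pass the resulting series to whatever plotting tool is preferred.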

Points to note

BuzzSaw and logger
BuzzSaw makes it easy to automatically gather syslog data from any number of machines, search the data for events of interest, and generate regular reports based on those events. In addition /usr/bin/logger can be used to insert arbitrary data into syslog (as the sleep component does, see lcfg/options/sleep.h). The combination is simple and powerful. Information on logger can be found in its man page. BuzzSaw documentation can be found in the following places:
  • In the BuzzSaw/docs directory of the LCFG source repository.
  • By logging in to the system log host and typing man -k BuzzSaw.
Stephen Quinney helped a lot in getting BuzzSaw and lcfg-sleep working well together, to the extent of reimplementing parts of BuzzSaw.
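
As an illustration of the logger half of this combination, the sketch below builds a /usr/bin/logger command line that writes a tagged key=value message to syslog. The tag and the message format are made up for the example and are not the ones lcfg-sleep uses; only logger's -t (tag) option is relied on.

```python
import subprocess


def build_logger_command(tag, event, **fields):
    """Build the argument list for a logger call.

    tag: syslog tag passed to logger with -t (hypothetical value below).
    event and fields: written as space-separated key=value pairs, a
    format chosen here only to be easy to search for later.
    """
    parts = [f"event={event}"]
    parts += [f"{key}={value}" for key, value in sorted(fields.items())]
    return ["/usr/bin/logger", "-t", tag, " ".join(parts)]


def log_event(tag, event, **fields):
    """Insert the message into syslog by running logger."""
    subprocess.run(build_logger_command(tag, event, **fields), check=True)
```

For example, log_event("sleep-demo", "sleep", idle=1800) would log the message "event=sleep idle=1800" under the tag "sleep-demo", which a log-processing tool could then pick out by tag.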
Rethink
Before the project started, the idea had been to produce statistics showing how much DICE machines were sleeping, then work out from those what might best be done to improve sleep rates. However, it was fairly clear in advance that giving the sleep component the ability to detect recent keyboard/mouse activity during login sessions would give good results, as this had for a while been a conspicuous gap in the sleep component's armoury. When it also became clear that there would be a delay before BuzzSaw could be used, the sleep component improvement was tackled first. Despite being "the wrong way round" this approach seems to have worked.
Beta testing
Introducing sleep in the midst of login sessions was judged radical enough to warrant testing the change with users before introducing it across the board. A group of testers was recruited. This proved very useful. The test team suggested improvements to the software, and their questions and suggestions also led to a number of improvements to the documentation, reducing or eliminating some confusing or contradictory points. Improvements were also made to the configuration of the sleep component as a result of the beta testing. I'd like to thank the beta test team for their help and ideas: David Sterratt, Maria Walters, Allan Clark, Stella Frank, Erik Tomusk, Amy Isard, Alison Downie, Bob Fisher, Sander Keemink, Elaine Farrow, Perdita Stevens, Matthias Hennig, Fergus McInnes, Miles Gould, James McKinna, Douglas Howie and Kira Mourao.
Non-event
Possibly thanks to the beta testing, the actual introduction of sleep during login sessions was something of a damp squib. Very little was observed other than a marked increase in the number of machines sleeping.

Figure 1

A graph of Informatics Forum power consumption for the week including 2013 week 22, when consumption fell as a result of the installation of the new version of the sleep component.
ForumPower-mm.png

Figure 2

A graph showing how many DICE machines were sleeping in the week of 25/05/2013. The steady increase in sleeping machines in the early hours of Thursday morning was caused by the new feature being installed on DICE desktops and those machines subsequently falling asleep.
sleep-stats-2013-05-25.png

Topic revision: r4 - 25 Jul 2013 - 16:00:48 - ChrisCooke
 