DICE environment for SL7

Description

Port the DICE environment software packages to the new LCFG SL7 platform.

This involves selecting, building and in some cases modifying pieces of software required to achieve rough equivalence to, or improvement on, the existing SL6 platform.

The first round of work was to ascertain what precisely was part of the DICE environment (and what was not, for example research and teaching packages, or RAT unit specific upgrades, both of which were covered in separate projects). Having done this, the second phase was to port and implement that environment.

The environment was to be tested first on COs, then on non-CO users.

Early work was described in SL7DICEEnvironment before development reverted to normal tracking of individual pieces of work through bugzilla and RT.

Customer

Internal / Computing.

Deliverables

  • The list of environment packages was defined and documented technically in the set of LCFG headers dice/packages/dice_sl7*_env_*.rpms.
  • The second deliverable was to port or acquire those packages (or alternatives) to DICE SL7.
  • An additional deliverable emerged during testing which involved writing various pieces of supporting software to enable the use of the LightDM display/session manager. This was needed because GDM had critical bugs and lacked features required for the laboratory examination (some of these problems have subsequently been fixed).

There was no deliverable associated with the testing effort, but it was performed incrementally by rolling out the new platform cautiously to new users with provisos on its use, and soliciting feedback via the usual routes (i.e. RT).

An extended deliverable emerged during the work to augment the yummy tool, but this was deferred.

Time

The bulk of the effort took place between Nov 2014 and Nov 2015.

Time was tracked in the usual way and amounted to ~10.5wks (~364hrs) FTE. This figure is likely to have significantly under-represented the effort required, but when taken with the other SL7 projects might give a more realistic impression of the work performed.

This effort does not necessarily include time spent by other COs working on their own software to assist the port - by its nature this project pulled in effort from colleagues either to build software (or check my builds) under their control, or to consult on expected usage of categories of machine. In particular the LightDM work was hard to categorise (see below) and the majority of this work was not tracked in this project (though may have been specified elsewhere).

Observations

Definition

Defining what constituted the environment was difficult, and a development meeting discussion reached no firm conclusions. In the absence of clear guidance, the environment was defined as the minimum configuration and packaging required to make an LCFG-managed SL7 machine into DICE SL7, including textual and graphical shells and their customisation. The idea that there might be a baseline level of configuration - distinct from LCFG but common to all DICE machines - was a notable change from SL6.

Some time was also spent deciding on a default desktop environment. It was decided that GNOME 3 (consisting of the "Gnome Classic" and "Gnome Shell" desktop environments) should be used. Given that it was the upstream default and theoretically the default upgrade path for GNOME 2.x users, it was deemed to be the natural choice.

SL6 origins

The OS upgrade was taken as an opportunity to learn from the problems of the outgoing setup (no attempt to alter the SL6 package lists was made, as these were clearly to be deprecated). The starting point of the work was therefore the dice_sl6_env package list.

It was noted that the SL6 environment contained many packages which went far beyond the basic environment one might have expected to find on all machines. Thus, the (generalised, non-project) goals of reducing server attack surface, optimising disk space usage, etc. could be assisted by pruning large numbers of packages previously defined as environment. For example: the "alpine" mailer was previously included on all SL6 machines; though small and low-impact there was no good rationale for its inclusion on every single SL6 machine. In SL7 it was determined that only machines intended for interactive use (so, excluding web servers, but including login servers, for example) might require a mail client. Alpine was therefore applied to the _user package list, included only by machines expecting interactive use.

This understanding helped define, from three package lists (core, interactive and graphical) two "dimensions" of DICE hosts, non-interactive / interactive and non-graphical / graphical (but for simplicity's sake, it isn't possible to produce a graphical but non-interactive host, and as yet there's no clear use-case for such a combination).
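The relationship between the two dimensions and the three lists can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes the lists are cumulative (every host takes core, interactive hosts add interactive, graphical hosts add graphical), which the report implies but does not state outright.

```python
from typing import List


def lists_for_host(interactive: bool, graphical: bool) -> List[str]:
    """Return the package lists a host would include (hypothetical helper).

    Assumes the lists are cumulative: core is always present, and a
    graphical host is always also an interactive host.
    """
    if graphical and not interactive:
        # The report notes this combination cannot be produced.
        raise ValueError("a graphical host must also be interactive")

    selected = ["core"]
    if interactive:
        selected.append("interactive")
    if graphical:
        selected.append("graphical")
    return selected
```

Under these assumptions a web server would take only core, a login server core and interactive, and a desktop all three.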

It was simplest to start with the non-graphical side (effectively the bash shell) and at this level it was clear that much of this was covered by the bashdefenv system and minimal patches to the shell itself. So the work required to port this consisted largely of testing compatibility with bash and the bashdefenv RPM. Once this was done it was clear that most of the configuration belonged to packages outwith the environment (i.e. plugged into the bashdefenv system) and so beyond testing for forward-compatibility, this was mostly out of scope.

Having decided upon the new default desktop environment, the graphical environment turned out to be the opposite extreme: so much had changed that very little SL6 customisation, all of which was based around GNOME 2.x, was compatible or required. It was decided that the best approach for the environment was simply to log faults in the usual way, accepting that equivalence wasn't a desirable outcome.

Package management

For ease of ongoing software maintenance, the new package lists were further split into two subtypes, in effect determining whether versions were to be automatically updated by upstream RHEL/SL or kept manually updated. The aim was to reduce the latter as much as possible. This does make maintaining the lists slightly more complex, as packages could be placed into one of six lists (the three lists above, each split into two subtypes). This strongly indicated that some work on the established tools for CO-side package management would be required.

As a result of the six lists, a new deliverable was added to the project (and subsequently deferred as not necessary for completion, but nice-to-have): improvements to the yummy tool, and distribution of the yummy workflow scripts currently used by MPU at OS point-releases and updates. The idea was to make the yummy lists the master files for all CO package list editing. This would reduce CO effort in maintaining package lists far beyond just the environment project.

The work consisted of improving yummy's parsing to handle all the detail currently mastered in the package lists, and generalising the MPU-specific yummy workflow scripts and LCFG structure to allow COs to use them on any sets of packages without loss of information. It remains a desirable improvement which could be performed at any point in the future.

As a stop-gap measure the yumtopklist tool was updated and distributed to the CO utils area. This new version segregates updates by source, so that COs no longer need to work out which packages need to be placed into which list.
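The idea of segregating updates by source can be sketched as below. This is not yumtopklist's actual code, input or output format; the function name, the repository names and the "upstream"/"local" labels are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical repository identifiers; the real DICE repository names
# are not given in this report.
UPSTREAM_REPOS = {"sl", "sl-security"}


def segregate_by_source(updates):
    """Group (package, repo) pairs by where the package comes from.

    Packages from an upstream repository go in the "upstream" bucket
    (automatically updated); everything else goes in "local"
    (manually maintained).
    """
    buckets = defaultdict(list)
    for package, repo in updates:
        source = "upstream" if repo in UPSTREAM_REPOS else "local"
        buckets[source].append(package)
    # Sort each bucket so the result is stable regardless of input order.
    return {source: sorted(pkgs) for source, pkgs in buckets.items()}
```

Each bucket would then correspond to one subtype of a package list, so the CO only has to choose between core, interactive and graphical.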

Packaging

The "main" portion of the work, porting and packaging the environment software, turned out to be comparatively small as the previous deliverable cut out much of the environment. Most of the packages involved were trivial to rebuild, depending on largely stable software such as bash. Many of the more complex packages were deferred for - or shared with - other projects such as SL7 R&T, SL7 RAT, SL7 server or even SL7 lab exam environment.

Larger package sets such as LaTeX and the Desktop Environments (GNOME, MATE, KDE, etc.) took a little effort to manage dependencies cleanly, but required no porting per se, as the packages were provided (largely as yum groups) by the upstream OS. They required individual testing and fixes to small but critical issues such as keyboard mapping.

Testing & Support

Testing revealed several problems with the DICE environment on SL7(.0) including greeter, window / display manager and other fundamental differences. As of SL7.0 some of the GNOME infrastructure seemed to be in a very poor state. What's more it turned out to be virtually impossible to test the various desktop environment configurations available to users, other than to ask for beta testing and feedback and to seek out users of each environment for early testing. Moving COs early onto SL7 was an important part of this.

It was assumed at the start of the project that GNOME 3.x would be the default and only supported desktop environment on SL7. However on testing, as of SL7(.0) several problems presented themselves with the GNOME 3.x environment which cast our choice as less of a good fit than anticipated. GNOME 3.x had several bugs which prevented "supported" pieces of software from working, and indeed by design the new release removed several features on which upgrading users used to rely. This meant that in several cases the only way computing staff could support users was to recommend a switch to MATE, a maintained GNOME 2.x fork. Furthermore it transpired that fewer computing staff than before had any practical experience whatsoever of GNOME 3, which made implicitly supporting the environment rather harder.

(By way of background, in SL6 there was one officially supported desktop environment: GNOME 2.x. In practical terms this meant that, should users encounter any problem with any piece of software whilst using anything but GNOME, frontline support's first suggestion would be to revert to the supported desktop and try again. This also allowed us to build up a knowledge base of known problems and workarounds for common tasks within the GNOME environment.)

All of the above led to an enthusiastic but largely unresolved discussion of what precisely it means to support a desktop environment, or indeed any piece of software. It was agreed (within RAT at least) that we would have to be more pragmatic about relative levels of support, and that supporting the use of software wasn't the same as guaranteeing solutions to any particular problem on any particular WM. In effect, "we support our users, not the software".

However, support or not, it became clear that it was important to maintain access to alternative desktop environments both to suit users' needs as well as to provide maximum software compatibility.

LightDM

This was an additional and fairly large piece of work which wasn't accounted for in the original descriptor but was nonetheless identified by RAT and MPU to be absolutely necessary to the SL7 port. Details of the work involved are well documented elsewhere and accordingly excluded from this report. A good starting point is the LoginScreens and SwitchDesk pages.

Out of Scope

We intentionally excluded the R&T software / environment, and this was a reasonable exclusion, not least because project 353 was subsequently defined to fulfil this. Other projects carved out their own niche and to some extent this project ended up being defined as those bits not claimed by the others (though it probably did not go far enough; see conclusions).

Early on in the project it was identified that a more holistic view of the graphical environment was to be excluded from the project, and this was reasonable given the definition of this project, but at that point a new project should have been defined to deal with this.

An important part of the DICE environment is the LaTeX infrastructure, and although this was effectively ported/upgraded (using the default upstream TeXLive distribution in both SL6 and SL7) it was later found that the latter was inadequate in many ways. An immediate workaround was sought, and the longer term effort to replace this with a more fully-featured LaTeX distribution is not covered by this project.

At one point it wasn't clear which of Cinnamon or MATE should be maintained, and package lists for both were produced. It was decided that the effort of maintaining both was excessive, and Cinnamon (the lesser supported within the RHEL ecosystem) was dropped.

Conclusions and future Work

The work identified - filling the environment lists - was completed to our satisfaction as far as I can tell. But equally it could not be said that the SL7 desktop transition has been entirely smooth and I would lay a great deal of the blame against the early specification phase of this project. Indeed the overarching motif of this project was one of borderline-philosophical discussion over basic aspects of the computing provision. This was almost certainly not the correct place to have any of these discussions, but highlights that the project specification (and meta-project planning) could have been improved. (That, or possibly an unmet need for such philosophical discussion within the computing body!)

Unfortunately it was not clear (and in any event is only clearer in retrospect) what project completion would look like beyond the rather mechanical "all packages ported". So I'd strongly recommend that future environment upgrade projects take a more holistic approach - splitting into two projects, perhaps: a child project covering software / package list ports, and a parent project shaping what a user desktop should actually consist of, and assessing the overall user experience of moving to the new platform. Migration problems including (seemingly) trivial desktop issues such as the screensaver were overlooked until after many users had migrated, and this project would have been a good time to have caught such things. Indeed it seems that, though users' needs are usually addressed through iterative development and early release, there's maybe a place for more explicit user advocacy in this project if not others.

The last recommendation (largely to myself, and not for the first time) is to write the final report concurrently with the work. Otherwise, as has been the case for this project where there's been an undue (and only partly avoidable) delay between work and report completion, the report will have to be constructed "forensically" from notes, and there's a risk that many of the lessons learned - and reasoning behind the decisions made - will be lost.

-- GrahamDutton

Topic revision: r4 - 23 Jan 2019 - 13:44:28 - GrahamDutton
 