MPU Meeting Tuesday 1st March 2011

Software Build Farm

Whilst setting up the PkgForge service for SL6 it became clear that it would be useful for a new platform to be "active" without being in the default list of platforms. This makes it possible to explicitly register build tasks for SL6 without SL6 also receiving every job for which the user expressed no particular platform requirement. Stephen has modified the code to work in this way. Although not essential for the service to function correctly, the new feature will work better with an update to the job submission tool. For this reason the new feature is turned off until all clients have updated their packages (expected to be the case on Wednesday 2nd March). Once this is done Alastair will submit all the LCFG SL6 "standard" packages for building.
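
The "active but not default" behaviour can be illustrated with a minimal sketch. This is not the PkgForge code; the Platform class, the select_platforms function and the platform names are all hypothetical and only show the selection rule described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str      # hypothetical identifier, e.g. "sl6"
    active: bool   # builders exist and will accept tasks
    default: bool  # used when the submitter names no platforms


def select_platforms(platforms, requested=None):
    """Return the names of the platforms a job should be built on."""
    active = [p for p in platforms if p.active]
    if requested is None:
        # No explicit requirement: only the active, default platforms.
        return [p.name for p in active if p.default]
    # Explicit request: any active platform qualifies, default or not.
    wanted = set(requested)
    return [p.name for p in active if p.name in wanted]


# Assumed example data: SL6 is active but kept out of the default list.
PLATFORMS = [
    Platform("f13", active=True, default=True),
    Platform("sl5", active=True, default=True),
    Platform("sl6", active=True, default=False),
]
```

With this data a job submitted without a platform list would go to f13 and sl5 only, whereas a job that explicitly asks for sl6 would be registered for sl6.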

SL6

Stephen has configured mock for building SL6 packages on F13. This was then used to build the SL6 LCFG "core" packages.

Alastair installed SL6 and configured it to use Informatics LDAP, Kerberos and AFS services.

Alastair has generated the LCFG base list using the SL6 "minimal server" install option with a few big packages, such as java, removed. Stephen is going to look into reducing the number of Perl modules included by default (currently 60 packages). The next step is to create the installbase list. Previously this was done as part of the installer development phase. This time we are using the F13 installer to bootstrap us into an LCFG-managed SL6 system, so we need that package list a lot earlier.

As part of creating a mostly functional SL6 LCFG profile Stephen suggested scanning all the LCFG headers for LINUX_F13 macro usage. Nearly everything which applies to F13 will also apply to SL6.
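
A scan of this kind could be done with a short script along the following lines. It is only a sketch: the function name, the ".h" suffix and the directory passed in are assumptions, and the word-boundary match deliberately ignores longer macro names that merely start with LINUX_F13.

```python
import re
from pathlib import Path

# Matches the macro as a whole word only.
MACRO = re.compile(r"\bLINUX_F13\b")


def find_macro_uses(root, pattern=MACRO, suffix=".h"):
    """Yield (path, line number, line) for each header line using the macro."""
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line.strip()
```

Each hit is a place where an SL6 equivalent of the F13 condition probably needs to be considered.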

Alastair has checked over the upstart configuration and sorted out the boot-time ordering for the various services we require.

Problems with how the LCFG auth component manages the /etc/passwd and /etc/group files were discussed. Currently these files are managed entirely by LCFG using resources and a template which is provided in a platform-specific lcfg-defetc file; anything added by a package script will be overwritten and lost. The big issue is that we do not know what users and groups the various packages might need to add. Previously we have avoided the problem by installing "everything" and then harvesting the passwd and group files to create the templates. It is not clear if this approach will continue to work, particularly as we no longer intend to install "everything". As a first step, Stephen suggested that we scan all the packages in the SL6 sites mirror to find those with install-time scripts which add users or groups. This should give us some idea of the scale of the problem. An alternative approach would be to ditch the template files entirely and allow most users and groups to be added automatically when required, whilst merging in any extra users and groups from LCFG resources. It is clear, whatever we decide for SL6, that the LCFG auth component management of the passwd and group files needs to be improved, so we will add a mini-project.
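
The proposed package scan might look roughly like the sketch below. It assumes the scriptlets are obtained with "rpm -qp --scripts" for each package file; the function names and the list of account tools searched for are illustrative choices, not an agreed design.

```python
import re
import subprocess

# Assumed set of tools that package scripts use to create accounts.
ACCOUNT_TOOLS = re.compile(r"\b(useradd|groupadd|adduser)\b")


def account_commands(scriptlets):
    """Return the scriptlet lines that invoke a user or group creation tool."""
    return [
        line.strip()
        for line in scriptlets.splitlines()
        if ACCOUNT_TOOLS.search(line) and not line.lstrip().startswith("#")
    ]


def package_scriptlets(rpm_path):
    """Return the install-time scriptlets of one package file as text."""
    result = subprocess.run(
        ["rpm", "-qp", "--scripts", rpm_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Running account_commands(package_scriptlets(path)) over every package in the mirror and keeping the non-empty results would give a first estimate of how many packages add users or groups.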

Alastair is planning to get a first bootstrap install of SL6 done this week. All the "standard" packages still need to be built; he will attempt that using PkgForge.

There is an SL6 project blog which provides more details on the recent activity.

AFS automation project

The various docs for openafs were discussed. It was noted that although the online documentation is old it is still very relevant. Chris asked about getting a copy of the Distributed Services with OpenAFS book. Alastair mentioned that he has an openafs book which is quite useful, and he will bring it in for Chris.

Stephen said that he would give Chris a walk through of the LCFG openafs component and how it manages the servers. He can also help set up a simple test cell which should be helpful for learning about the various systems and will be needed for the testing of the new tools. Hopefully this can also lead to the development of more extensive local documentation.

Miscellaneous Development

Practical submission update
Chris has now completed the necessary changes for the practical submission tool. It is installed alongside the old version in the dice-submit package and is now ready for testing. Stephen suggested writing a simple set of scripts for testing the new tool to check that it works in the same way as the previous version (except for the intentional changes). This would give us some confidence that it all works correctly and would make future development (including the potential rewrite) much safer and easier. Chris said he needed an SL5 machine for testing; Stephen suggested prague, which is no longer needed for pkgforge.
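
One possible shape for such test scripts is sketched below: run the old and new tools with identical arguments and input, then compare exit status and output. The helper names are hypothetical, and the real command lines for the two versions of the tool would need to be filled in.

```python
import subprocess


def run_tool(command, args, stdin_text=""):
    """Run one tool and capture what a user would observe."""
    result = subprocess.run(
        list(command) + list(args),
        input=stdin_text, capture_output=True, text=True,
    )
    return result.returncode, result.stdout, result.stderr


def same_behaviour(old_command, new_command, args, stdin_text=""):
    """True if both tools give the same exit status, stdout and stderr."""
    old = run_tool(old_command, args, stdin_text)
    new = run_tool(new_command, args, stdin_text)
    return old == new
```

Cases which cover the intentional changes would be expected to differ, so they would need to be listed separately and checked against the new behaviour rather than against the old tool.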

LCFG apacheconf
Stephen has fixed the problem with the LCFG apacheconf nagios translator. When there were multiple virtualhosts on a machine the wrong IP address was being selected for checking. Of the 38 hosts using the apacheconf component, 5 were affected. All the profiles were checked to ensure this change did not have any adverse effects.

LCFG slave tests
Chris ran some test rebuilds of the LCFG config data to see if moving it all into RAM would help. This was done on fantoosh, which is an HP DL180 with a RAID controller and fast disks. When it was all done on disk, as normal, the total rebuild took 1 hour 11 minutes. A rebuild done entirely in memory took 1 hour 9 minutes. We had expected using memory to give us a big boost. Possibly the RAID controller is helping out here: given that we are writing lots of small files, it might be as quick to write into the RAID cache as to write into memory. We need to compare these times with those from the current LCFG slave servers. Stephen needs to reboot them anyway, so he will clear the caches and force a total rebuild at the same time. He will do that late one evening.
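
To put the two timings in proportion, the arithmetic can be written out as follows. The figures are the ones recorded above; the function name is just for illustration.

```python
def saving(disk_minutes, memory_minutes):
    """Return the minutes saved and the saving as a percentage of the disk time."""
    saved = disk_minutes - memory_minutes
    return saved, 100.0 * saved / disk_minutes


disk = 1 * 60 + 11    # 1 hour 11 minutes on disk
memory = 1 * 60 + 9   # 1 hour 9 minutes in memory
saved, percent = saving(disk, memory)
print(f"{saved} minutes saved ({percent:.1f}%)")  # prints: 2 minutes saved (2.8%)
```

A saving of under 3% suggests that, on this hardware at least, disk writes are not the main cost of a full rebuild.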

Operational

circle moved
The MPU test server circle has been moved to AT. It has access to a volume on the satablade (which is still in the Forum) for testing fibre-channel.

satablade
Alastair took apart the satablade and reseated everything to try to solve some problems we have seen with it.

AFS cache size change
We have switched to specifying the AFS cache size as a percentage of the partition size. This change uncovered a bug with the LCFG openafs component on the x86_64 platforms related to the lack of the Perl AFS module. Hopefully, next week we will be able to make the switch to the new partition size for new installs.
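
As an illustration of what a percentage-based setting means in practice, the sketch below converts a percentage of the cache partition into a size in 1 KB blocks, the unit used in the OpenAFS cacheinfo file. It is not the LCFG openafs component's code; the function names and the 85% figure in the usage note are assumptions.

```python
import shutil


def cache_size_kb(partition_bytes, percent):
    """Return the cache size in 1 KB blocks for a percentage of the partition."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    return int(partition_bytes * percent // (100 * 1024))


def cache_size_for(path, percent):
    """Compute the cache size from the total size of the filesystem holding path."""
    return cache_size_kb(shutil.disk_usage(path).total, percent)
```

For example, cache_size_kb(10 * 1024**3, 85) gives 8912896 blocks for a 10 GiB partition at an assumed 85%.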

RAT f13_64 packages
RAT are working their way through the list of missing packages for the f13_64 platform. Iain is holding back from submitting them until they are all done to avoid introducing any conflicts.

Test VMs
We should move all the test VMs to the KB servers to reduce load and disk usage on the Forum servers. If we need the KB servers for DR purposes then we can shut down the test VMs. We will do this as we upgrade the VMs to SL6. We should advertise this "service" to the other COs.

multipath config
Thanks to the vastly improved multipath documentation on SL5.5, Stephen discovered how to silence an annoying error message about /bin/true which came up at boot-time. The default for the prio_callout option is now none instead of /bin/true.

Next Meeting

Due to various holidays, the next MPU meeting will be held on Wednesday 16th March at 10am. If there is a technical talk that day then we will have to move it to the afternoon.

This Week

  • Alastair
    • SL6 project
      • build standard packages in lcfg_sl6_lcfg.rpms
      • finish preinstall steps (eg installbase)
      • attempt install
    • arrange for latest LCFG install ISOs to be stored on DR server

  • Chris
    • practical submit project
    • Look at nvidia drivers and RT#52016 (on kelvin or ord)

  • Stephen
    • PkgForge docs
    • SL6
      • tidy base perl modules list
      • check LCFG headers for F13 macros
      • scan SL6 packages for useradd in scripts (for building lcfg-defetc)

-- AlastairScobie - 01 Mar 2011

Topic revision: r4 - 09 Mar 2011 - 16:54:51 - ChrisCooke
 