MPU Meeting Tuesday 28th July 2009
Power Management Project
Not much has happened recently. Now that SL5.3 is installed on most of the student lab machines, Chris will be able to start more widespread testing. The new HP DC7900 also needs to be tested.
rpmsubmit
Alastair has realised that the packages need to be accessible in other ways than via the filesystem and over http with the help of the Apache mod_waklog module. In particular, we need to give IS (and possibly SEE) rsync access to the package repository. For now this will be done with an IP-based ACL for the machine running the rsync daemon. There will be some changes to the repository layout, and we need to raise this issue with Kenny at the next LCFG Deployers Meeting.
We need to find out if any external access via http to the rpm master is still required.
Alastair has started writing up some notes regarding the new package repository in the DICE wiki at MPUPackageRepository.
AFS Component
The test fileserver, mermaid, has been configured using the new component. Simon has been using the machine for some performance testing.
TIBs
Chris has repackaged the tibs RPM so that it owns all the files it installs. He had some problems with it overwriting configuration and state files managed by the LCFG component. Stephen suggested using the %config(noreplace) directive in the %files section of the specfile.
Chris has been blogging about his progress.
LCFG Server Refactoring
Not much has happened with the development of the test suite. Simon has set up git and gerrit so that they are ready for the LCFG server code and local dependencies to be imported.
Miscellaneous Development
- SL5.3
- There were a few problems with the upgrade to SL5.3, mostly related to those machines which had to be held on SL5.2. There were some package conflicts related to an update of openssh and the kernel. There was also a problem with drookit, but it's not clear exactly what happened; Stephen needs to do some further investigation.
- lcfg-om
- Stephen found some problems with the new LCFG::Om::Environment::NewAFSPAG module he was testing on his desktop. It didn't cope with situations in which the AFS client was not yet started, such as at boot time.
Operational
- lcfg-cron
- The problem with the timestamp not being updated in the manual method was fixed. A separate problem with the way that the AUTO random time was seeded from the hostname was also resolved; the fix for that still needs to be rolled out.
- nfslock
- Stephen has now changed the boot priority so that this starts after the LCFG client and that has gone into the stable release.
- VMWare performance issues
- We experienced some terrible performance from the VMWare ESX server during the SL5.3 upgrade of the hosted servers. It appears that the evo arrays are just not fast enough, although some of this may be caused by them not having the optimal configuration.
- New time category
- It was agreed that we would have a new time-monitoring category for work associated with requests from users. Alastair will add a note to the MPUTimeMonitoring wiki page.
- Problem with ls
- Julian Bradfield has found a bug in the updated version of ls in SL5.3; see RT#43055 for details.
- Server moves
- Chris and Carol have finished moving all the MPU servers which do not need to be in a rack with fibre-channel access.
- updaterpms lockups
- We need to look at why updaterpms is not being run for long periods on some machines. Possibly we could add some nagios monitoring to check the last time the machine successfully carried out a software update. That information is in the LCFG server status pages.
- multipath & nagios
- This is now ready to go; Alastair will announce it to COs.
- cosign v3
- Stephen found some problems with the LCFG server Apache config and the way it used a block which conflicted with cosign v3 and the new validation block.
- diydice move
- To simplify the configuration of dresden we plan to move diydice to a new server. Chris will do that this week.
- Forum tracker
- We need to move the Forum Tracker RT service to cosign v3.
- Install CD default
- We should change the default drive from hdc to sr0.
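As a rough illustration of the updaterpms monitoring idea raised above, the following is a minimal sketch of a nagios-style freshness check. It assumes the time of the last successful software update has already been obtained somehow (for example from the LCFG server status pages; how to fetch it is not covered here). The function name check_last_update and the two thresholds are hypothetical and chosen only for illustration; the exit codes follow the usual nagios plugin convention.

```python
from datetime import datetime, timedelta

# Conventional nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_last_update(last_success, now,
                      warn_after=timedelta(days=2),
                      crit_after=timedelta(days=7)):
    """Return (exit_code, message) for the age of the last successful update.

    last_success: datetime of the last successful update, or None if unknown.
    warn_after / crit_after: hypothetical thresholds, not agreed values.
    """
    if last_success is None:
        return UNKNOWN, "UNKNOWN: no successful update recorded"

    age = now - last_success
    if age >= crit_after:
        return CRITICAL, f"CRITICAL: last successful update {age.days} days ago"
    if age >= warn_after:
        return WARNING, f"WARNING: last successful update {age.days} days ago"
    return OK, "OK: software updates are recent"


if __name__ == "__main__":
    # Example run with a made-up timestamp three days in the past.
    current = datetime.now()
    code, message = check_last_update(current - timedelta(days=3), current)
    print(message)
    raise SystemExit(code)
```

A real check would need to decide where the timestamp comes from and what thresholds suit machines that are legitimately switched off for long periods, such as lab machines using the sleep configuration.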
This Week
Alastair will:
- Talk to Carol about the LCFG level work.
- Remind Craig about moving hawthorn to KB.
- Work on rpmsubmit project.
- Package scli
- rpmsubmit
- Tidy LCFG/inf level
Carol will:
- Set up LCFG/inf level VM to monitor LCFG level.
Chris will:
- Move diydice to a new virtual server
- TIBS component
- Forum tracker and cosign v3
- HP DC7900 sleep testing
- sleep in student labs
Stephen will:
- LCFG server refactoring
- Update PXE installer to SL5.3
- Push out lcfg-cron update
-- StephenQuinney - 31 Jul 2009
Topic revision: r3 - 05 Aug 2009 - 14:34:50 - ChrisCooke