MPU Meeting Friday 18th November 2011
AFS automation project
Nothing happened.
LCFG Server Refactoring
Nothing happened.
Install Scripts
Go for sign-off.
Wake-On-Lan
There is now a wake.inf web service through which users can wake their allocated machines. Some packaging of scripts is required before it is complete. Chris demonstrated waking up a desktop machine named kempt.
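For background, Wake-on-LAN works by broadcasting a "magic packet": six 0xFF bytes followed by the target machine's MAC address repeated 16 times. The following is a minimal Python sketch of that mechanism only. It is not the wake.inf implementation, and the function names, the example MAC address and the choice of UDP port 9 are assumptions made for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    # Accept the common "aa:bb:cc:dd:ee:ff" and "aa-bb-cc-dd-ee-ff" forms.
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    # Six 0xFF bytes, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

A call such as send_magic_packet("00:11:22:33:44:55") (a made-up address) would broadcast the packet on the local subnet. The target machine must have Wake-on-LAN enabled in its BIOS and network card for this to have any effect.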
Simple KVM Service
Lots more documentation has been added; it is all at SimpleKVMDocs on the DICE wiki. There is now an alpha service running on circle. Graham has played with it and is very pleased with the performance. It is clear that using LVM volumes is significantly faster than storing the VMs in an ext3 filesystem. Any further deployment remains blocked waiting for a fix for the Fibre Channel problems. Chris and Stephen should try following the instructions to create new VMs.
Server Upgrades
Nothing happened.
Project Planning
We need to get project proposals for 2012 T1 submitted by Monday 28th November. We will hold an MPU planning meeting next Wednesday at 2pm.
Miscellaneous Development
- hwmon
- This is now quieter and generates much less rootmail.
- LCFG mock
- Stephen fixed a few bugs in the LCFG mock component. In particular, he added support for changing the configuration file template using a resource. This will be needed to support the AFS build server.
- LCFGRUN
- Stephen modified the LCFG build tools and the ngeneric and sysinfo components so that there is a new file-system location macro (and associated sysinfo resource) for LCFG component .run files. Currently this is the same as it has always been (i.e. /var/lcfg/tmp) but at the next platform update this will change to a separate location.
Operational
- Server compromise
- Stephen is still working on finishing off the report into the compromise of the SSH servers.
- Fibre-channel
- It looks like power-cycling the SAN storage has not fixed the problems for SL6 fibre-channel. It might be related to Registered State Change Notification (RSCN). QLogic fibre switches have a feature known as "I/O StreamGuard" which can be used to manage RSCN. It is set on a per-port basis, and currently all ports are set to the default value. It appears that this should be disabled for targets and enabled for clients. The switches should get it right with the default setting, but it is not clear that this works, so we are trying explicit settings. We need to plan downtime for the KB satabeast so we can change the settings (just in case something goes wrong). We could also use zoning to split out the satabeast SAN storage and its clients, as RSCN does not cross zone boundaries.
- student.ssh
- Stephen is preparing to move the student SSH service to dunlin, which is a Dell PE860 inherited from the Infrastructure Unit. It is now located in the AT server room and has been configured and installed with SL6_64.
This Week
- Alastair
- Arrange for someone to take figgy to KB for SAN testing. Check with Craig first to see when the satabeast is free for SAN testing.
- Chase Craig re scheduling the RSCN change for the satabeast at KB
- Consider projects for T1
- Investigate reboots at install time: do we have a reboot after the big RPM run? Check for differences between SL5, SL6 and F13, as this could be the cause of DNS problems. SL6 does reboot after the big RPM run.
- Pass BIOS settings to USU
- Consider focus for Perl learning
- Finish KVM documentation and macros for bridge creation
- Publish redhat-autoswap algorithm on LCFG wiki
- Investigate updaterpms timeout issues (wrt AFS hangs)
- Finish work on installroot re multiple interfaces and timeouts (calling udhcpc correctly)
- Chris
- Try KVM service on circle
- Consider projects for T1
- Finish WakeOnLAN project
- Stephen
- Try KVM service on circle
- Consider projects for T1
- Finish security report
- LCFG server deployment
- Student SSH server (dunlin)
- Ensure SSH services are available on guest wireless networks
--
AlastairScobie - 18 Nov 2011