Final Report for Project 358 - SL7 Server Upgrade Project - Services Unit

Introduction

This project was an effort bucket for the work done in migrating the servers managed by the Services Unit to SL7. The master ticket covering the services managed by the Unit and their dependencies can be found here. All told, the project took a little over two years to complete.

Scope of Task

The services identified as belonging to the Unit, and the number of individual machines providing each service, were as follows:

Service                 No. of machines providing service   Notes
OpenAFS                 19                                  (1)
Mail                    1
Plone                   1
Wordpress               12
Samba                   1
Bugzilla                1
CVS/Subversion          1
Git/Gerrit              1
AFS Git                 1
Groups.inf              1
iFile                   1
Jabber                  1
Media                   2
Password Portal         1
Print Server            3
Misc RFE                1
Room Booking System     1
TiBS Backup Server      1
Twiki Server            2
Virtual Mail Relay      1                                   (2)
Web Servers             17

Total services: 21. Total servers: 70.

Notes

  • (1) Includes database servers
  • (2) Also Authenticated SMTP server

Discussion

Some of these are not as daunting as they appear. Although AFS servers are the largest category of service, one AFS server is much like another. Migrating data between servers to avoid downtime for end users is time-consuming, but it can be done in the background without involving those users, which allows the server upgrades to proceed methodically. Conversely, the next two largest categories, Web and Wordpress, are just as bad as they appear: their configuration varies from machine to machine, and they often host multiple sites, requiring sustained negotiation with many end users to identify suitable times for upgrades. It is no surprise that the migrations of these categories were the last to be completed.

One great help in the migration process is that most of the Unit's servers are virtual machines. It is therefore possible in many cases to bring up an SL7 version of the service in parallel, minimising the downtime required for the upgrade.

Time taken

This project ran for just over two years. The effort recorded was as follows:

Period     Effort (Weeks)
T1 2016    1.7
T2 2016    5
T3 2016    9.2
T1 2017    9.3
T2 2017    12
T3 2017    4.0
T1 2018    0.2
Total      41.4

The loss of one member of the Unit during this period undoubtedly lengthened the project, since the servers allocated to that person had to be redistributed amongst the remaining members of the Unit. The main reason for the project extending past its original deadline, however, was that the onus of migrating the web services was placed on one person's shoulders. In future OS moves, care will have to be taken to ensure that this task is spread more evenly across the available resources.

-- CraigStrachan - 16 Aug 2018

Topic revision: r1 - 16 Aug 2018 - 09:45:25 - CraigStrachan
 