Final Project Report - Project 2 - AFS file system

Introduction

The introduction of a new secure distributed file system to replace NFSv3 in the DICE infrastructure is one of the most significant projects the computing staff of the School of Informatics has undertaken since the introduction of DICE itself. Given that the original paper reviewing the case for replacing NFSv3 and surveying the possible replacements was published on the 1st of February 2005, it also ranks as one of the longest-running computing projects to date.

Effort

Work on this project began well before the restructuring of computing staff into Units and the consequent recording of effort spent on projects. However, it is possible to say that since the reporting of development time figures began in Q2 2007, at least 940 hours of effort have been spent on the project. Given that this figure does not include the first two years of the project, when much of the investigative and initial development effort took place, and that effort spent by members of the User Support unit in moving data to AFS has only been reported for the last year or so, the true figure is certainly far higher. It should be noted that although this project has been managed within the Services Unit, it has required contributions from members of every Unit. Particular appreciation should be expressed to the CSOs who have laboured long and hard on the task of moving existing data across to the new file system. It is entirely due to their efforts that this project has not taken even longer to complete.

Duration

It may be asked just why this project has taken so long. Rather than being due to technical issues (of which there have been remarkably few), this is a consequence of the approach taken to migrating existing users to the new file system. Deciding not to migrate existing undergraduate accounts to the new file system meant that the transition period would inevitably stretch over a minimum of 4 years. In addition, the decision was taken to migrate existing staff and postgraduate home directories to the new file system rather than just providing a new empty AFS home directory for each user and leaving them to move their data across themselves. This greatly increased the effort required to move the user base to the new file system but made the transition much less disruptive for the individual user, contributing to the relative smoothness of the overall process. The effort spent in identifying issues which might affect users, such as the introduction of permission management via ACLs, and in notifying users who would potentially be affected by these changes, was also time well spent.

Achievements

We are now at the stage where the vast majority of commodity data within the School is served from a secure and (despite recent events which were not directly related to OpenAFS) reliable file system. This allows the School's users to access their data from any machine with a network connection, anywhere in the world, a facility which few other schools (or indeed universities) can match.

Criticisms

It was realised at an early stage in the project that user education would be an important issue, and some effort was spent in producing introductory documentation tailored to the local environment which met the needs of the majority of local users. Much effort was also spent in identifying potential issues with the new file system, such as long-running jobs, and producing technical solutions to these problems. These solutions were, by and large, of a technical nature themselves. Where we could have done better is in supplying support and documentation to those users struggling to cope with concepts such as Kerberos ticket lifetime, which had previously been hidden from them. We might also have made more effort to produce solutions tailored to a less technically savvy user base. Both of these issues are now being addressed within the remits of other development projects.

Future developments

There is a great deal of scope for automating the management of the OpenAFS file system and reducing the effort spent on such things as load balancing and maximising disk space use. These matters will be investigated in a future development project.

-- CraigStrachan - 31 Mar 2010
