Project 268 - AFS on ECDF/Datastore


The purpose of this project was to investigate the possibility of using AFS to access the 500GB of datastore file-space allocated to each member of active research staff.

Though this is technically possible, and has been demonstrated working, my proposal is that we no longer pursue this goal, and instead try to use the datastore service as is.


This project was created to meet goal 2 of the 2013 computing plan: "Continued consideration of appropriate use of central data storage facilities, specifically investigate AFS over Eddie storage".

As time has moved on, it is now goal 1 of the 2016 computing plan, and "Eddie storage" is now "datastore".

The datastore service provides access to its space via NFS, CIFS or SSHFS. It was felt that being able to access it via AFS would be more beneficial for our staff, as it would integrate more seamlessly with our existing AFS filing system.

Options considered

Essentially there were several terabytes of storage on the datastore hardware destined for use by our staff. Several options were considered for making use of that space over AFS.

1. IS provide the AFS service

IS set up a datastore AFS cell and, via cross-realm trust, we'd just access the space via a path like /afs/ There'd be a significant learning curve for IS to set up and manage the service, but we'd offer to help.

2. We access the underlying GPFS storage

The underlying storage that datastore uses is GPFS. We could obtain the necessary licences and run AFS as we do on our own servers. Our machine(s) would be GPFS clients, re-exporting the GPFS space as the usual AFS partitions/volumes.

The main problem with this is that IS would need root access to our machines (to manage the GPFS) and, assuming the servers were part of our default cell, could then access any of our existing data in AFS. Similarly, as a GPFS client, we would have the ability to access or destroy everyone else's data in their GPFS.

They could potentially set up a separate GPFS cluster, just for us, so we'd be isolated from accidentally trashing other people's data. This would require extra effort on IS' side.

3. We gain raw access to the disk spindles

As a real fallback position, we considered co-locating some of our DICE-managed hardware at the datastore location, and just getting raw access to the underlying disks that would notionally host the space our users were entitled to.

The main disadvantage of this is that we wouldn't benefit from datastore's backups, redundancy or disaster recovery plans. It might also be that our space doesn't fall into a convenient number of disks/arrays. Generally, it isn't very scalable or flexible if and when our allocation changes.

AFS on Datastore

As I'd done some initial testing of running AFS on our own local GPFS cluster, the first thing we considered (despite the reservations about root access etc) was to try to access a test datastore GPFS cluster directly from our existing test AFS GPFS server (crocotta). However, we quickly ran into problems between our version of GPFS (3.2) and datastore's version (3.5).

While trying to obtain a copy of GPFS 3.5 for our use, I liaised with Jan (my datastore contact) about setting up a VM at their end and creating an AFS file server. I created a new cell here and gave the cell keytab to Jan to install on his file server. This cell still authenticated against our INF.ED.AC.UK Kerberos realm.

It was only a small trial cell, with one VM here as an AFS DB server, another as an AFS file server, and the other VM file server up at datastore.

neilb> bos listhosts dsone -cell
Cell name is
    Host 1 is
neilb> vos listaddrs -cell
neilb> tokens

Tokens held by the Cache Manager:

User's (AFS ID 26289) tokens for [Expires Apr 30 07:21]
User's (AFS ID 26289) tokens for [Expires Apr 30 07:21]
   --End of list--
neilb> ls /afs/
ascobie/  jwinter/  neilb/  neilb2/

This actually worked quite well, and pretty smoothly.

Though this isn't the scenario we envisaged, ie IS running the datastore cell, if we chose to go down this route, it seemed a reasonable compromise.

More work would be required to turn it into a full service, such as managing the PTS DB, though prometheus could do that as it does for our existing cell. We'd also need to come up with some way of creating and managing the space on the IS-based server(s). IS would also need to be able to monitor our usage, to make sure we weren't overusing our allocation of space.


We were concerned that the performance of AFS over GPFS and EdLAN could be an issue. I tried bonnie++ and iozone, and recorded some results on the project wiki page, Project268AFSonECDF. However, I was never convinced the results actually told us much, because I had to run the tests on differing machines.

Also when considering the performance, I was more concerned with the real world performance, ie what a user might experience, rather than a true figure for whatever network read or write might actually result. The local AFS cache tends to be a complicating factor.

So I wrote a simple script that looped through various standard file operations. It created files of various sizes, wrote content to those files, read random blocks back from them and checked they contained the correct data, and then deleted the files. The whole thing was then looped hundreds of times to get some averages.
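A minimal sketch of what such a test loop might look like, in Python. This is not the original script, which isn't reproduced on this page. The names `pattern` and `run_benchmark`, the file sizes, the block size and the loop counts are all assumptions chosen for illustration.

```python
import os
import random
import tempfile
import time
from collections import defaultdict

BLOCK = 4096  # assumed block size for the random reads


def pattern(size):
    """Deterministic content, so any block read back can be verified."""
    unit = bytes(range(256))
    return (unit * (size // 256 + 1))[:size]


def run_benchmark(target_dir, sizes=(4096, 65536, 1048576), loops=100, reads=8):
    """Time write, random-read-and-verify and delete; return mean seconds per loop."""
    totals = defaultdict(float)
    for i in range(loops):
        for size in sizes:
            path = os.path.join(target_dir, "bench-%d-%d.dat" % (size, i))
            data = pattern(size)

            # Create the file and write its content, forcing it out to the server.
            start = time.perf_counter()
            with open(path, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            totals["write", size] += time.perf_counter() - start

            # Read random blocks and check them against what was written.
            start = time.perf_counter()
            with open(path, "rb") as f:
                for _ in range(reads):
                    offset = random.randrange(0, max(1, size - BLOCK + 1))
                    f.seek(offset)
                    if f.read(BLOCK) != data[offset:offset + BLOCK]:
                        raise IOError("data mismatch in %s at %d" % (path, offset))
            totals["read", size] += time.perf_counter() - start

            start = time.perf_counter()
            os.remove(path)
            totals["delete", size] += time.perf_counter() - start
    return {key: total / loops for key, total in totals.items()}


if __name__ == "__main__":
    # Demonstration against a local temporary directory only.
    with tempfile.TemporaryDirectory() as tmp:
        for (op, size), mean in sorted(run_benchmark(tmp, loops=10).items()):
            print("%-6s %8d bytes  %.6f s" % (op, size, mean))
```

Pointing `target_dir` at a directory in local AFS, the datastore cell or NFS would time the same operations on each. Reads that follow straight after a write are likely to be served from the local cache, so the read figures from a loop like this say more about what a user experiences than about raw network throughput.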

The table of those figures is also on the project page, Project268AFSonECDF#nbperf. This simple test seemed to show that the datastore space was about 30% slower than our local AFS, but still 3 times faster than our NFS space. This was just using my basic test VMs for datastore; presumably real hardware would make a bit of an improvement. Either way, it doesn't appear that performance would be much of an issue.


It became apparent, as time passed, that there would be several disadvantages to eschewing the native datastore service.

  1. The native datastore service provides three mechanisms (CIFS, NFS, SSHFS) to access the datastore space, but they all give a consistent view of your files. Files created and accessed via AFS would not also be accessible via the existing mechanisms, and vice versa. So if there were cross-collaboration between schools, either our researchers would have to use CIFS, NFS, etc anyway, or we'd have to ask the other researchers to install AFS clients to access the common data.

  2. Users are already discovering for themselves the IS-documented way to access their space, and this works for various platforms. As we know, the AFS client on macOS and Windows is becoming more problematic.

  3. The native datastore can be connected to DataSync, the Dropbox-a-like service run by IS. Again, this is something we'd not benefit from if we go for AFS.

  4. The native datastore also provides a snapshotted file system, so users can go back to various points in time. A bit like our "Yesterday", but more flexible. We would not be able to take advantage of this if we go down the AFS route.

  5. Though datastore's disaster recovery would still work for an AFS solution, using their tape backups as an "undelete" would be hard. Their backup software does not know about AFS, so it would be backing up the raw partition data. Restoring a single file would require the partition containing the volume to be restored from tape, and then the required volume to be mounted (in a non-clashing way) so that the user could access their deleted file. If we were to use AFS on datastore, we'd probably want to come up with something a bit more flexible, maybe even using our own tibs backup, but would we have the tape space?


Though I'm confident we could get AFS working on datastore, and the day to day access of the files would be fine, I believe the various disadvantages outweigh the benefits of proceeding with this project.

Instead, we should try to use the IS datastore service as it is, making it as simple as possible for DICE users to access their space from DICE. For example, users need to do something like:

mkdir /tmp/neilb-datastore
sshfs -o intr,large_read,auto_cache,workaround=all -oPort=22222 /tmp/neilb-datastore
# give AD password when prompted

Perhaps all we need to do is wrap that up in a local script, and/or speak to IS to see if we can use our existing Kerberos credentials to authenticate.
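One way such a wrapper might look, sketched in Python. The sshfs options and port are copied from the command above. The remote specification is deliberately left as a parameter, and the names `default_mountpoint`, `build_sshfs_command` and `mount_datastore` are hypothetical. It still prompts for the AD password, since any Kerberos-based authentication would depend on what IS can offer.

```python
import getpass
import os
import subprocess

# Options and port taken from the manual command above.
SSHFS_OPTIONS = "intr,large_read,auto_cache,workaround=all"
SSHFS_PORT = 22222


def default_mountpoint(user=None):
    """Assumed convention: /tmp/<user>-datastore, as in the manual example."""
    return "/tmp/%s-datastore" % (user or getpass.getuser())


def build_sshfs_command(remote, mountpoint):
    """Return the sshfs argument list; 'remote' is the remote spec given by IS."""
    return ["sshfs", "-o", SSHFS_OPTIONS, "-oPort=%d" % SSHFS_PORT,
            remote, mountpoint]


def mount_datastore(remote, mountpoint=None):
    """Create the mount point if needed, then run sshfs (prompts for the AD password)."""
    mountpoint = mountpoint or default_mountpoint()
    # Mode 0700 is an assumption, made because /tmp is shared between users.
    os.makedirs(mountpoint, mode=0o700, exist_ok=True)
    return subprocess.call(build_sshfs_command(remote, mountpoint))
```

Calling `mount_datastore(remote)` with the remote specification from the IS documentation would then replace the two manual steps, returning the exit status of sshfs.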

We also get the benefit of all of IS's documentation, support, and expertise on an already bedded-in system.


This dragged on for longer than it should have. There's no reason, other than me not prioritising it enough.

Period   Days
2014 T1  1.06
2014 T2  5.49
2014 T3  1.21
2015 T1  1.34
2015 T2  1.54
2015 T3  0.19
2016 T1  2.57
Total    13.4

-- NeilBrown - 02 May 2016

Topic revision: r2 - 20 Sep 2016 - 10:43:07 - NeilBrown