Account Tidying: AFS/homepages/groupspace

Project description

Details of this project can be found here:

Related to:

From project 349:

For accounts we want to delete completely, we need to consider the following:

Files owned by that user outwith home directories - e.g. group space, home pages. If there are other places, we should note them explicitly. ACLs including these users.

Initial thoughts


The nature of group space means that when a user leaves, you can't just delete the group space they requested/own, as others may still be using it. For this reason the deletion of group space can't be fully automated, but we can try to make a decision based on the information we hold.

An AFS group space audit was carried out towards the end of last year, so this data could be used to figure out which AFS group spaces we might want to delete when users leave. One of the first things to consider is how we keep this information accurate and up to date.

There are around 850 group space volumes, and we have a record of who the "owner" of each volume is, so the obvious thing to do would be to periodically (annually?) email the owner, asking them to check that the information is correct, and notify us if the space can be deleted/archived, or any of the information held is inaccurate.

It would be nice to automate this process.
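As a rough illustration of the periodic check, the sketch below picks out volumes whose details haven't been confirmed for a year and groups them by owner, ready for a reminder email. The record layout, the volume and user names, and the wording of the reminder are all made up for the example; actually sending the mail is left out.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical audit rows: (volume name, owner username, date last confirmed).
AUDIT = [
    ("group.alpha", "alice", date(2019, 1, 10)),
    ("group.beta", "bob", date(2019, 11, 20)),
    ("group.gamma", "alice", date(2018, 6, 1)),
]


def volumes_due_for_review(rows, today, max_age=timedelta(days=365)):
    """Return {owner: [volumes]} for rows last confirmed at least max_age ago."""
    due = defaultdict(list)
    for volume, owner, last_checked in rows:
        if today - last_checked >= max_age:
            due[owner].append(volume)
    return dict(due)


def reminder_text(owner, volumes):
    """Build the body of a reminder; sending it is outside this sketch."""
    listing = "\n".join(f"  - {v}" for v in sorted(volumes))
    return (
        f"Dear {owner},\n"
        "Please confirm that the details of these group spaces are still correct:\n"
        f"{listing}\n"
    )
```

Calling volumes_due_for_review(AUDIT, date.today()) would give one entry per owner, so each owner gets a single email listing all of their groups rather than one email per volume.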

My first thoughts are that SharePoint could facilitate this. The AFS audit, currently held in an Excel spreadsheet, can easily be made into a SharePoint list. From there a flow could be created that would annually email the "owner", with a link to the list. When they visit the list page they would be presented only with the rows containing groups owned by them, and they would have permission to update specific fields: whether the information is correct, whether the space is still needed, and if not, whether the data should be archived or deleted. A form could be created using Power Apps to make this easier for whoever is making the changes.

When it comes to creating new AFS space, at the moment all tickets are passed to the Services Unit and we create the space and update the spreadsheet with the details. This is partly because the audit was ongoing, and partly because we were bringing a lot of new AFS space online, so it wasn't obvious which AFS partitions were safe to use. Going forward, AFS group space creation could be passed back to frontline support, and new volumes could be added to the SharePoint list when they're created.

sweb and NFS space could also be added to this list.

Things to consider: is SharePoint capable of taking in data from outside SharePoint, e.g. a list of accounts that are about to be deleted, so that the owner can be notified?

As things stand, there are a lot of AFS groups that don't have an active account associated with them. At what point should the data be deleted? Do we ever archive group space and if so, what would the process be for this?


Previously (circa 2013) homepages were archived in /afs/; more recently we've used a script to automatically delete empty homepages associated with old accounts, but any that contain content need to be deleted manually. It seems likely that in the future the existing script will be used to simply delete all old homepages automatically.


With regard to removing a user from ACLs, we have a script that trawls AFS group space and collects ACL info, but it's run manually only when needed, and takes a long time to complete. A better option might be to run the command volscan (/usr/afs/bin/volscan -type rw -find mount acl) on each AFS server regularly and aggregate the results. When a user needs to be removed, a search of the volscan results would show which ACLs need to be updated. We would assume that we would remove old users from ACLs, but what to do with their files needs to be decided.
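A minimal sketch of the "search the volscan results" step follows. It makes no assumption about the volscan output format and simply treats each log line as text, matching the username as a whole word so that a short name doesn't match inside a longer one. The one-log-file-per-server layout and the *.log naming are assumptions for the example.

```python
import re
from pathlib import Path


def find_mentions(lines, username):
    """Return (line number, line) pairs that mention username as a whole word."""
    pattern = re.compile(rf"(?<!\w){re.escape(username)}(?!\w)")
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def search_logs(log_dir, username):
    """Search every *.log file in log_dir (assumed: one aggregated file per server)."""
    hits = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(errors="replace") as handle:
            found = find_mentions(handle, username)
        if found:
            hits[path.name] = found
    return hits
```

The result maps each log file to the matching lines, which gives a list of places to inspect before any ACL is actually changed.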

Dev Meeting 2020-01-15

A Project Starting talk was given.
Trickier to establish whether to delete group space. Created a spreadsheet on
SharePoint with all 850 groups and owner/contact details; would be a pain to
maintain ourselves, so ideally get users to do this. Could automate within
SharePoint, example shown as AFS Audit List. Can use flows based around a
check date column triggered annually which emails users to ask them to
update details. Users will just see their own groups. User changes can
trigger emails to us as well, such as when group space is no longer needed.
We should tie this in with the account closure email, or use data directly
from Prometheus. Alternatively could also use PIP to achieve much the same
thing, would be easier for users and would probably integrate better with
local scripts and Prometheus. Ross to discuss with Tim. The script for
creating group space could also add new entries to a database of group
space. Also need to handle homepages - however these are already automated
to an extent (by email notification), there is a pre-zap stage which moves
the pages so they are not served (we could automate this). Need to change
to no grace entitlement so home pages are removed (or moved out of the way)
immediately - AS will check with web strategy. Also need to do things like
cluster data. At the moment this is similarly semi-automated using reports;
in this case ownership is changed to root automatically on entitlement.
Finally, ACL data - we can use volscan on server to create lists of ACLs for
groups - we could use this in order to help manage the process of removing
people from ACLs once their account has gone. Some of these changes don't
need to happen immediately, process could queue and batch for example.


AFS space/quota

An audit (SharePoint) of AFS group space was carried out. Of the 852 group volumes (allocated 116TB of quota), 198 (10.7TB) weren't accessible by any current DICE account. These volumes haven't been deleted yet, but access was completely revoked by removing references to old accounts from each group's ACL. The date that this happened was noted in the spreadsheet.

Many old, static and not-paid-for volumes had a much larger quota than was necessary. By reducing this unused quota, 2.17TB of space was made available. Again, the date this change was made is noted in the spreadsheet.



There are currently 4855 homepages directories, of which 1068 have no primary roles, but only two of these are "post-grace". This is because we already run a script /disk/homewikipages/scripts/checkhomepages 3 times a day that checks whether any web/cgi directory needs to be created or removed. If a directory belongs to an account that no longer has the homepages/html capability, and it's empty, then it's deleted automatically. If the directory contains data, then it's manually copied to an archive, where it is deleted after a couple of months.
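The decision described above can be sketched as follows. This is not the checkhomepages script itself, only an illustration of the rule: the function name and the has_capability flag (standing in for a lookup of the homepages/html capability) are hypothetical.

```python
from pathlib import Path


def homepage_action(directory, has_capability):
    """Decide what to do with one homepages directory.

    has_capability is a stand-in for checking whether the owning account
    still holds the homepages/html capability.
    """
    if has_capability:
        return "keep"
    if not any(Path(directory).iterdir()):
        # Empty directory for an account without the capability: safe to delete.
        return "delete"
    # Directory has content: leave it for the manual archive step.
    return "archive-manually"
```

Returning a label rather than deleting anything directly keeps the sketch safe to run and makes it easy to produce a report of pending actions first.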

Update 11/5/2020

volscan is now running daily on all the AFS servers that hold RW volumes. This logs where in the file system each volume is mounted, as well as all ACLs. I've created a script that searches the logs for a given string. The next step is to use the logs to find ACLs that list old accounts and remove those entries.

I've written an AFS group space creation script. As well as creating the space, it logs the details in a postgres db. The next steps are to refine the script and to get all the information gathered from the space audit into the postgres db. I also need to investigate integrating with PIP.
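To show the logging step in isolation, here is a small sketch that records a newly created group space in a database. It uses Python's built-in sqlite3 as a stand-in for postgres so that it runs without a server, and the table name, columns and example values are invented for illustration rather than taken from the real script.

```python
import sqlite3
from datetime import date

# Hypothetical schema; the real postgres table may look quite different.
SCHEMA = """
CREATE TABLE IF NOT EXISTS group_space (
    volume   TEXT PRIMARY KEY,
    owner    TEXT NOT NULL,
    quota_kb INTEGER NOT NULL,
    created  TEXT NOT NULL
)
"""


def record_group_space(conn, volume, owner, quota_kb, created=None):
    """Insert one row describing a newly created group space."""
    created = created or date.today()
    with conn:  # commits on success, rolls back on error
        conn.execute(SCHEMA)
        conn.execute(
            "INSERT INTO group_space (volume, owner, quota_kb, created) "
            "VALUES (?, ?, ?, ?)",
            (volume, owner, quota_kb, created.isoformat()),
        )
```

Making the volume name the primary key means a second attempt to record the same volume fails loudly instead of silently creating a duplicate entry.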

-- RossArmstrong - 18 Feb 2020

Topic revision: r5 - 11 May 2020 - 08:26:27 - RossArmstrong