Development Project 187 - Prometheus: migration to new database feeds - Report

Initial Description (from devproj)

The school database will be changing the feeds it gives us. There'll be some Prometheus work to cope. Initially we will be getting a 3G feed of new and current PGR cohort. Note 21/01: this work is currently in progress (as driven by rollout on 10/01/11).

Initial Plan (from devproj)

  • Write new Prometheus store/conduit for new feed
  • Testing/data integrity checking of conduit
  • Roll out to live server
  • RAT to provide combined feed of 2G/3G data
  • Adapt conduits for new feed
  • Testing of new feed
  • Rollout

Development

Initial Development

The RAT unit provided us with new database views for those people who had been moved to Theon (new and current students). The views provided were Prometheus_user_3g, Prometheus_role_3g and Prometheus_email_3g. The data provided is similar to, but not the same as, the data contained in the old 2G views. The 3G feeds introduce a UUID identifier, which Prometheus now uses. 3G people can also have more than one email address (one of which is marked as the 'default').

A Prometheus datastore and conduit were written for the new feed and put into testing quite early. This code was quite similar to the code already in existence for the 2G feeds (and indeed was subsequently refactored to share more code).

RAT decided not to provide a combined feed (as was suggested in the original plan), but instead to gradually move groups of people from 2G to 3G. We therefore needed a way of managing the behaviour of the Prometheus conduits so that they could handle potentially overlapping data (e.g. a user in both the 2G and 3G feeds). The relationship between the two Prometheus conduits was as follows (taken from the Prometheus documentation; note that 'Olympus' is the name of the LDAP directory within Prometheus):

  1. If the infdb conduit sees an entry for which it doesn't 'own' the corresponding object in Olympus, it will ignore that entry.
  2. If the infdb3G conduit sees an entry which is owned by the infdb conduit, it will gazump ownership (i.e. remove the infdb ownership and add its own).

Note that management of this issue was greatly helped by the ownership model designed into Prometheus (see https://wiki.inf.ed.ac.uk/view/DICE/PrometheusDesignDetail#Ownership for more information).
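The two rules above can be sketched in a few lines. This is an illustrative model only, not the Prometheus code: the DirectoryObject class, the should_process function and the owner tags are hypothetical names, and the sketch assumes the object already exists in Olympus and that the 3G conduit processes every entry it sees.

```python
from dataclasses import dataclass, field

INFDB = "infdb"      # hypothetical owner tag for the 2G conduit
INFDB3G = "infdb3G"  # hypothetical owner tag for the 3G conduit


@dataclass
class DirectoryObject:
    """Stand-in for an existing Olympus object; only ownership is modelled."""
    name: str
    owners: set = field(default_factory=set)


def should_process(conduit: str, obj: DirectoryObject) -> bool:
    """Apply the two rules; return True if the conduit handles the entry."""
    if conduit == INFDB:
        # Rule 1: the 2G conduit ignores entries whose object it does not own.
        return INFDB in obj.owners
    if conduit == INFDB3G:
        # Rule 2: the 3G conduit takes over ownership from the 2G conduit.
        if INFDB in obj.owners:
            obj.owners.discard(INFDB)
            obj.owners.add(INFDB3G)
        # Assumption: the 3G conduit then processes the entry.
        return True
    raise ValueError(f"unknown conduit: {conduit}")


# A user present in both feeds: once the 3G conduit has seen them,
# the 2G conduit no longer touches the object.
user = DirectoryObject("jbloggs", {INFDB})
should_process(INFDB3G, user)
handled_by_2g = should_process(INFDB, user)
```

Because the outcome depends only on recorded ownership, the result is the same whichever conduit runs first once the 3G conduit has seen the user.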

Speed/Caching

It became apparent during the testing process that the 3G conduit was significantly slower than the 2G one. This was due to the increased complexity of the database logic providing the roles information, and was exacerbated by the conduit querying the database for roles and emails separately for every user. This was fixed by adding a 'caching' attribute to the 3G conduit which, when enabled, fetches all role and email information up front and stores it in memory for the duration of the conduit run (there is a configurable timeout, after which the data is re-fetched).
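The caching approach can be illustrated with a minimal sketch. It is not the conduit's actual implementation: BulkCache, fetch_all and the injectable clock are hypothetical names chosen for the example, and the only behaviour taken from the description above is "fetch everything once, keep it in memory, re-fetch after a configurable timeout".

```python
import time
from typing import Callable, Dict, List, Optional


class BulkCache:
    """Fetch all rows in one query and serve per-user lookups from memory."""

    def __init__(
        self,
        fetch_all: Callable[[], Dict[str, List[str]]],  # hypothetical bulk query
        timeout: float,                                  # seconds before a re-fetch
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all
        self._timeout = timeout
        self._clock = clock
        self._data: Dict[str, List[str]] = {}
        self._fetched_at: Optional[float] = None

    def get(self, key: str) -> List[str]:
        now = self._clock()
        # Re-fetch on first use, or once the cached copy is older than the timeout.
        if self._fetched_at is None or now - self._fetched_at >= self._timeout:
            self._data = self._fetch_all()
            self._fetched_at = now
        return self._data.get(key, [])
```

With a cache like this the number of database round trips depends on the timeout rather than on the number of users, which is where the speed-up comes from.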

UUIDs

As mentioned earlier, UUIDs are used as the primary identifier for entities within Prometheus. The original infdb conduit generates a new UUID for each new entity it adds, whereas the 3G feed provides UUIDs directly from the database.

Accordingly, we had to decide whether Prometheus's infdb3G store should use the UUID as its primary key. We initially implemented the 3G store using the username as the primary key, in order to minimise changes (as both Prometheus and database records can be directly accessed using either UUID or username, it makes little practical difference). The one thing keying on UUID gives us is the ability for a user to change username (e.g. a student becoming a member of staff) without appearing as a different user to Prometheus. This is beneficial enough (particularly when factoring in our need for account lifecycle management) that the change to using UUID as the primary key was made.
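The effect of the key choice can be shown with a small sketch. The record layout, the index_by and diff helpers, and the sample UUID and usernames are all made up for illustration; the point is only how the same username change looks under each key.

```python
def index_by(records, key):
    """Build a store keyed on the chosen field."""
    return {record[key]: record for record in records}


def diff(old, new):
    """Return (added, removed, changed) keys between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


# Hypothetical data: one person whose username changes between two feed runs.
UUID = "11111111-2222-3333-4444-555555555555"
before = [{"uuid": UUID, "username": "s0123456"}]
after = [{"uuid": UUID, "username": "jbloggs"}]

# Keyed on username, the change looks like one user removed and another added.
by_username = diff(index_by(before, "username"), index_by(after, "username"))

# Keyed on UUID, it is the same entity with a changed attribute.
by_uuid = diff(index_by(before, "uuid"), index_by(after, "uuid"))
```

Here by_username reports an addition and a removal, while by_uuid reports a single changed entry, which is the behaviour wanted for account lifecycle management.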

Code Tests

The Prometheus code includes a comprehensive test suite. The database part of this is implemented by bringing up and taking down a small SQLite database, to simulate the school database. Unfortunately we cannot write a full set of tests for the 3G database because of the use of a field named 'default' in prometheus_email_3g - this is a reserved word in SQLite. This is one of the issues to be addressed in the future. Note that all code changes introduced in this project were heavily tested using a test server and the actual data (see below).
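The reserved-word problem is easy to reproduce outside Prometheus. The sketch below uses Python's sqlite3 module and hypothetical table and column names (only the 'default' field comes from the feed). It shows that SQLite rejects an unquoted column called default but accepts it when written as a double-quoted identifier. Whether quoting is a workable fix depends on how the conduit builds its SQL, which is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted, 'default' is parsed as the DEFAULT keyword and the statement fails.
try:
    conn.execute(
        "CREATE TABLE email_unquoted (username TEXT, email TEXT, default INTEGER)"
    )
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

# Double-quoted, it is treated as an ordinary identifier.
conn.execute(
    'CREATE TABLE email_quoted (username TEXT, email TEXT, "default" INTEGER)'
)
conn.execute(
    "INSERT INTO email_quoted VALUES (?, ?, ?)",
    ("jbloggs", "jbloggs@example.org", 1),
)
row = conn.execute(
    'SELECT email FROM email_quoted WHERE "default" = 1'
).fetchone()
```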

Testing

Quite a lot of time was spent testing any code changes before they were implemented. The infrastructure unit maintain a test server for Prometheus for such purposes. Although obvious, it's worth reiterating the importance of time spent on this testing - the interface between the school database and Prometheus is where account management obtains most of its user data.

Role Management

The move to 3G brought in changes to the database-generated roles. These were described here: https://wiki.inf.ed.ac.uk/DICE/3GRolesChanges

Planning for the roles changes was quite time-consuming, as it involved many people. Initially RAT had to check the list of roles for all students on the 3G feed and compare them to their corresponding roles on the 2G feed for correctness.

Role management issues highlight the importance of (and need for): https://devproj.inf.ed.ac.uk/project/show/197

Effort

The project took 137.25 hours in total (nearly 4 weeks).

Issues

Other than the issues described above, a number of other matters concerning the relationship between the school database and account management have arisen over the course of this project (and Prometheus in general). These are outwith the scope of this project and are being addressed initially in a meeting scheduled between RAT, Infrastructure, Support and other interested parties.

-- TobyBlake - 20 Mar 2012

Topic revision: r1 - 20 Mar 2012 - 15:57:22 - TobyBlake
 