PGT

This is the interim process for handling Taught Postgraduate Post Application Communication. This is driven from applicant information held in EUCLID and some locally maintained information. The process is entirely automatic. The source is a PGT Excel report generated in BOXI from the EUCLID Universe and received locally as an attachment in an email. The report is generated and sent out every weekday morning using a recurrent BOXI schedule. The local synchronization is carried out by a .procmailrc triggered Python script for the daidb user on the database server.

The PGT structures in the local database consist of five tables.

  • pgtdata - the raw data from EUCLID (a representation of the Excel file content held in the database).
  • pgt_person, pgt_course, pgt_email - simplistic representation of the EUCLID Universe structure held in sync with the pgtdata content (and hence EUCLID upstream data), these form the read-only local cache of EUCLID PGT application data. Note that each applicant can be applying for multiple courses and may register multiple email addresses hence the split of this information into separate tables.
  • pgt_ie - locally maintained information (some by the ITO and some automatically by the mailshot process) for the PGT process complementing the EUCLID data (local records are tied to corresponding EUCLID records by the student code). Also used to track whether and when email has been sent to applicants.

See the EUCLIDiE Cluster in the normal database service data model pages for more details. The pgt_person, pgt_course, pgt_email and pgt_ie tables are all accessable and maintainable by the ITO via the PGT custom form in the normal graphical user interface.

PGT Sync Process

The synchronization process works as follows.

  • A PGT report in Excel format is sent as an attachment in email to infdb-upload@inf.ed.ac.uk (aka daidb@inf.ed.ac.uk) from BOXI every week day morning at 815AM. This process is run out of a specific user account at present.
  • The daidb users .procmailrc filter on the database server picks up this report by a subject line match and pipes it to the pgt_upload script in the conduits/etc directory.
  • The pgt_upload script:
    • saves a copy of the received email message as a file called BOX_ITO_PGT_YYYYMMDDHHMM.msg in the conduits/holding directory (where YYYYMMDDHHMM is the current day and time).
    • unpacks the attachment and saves the Excel document as a file called BOX_ITO_PGT_YYYYMMDDHHMM.xls in the conduits/holding directory.
    • converts the Excel document to an Ingres load file (ascii text percent delimited) for very fast direct table load in a file called BOX_ITO_PGT_YYYYMMDDHHMM.txt in the conduits/holding directory
    • generates an SQL file to load the pgtdata table from the content of this load file in a file called BOX_ITO_PGT_YYYYMMDDHHMM.sql.
    • runs the sql infdb command on BOX_ITO_PGT_YYYYMMDDHHMM.sql to load the pgtdata table. This completely wipes the old content of the pgtdata table and replaces it with the live current content in the Excel file.
    • runs the conduit pgt_sync which works out the differences between the new content of the pgtdata table and the old content in the pgt_person, pgt_course and pgt_email tables and produces a set of SQL statements to bring these tables into line in the file pgt_sync.sql in the conduits/outputs directory. In detail the pgt_sync conduit runs queries to compare the content of the pgtdata table and the pgt_person, pgt_course and pgt_email tables and creates some SQL to bring the latter in line with the former. It will print output saying what records (applicants, courses and email addresses) are being created, updated or deleted. Note that no records are actually deleted from the local copies - if an application once appeared in the download and now no longer does it is simply marked locally as deleted with a date. The only case where an application will disappear from the download is when the applicant cancels the application themselves, hence it should not be frequent and more than a few deletes should be cause for concern.
    • runs the sql infdb command on the pgt_sync.sql file to make the table updates, bringing the local cache copy uptodate with the upstream status. Note that it is only at this step that the actual PGT local copies of EUCLID Universe structures are affected.
    • runs the conduit pgt_sync_report which queries for all the changes made by the sync process and produces a text report in the file pgt_sync_report.txt in conduits/outputs.
    • emails the pgt_sync_report.txt file to the ITO request tracker (only if non-zero content).
    • output and error from all the above is logged in the file BOX_ITO_PGT_YYYYMMDDHHMM.log in the conduits/holding directory.

The whole process only takes a few seconds.

To run the process manually the simplest approach is to email a compatible Excel document as an attachment to infdb-upload with the correct subject line (see the .procmailrc). Alternatively if the original report was received but the process broke locally at a later point, the received email message will be held in the file called BOX_ITO_PGT_YYYYMMDDHHMM.msg in the conduits/holding directory. This can simply be piped directly into the upload script as the daidb user, for example cd ~/conduits/etc and then cat ../holding/BOX_ITO_PGT_YYYYMMDDHHMM.msg | ./pgt_upload, to allow manual testing and debugging. Note that the warnings about file size and OLE2 inconsistency seem when running manually are normal and do not break the process.

PGT Send Process

This step is carried out by the ITO via genrep. They run report C715 from the EUCLIDiE menu (0) to generate a CSV file for all the applicants from the stored data (note that each applicant may appear more than once in this file when they have more than one email address or are applying for more than one course - only applicants with at least one email address will be included however). Then they run report M715 from the same menu which processes the output from C715 and sends the email messages. This has a couple of options - a debug option means the email messages will all be sent to the user running genrep rather than the actual recipient and this can be used to check the content as generated individually per-applicant. The second option disables sending emails although everything is processed as normal and output to screen - this can be used to check who is being sent email.

A confirmation email is sent to the user running genrep with a CSV file of who the email was sent to. This email is also sent to the daidb user where it is automatically processed via a .procmailrc entry. This pipes the mail through the PGTsync script in the conduits/etc directory which processes the CSV to produce and run SQL to update the sent date for letter type 1 for each applicant in the pgt_ie table to be the current days date (on subsequent runs this ensures letter type 1 is only sent to new applicants).

-- TimColles - 13 Feb 2009

Topic revision: r4 - 30 Apr 2009 - 14:02:40 - TimColles
 
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback
This Wiki uses Cookies