PGT
This is the
interim process for handling Taught Postgraduate Post Application Communication. This is driven from applicant information held in EUCLID and some locally maintained information. The process is entirely automatic. The source is a PGT Excel report generated in BOXI from the EUCLID Universe and received locally as an attachment in an email. The report is generated and sent out every weekday morning using a recurrent BOXI schedule. The local synchronization is carried out by a
.procmailrc
triggered Python script for the
daidb
user on the database server.
The PGT structures in the local database consist of five tables.
-
pgtdata
- the raw data from EUCLID (a representation of the Excel file content held in the database).
-
pgt_person
, pgt_course
, pgt_email
- simplistic representation of the EUCLID Universe structure held in sync with the pgtdata
content (and hence EUCLID upstream data), these form the read-only local cache of EUCLID PGT application data. Note that each applicant can be applying for multiple courses and may register multiple email addresses hence the split of this information into separate tables.
-
pgt_ie
- locally maintained information (some by the ITO and some automatically by the mailshot process) for the PGT process complementing the EUCLID data (local records are tied to corresponding EUCLID records by the student code). Also used to track whether and when email has been sent to applicants.
See the
EUCLIDiE Cluster in the normal database service data model pages for more details. The
pgt_person
,
pgt_course
,
pgt_email
and
pgt_ie
tables are all accessable and maintainable by the ITO via the
PGT
custom form in the normal graphical user interface.
PGT Sync Process
The synchronization process works as follows.
- A PGT report in Excel format is sent as an attachment in email to
infdb-upload@inf.ed.ac.uk
(aka daidb@inf.ed.ac.uk
) from BOXI every week day morning at 815AM. This process is run out of a specific user account at present.
- The
daidb
users .procmailrc
filter on the database server picks up this report by a subject line match and pipes it to the pgt_upload
script in the conduits/etc
directory.
- The
pgt_upload
script:
- saves a copy of the received email message as a file called
BOX_ITO_PGT_YYYYMMDDHHMM.msg
in the conduits/holding
directory (where YYYYMMDDHHMM is the current day and time).
- unpacks the attachment and saves the Excel document as a file called
BOX_ITO_PGT_YYYYMMDDHHMM.xls
in the conduits/holding
directory.
- converts the Excel document to an Ingres load file (ascii text percent delimited) for very fast direct table load in a file called
BOX_ITO_PGT_YYYYMMDDHHMM.txt
in the conduits/holding
directory
- generates an SQL file to load the
pgtdata
table from the content of this load file in a file called BOX_ITO_PGT_YYYYMMDDHHMM.sql
.
- runs the
sql infdb
command on BOX_ITO_PGT_YYYYMMDDHHMM.sql
to load the pgtdata
table. This completely wipes the old content of the pgtdata
table and replaces it with the live current content in the Excel file.
- runs the conduit
pgt_sync
which works out the differences between the new content of the pgtdata
table and the old content in the pgt_person
, pgt_course
and pgt_email
tables and produces a set of SQL statements to bring these tables into line in the file pgt_sync.sql
in the conduits/outputs
directory. In detail the pgt_sync
conduit runs queries to compare the content of the pgtdata
table and the pgt_person
, pgt_course
and pgt_email
tables and creates some SQL to bring the latter in line with the former. It will print output saying what records (applicants, courses and email addresses) are being created, updated or deleted. Note that no records are actually deleted from the local copies - if an application once appeared in the download and now no longer does it is simply marked locally as deleted with a date. The only case where an application will disappear from the download is when the applicant cancels the application themselves, hence it should not be frequent and more than a few deletes should be cause for concern.
- runs the
sql infdb
command on the pgt_sync.sql
file to make the table updates, bringing the local cache copy uptodate with the upstream status. Note that it is only at this step that the actual PGT local copies of EUCLID Universe structures are affected.
- runs the conduit
pgt_sync_report
which queries for all the changes made by the sync process and produces a text report in the file pgt_sync_report.txt
in conduits/outputs
.
- emails the
pgt_sync_report.txt
file to the ITO request tracker (only if non-zero content).
- output and error from all the above is logged in the file
BOX_ITO_PGT_YYYYMMDDHHMM.log
in the conduits/holding
directory.
The whole process only takes a few seconds.
To run the process manually the simplest approach is to email a compatible Excel document as an attachment to
infdb-upload
with the correct subject line (see the
.procmailrc
). Alternatively if the original report was received but the process broke locally at a later point, the received email message will be held in the file called
BOX_ITO_PGT_YYYYMMDDHHMM.msg
in the
conduits/holding
directory. This can simply be piped directly into the upload script as the
daidb
user, for example
cd ~/conduits/etc
and then
cat ../holding/BOX_ITO_PGT_YYYYMMDDHHMM.msg | ./pgt_upload
, to allow manual testing and debugging. Note that the warnings about file size and OLE2 inconsistency seem when running manually are normal and do not break the process.
PGT Send Process
This step is carried out by the ITO via
genrep
. They run report
C715
from the
EUCLIDiE
menu (
0
) to generate a CSV file for all the applicants from the stored data (note that each applicant may appear more than once in this file when they have more than one email address or are applying for more than one course - only applicants with at least one email address will be included however). Then they run report
M715
from the same menu which processes the output from
C715
and sends the email messages. This has a couple of options - a debug option means the email messages will all be sent to the user running
genrep
rather than the actual recipient and this can be used to check the content as generated individually per-applicant. The second option disables sending emails although everything is processed as normal and output to screen - this can be used to check who is being sent email.
A confirmation email is sent to the user running
genrep
with a CSV file of who the email was sent to. This email is also sent to the
daidb
user where it is automatically processed via a
.procmailrc
entry. This pipes the mail through the
PGTsync
script in the
conduits/etc
directory which processes the CSV to produce and run SQL to update the sent date for letter type 1 for each applicant in the
pgt_ie
table to be the current days date (on subsequent runs this ensures letter type 1 is only sent to new applicants).
--
TimColles - 13 Feb 2009