<!-- // vim: set syntax=twiki expandtab tabstop=3 shiftwidth=3 autoindent smartindent: -->
<link rel="stylesheet" type="text/css" href="//www.dice.inf.ed.ac.uk/units/rat/twiki-article.css">
<style type="text/css"> pre, code { font-size: 140%; color: #101090; } </style>

---+ Laboratory Examination Procedure

Note: Staff involved in supporting a Laboratory Examination should also read ExamProcedures which contains summary advice for invigilators. The definitive *technical* advice is intended to remain in this document but in case of procedural conflict please check with the author(s)!

<div class="note">
*New for 2018*: The =who= output has changed again (actually it seems to have reverted to pre-SL7.2 behaviour). =examutils-2.8= has been patched to accommodate this but will require full testing.

*Bug (updated 2019)*: There's a bug - possibly resolved - which routinely results in the log host refusing some/all lab connections on lockdown. See [[LabExamsProc#Known_Issues][Known Issues / Loghost rsyslog failure]] for instructions.
</div>

%TOC%

Please make sure to do all of the preparation described below *at least one week before* the exam is scheduled to take place; this allows us to make changes within the DICE release cycle (as well as giving RAT time to fix any issues).

Before a mock exam, a real exam or a resit exam runs there are a few things to check/do.

---++ Preparing 'examsubmit' data

Students use a command called =examsubmit= to submit their work at the end of an exam. This is a wrapper for the normal =submit= command which sets the =course= and =exercise= (to use the =submit= terminology) under which the exam work will be submitted. This is however different for each course and exam type (mock, real or resit). Taking the =inf1-fp= course exams for example; while the course remains =inf1-fp= the exercise could be one of =mpe= for the mock exam, =pe= for the real exam and =rpe= for the resit exam.
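As a rough illustration of the wrapper idea (not the generated script itself), the sketch below fixes the course and exercise and passes the student's file names through. It assumes =submit= takes the course and exercise as its first two arguments; the function name, the =SUBMIT_CMD= override and the default values are hypothetical and exist only so the sketch can be tried without touching the real submit system.

```shell
#!/bin/sh
# Hypothetical sketch of an examsubmit-style wrapper.
# Assumption: submit is invoked as "submit COURSE EXERCISE FILE...".
# The real examsubmit script is generated on lockdown and may differ.

EXAM_COURSE=${EXAM_COURSE:-inf1-fp}   # illustrative default: the course
EXAM_EXERCISE=${EXAM_EXERCISE:-mpe}   # illustrative default: mock exam
SUBMIT_CMD=${SUBMIT_CMD:-submit}      # override to try the sketch safely

examsubmit_sketch() {
    # Prepend the fixed course and exercise, then pass all file names on.
    "$SUBMIT_CMD" "$EXAM_COURSE" "$EXAM_EXERCISE" "$@"
}
```

Setting =SUBMIT_CMD=echo= before calling =examsubmit_sketch exam.hs= shows the command line that would be run without submitting anything.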
The =examsubmit= command is a one-line script created by the file component from the contents of the =examlock.course= and =examlock.paper= resources, prior to the exam. This is done via the =examlockdown-lock= command (see below).

In the =examsubmit= directory, the following files are important to the submit process:
   * =exnamefile=, =modgrp=, =master= (mastered in Theon, by the ITO)
   * =accnames= (updated by teaching staff with the =set_submit_file_names= command)

See the SubmitServiceGuide for full details on updating these files, but note that the submit home directory is different: where the instructions in SubmitServiceGuide refer to the submit home directory, the examsubmit command uses the <em>examsubmit</em> group directory.

---++ Providing Exam Papers

Exam papers should be placed in the appropriate directory within the =examsubmit= area (as near to the exam time as is comfortable for added security). Papers and other pieces of reference documentation can be in any reasonable quantity or format, but must be in a =<course>/<exercise>/papers= subdirectory, and readable by submit. User-editable files must be, similarly, in a =<course>/<exercise>/templates= subdirectory, for example:

<pre>
getpapers/
├── <coursename>
│   ├── pe
│   │   ├── papers
│   │   │   ├── documentation.html
│   │   │   ├── exampaper.pdf
│   │   │   └── more.pdf
│   │   └── templates
│   │       ├── afile.txt
│   │       └── editable.txt
│   ├── <exercise>
│   │   ├── papers
│   │   │   └── ...
:   :   :
</pre>

The =getpapers= script is configured using the =examlockdown= command and will only function during lockdown. See LabExamsGetPapers for full configuration details, or below for details of the =examlockdown= command.

Remember to create a matching directory in the =exambackup/= partition, too:

<pre>
exambackup/
├── <coursename>
│   ├── mpe
│   ├── pe
:   :
</pre>

Without this, *backups will not be created*. A script to check this is pending.
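Until that check exists, something along these lines could be run by hand. This is only a sketch: =check_backup_dirs= is a hypothetical helper, both directory roots are passed as arguments because the real mount points are not assumed here, and it checks only that each =<course>/<exercise>= directory exists, not that it is writable by submit.

```shell
#!/bin/sh
# Hypothetical helper: report every <course>/<exercise> directory that
# exists under the getpapers root but has no counterpart under the
# exambackup root. Both roots are arguments; no real paths are assumed.

check_backup_dirs() {
    getpapers_root=$1
    backup_root=$2
    missing=0
    for dir in "$getpapers_root"/*/*/; do
        [ -d "$dir" ] || continue         # glob matched nothing
        rel=${dir#"$getpapers_root"/}     # strip the root prefix
        rel=${rel%/}                      # strip the trailing slash
        if [ ! -d "$backup_root/$rel" ]; then
            echo "missing backup directory: $backup_root/$rel"
            missing=1
        fi
    done
    return "$missing"                     # non-zero if anything is missing
}
```

Called as =check_backup_dirs GETPAPERS_ROOT EXAMBACKUP_ROOT=, it prints one line per missing directory and returns non-zero if any are missing, so it can be used in a pre-exam checklist.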
---++ Locking Down

Use the command =examlockdown-lock= *as root* on the machine hosting LCFG profiles (named =lcfg-master=). This takes mandatory =course= and =exercise= parameters, and then one or more additional arguments, each of which can be the name either of a host or a student lab room number. All matching hosts will be switched into lockdown mode. *Be careful*: the command performs a regular expression match on the location, so one character out of place could lock down many more machines than you expect. As of v2.1 the command matches against the whole string (e.g. 4.14 will NOT match hosts in 4.14A); you can use wildcards to broaden matches if required.

<div class="note">
If you are unfamiliar with the command it's recommended you test its effect on machine profiles using the =--test= option (running as *yourself*, not root). In this mode a snapshot copy of the profiles is placed in =/tmp/examlockdown.<user>= and modified instead of the real profiles (to clear this test data, you can use the =--clean= option).

If you run into errors the command will sometimes ask you to "Run with bash -x to see why". To do this just run =bash -x examlockdown-lock [...]= (with the same arguments afterward) and you'll hopefully see the test that failed (mismatched site or OS for example) in the final few lines of the bash output.
</div>

For example:

<blockquote><b><pre>
$ examlockdown-lock inf1-op mpe readonly 5.cl-w 5.cl-n wibble
</pre></b></blockquote>

would lock down all the machines in =5.cl-w= and =5.cl-n= (according to inventory data) as well as the machine =wibble=. They will be locked to the exam =inf1-op=, exercise =mpe= (i.e. mock exam), and the storage device policy set to "READONLY".

<blockquote><b><pre>
$ examlockdown-lock --site FH inf1-op pe log 3.*
</pre></b></blockquote>

would lock down all the machines in site =FH= in rooms starting with =3= (effectively, the whole floor, for machines whose profiles include one of the =studentlabs-fh-*= headers).
Note the =--site= flag which overrides the default of =AT=, and the use of the "log" Storage Device Policy (in effect, read-write, in this case).

The end result of running the lockdown script is to add or remove the following lines from a selection of machine profiles:

<blockquote><pre>
#define EXAM_LOCKDOWN 129.215.X.X /* IP of the profile host */
#define EXAM_COURSE crs
#define EXAM_PAPER paper
#define EXAM_MOUNT_POLICY POLICY
</pre></blockquote>

The command also performs some sanity checks (does the profile contain an appropriate OS header? Does it contain a "studentlab" header in the site specified by =--site=?). This could be done manually, but would be somewhat tedious and risky - in particular, if a machine locks down with the wrong IP address then it will be completely broken and will need to be recovered by disabling its firewall in single user mode.

After a machine receives a changed configuration (if that configuration changes either the <strong>course</strong> or the <strong>paper</strong>) it will:
   * kill all non-system processes
   * wipe user files from temporary directories
   * wipe local home directories (if they exist)
   * restart the display manager

---++++ Your Machine Might Not Be Locked Down :(

Lockdown under SL7 =systemd= is not a 100% deterministic process as it was in SL6. What's more, this landscape seems to change between (and within) minor releases. So an undefined 'settling' period seems to be necessary under some (unknown, but unlikely) circumstances.

Typically this incomplete state includes the =$HOME= not having switched from an =/afs= to a =/disk/scratch= path (though AFS is always disabled by this point; this simply manifests as an unwriteable home dir. A log search for ='getpapers.*afs'= will turn up all such failures).

In practice, doing one of the following will prevent all such failures before they impact on students:
   * reboot
   * wait (no more than 5 minutes) once the login screen changes.

That's it.
Checking the logs shortly after exam start will turn up any that have been missed.

---++++ Once locked down...

Once in lockdown mode, the login screen will be in "exam mode" and distinct from normal (a featureless background with "<code>Lab Examination PC</code>" in bold letters, and the name of the course, exercise, and Storage policy beneath). This indicates that the machine should be safe to use for the examination process. If certain errors are detected, they will be flagged at the login screen either with a distinctive black background, or as text in <span style="color: black; background-color: yellow">safety black/yellow</span>. These should be investigated (or not) as documented.

Note that the =site= and =policy= arguments are *case-insensitive* but =course= and =paper= are *case-sensitive* as they represent specific directories in group space.

---++ Switching between Multiple Exams

As described above, the =examlockdown-lock= command can be used to switch a machine between course, exam paper or policy without needing to unlock. You will see <pre>>>> LOCKED (CHANGED EXAM DETAILS)</pre>. As usual, you will need to wait for the profile to reach the machine. The examlock component "does the right thing" by clearing out home directories if it detects a change in course or exam paper. Accordingly, *be careful* using this command during exams: it could actively destroy examinees' sessions and data if course or exam are changed. It will not take forcible action on trivial configuration changes, which means it is safe to change storage policy during an exam. Nonetheless please use extreme caution during exams.

---++ Storage Device Policy

USB block device/CD/DVD mounting is now permitted in some lab exams, and the Lab Exam lockdown process now supports this fully. Storage Device Mount policy can now be specified for all exam PCs (and in theory overridden per lab/machine but this would require new macro overrides; for now it's unsupported).
The =exam-desktop.h= header includes the =exam-mount-rules.h= live header, which generates a named =udev= rules file when the machine is locked down (and deletes it again on _unlocked_ machines). The following USB device policies are available:

   $ =NOEXAM= : Default OS behaviour. Delete the udev rule.
   $ =LOG= : Log plug events explicitly (they're always logged implicitly). This allows read/write.
   $ =BLOCK= : Prevents mounting of all USB / optical devices (and logs usb-level events)
   $ =READONLY= : Allow mounts from USB / optical devices, but force the block device to be readonly (uses =blockdev= so should behave as if the device is physically =ro=).

---+++ Setting the Policy

This can be done on a per-exam (or even per-machine) basis as normal using the =examlockdown-lock= command. See above sections on lockdown for details. If none of the above policies is defined, the profile will throw a =#warning= and the login screen will display a <span style="color: black; background-color: yellow">Mount Policy Undefined</span> warning.

Under certain circumstances, you may wish to alter the policy on one machine only. This can be done using the normal lockdown tool, and will not force a wipe or re-lockdown.

---+++ Checking the Policy

It's down to the file component to apply this change to all lab machines, so this can be relied upon but it's (always!) wise to check the aggregate logs for file component failures.

If you've locked machines down or made changes to policy, you can check that the policy has been deployed by checking that the appropriate file (=10-*.rules= as specified in the exam policy header) has been created or deleted in the =/etc/udev/rules.d/= directory, on any machine. The policy is explicitly named in the variable =ENV{EXAM_POLICY}=.

---++ Unlocking

Use the command =examlockdown-unlock= in the same way as above (though without course, exercise or policy parameters) to return machines to normal; e.g.
<blockquote><b><pre>
$ examlockdown-unlock 5.cl-w 5.cl-n wibble
</pre></b></blockquote>

or an example at Forrest Hill:

<blockquote><b><pre>
$ examlockdown-unlock --site FH 3.D02 3.D01
</pre></b></blockquote>

Like its =-lock= counterpart, this command removes the =#define EXAM[...]= lines from each specified profile or group, and triggers machine reconfiguration. The machine is safe to use again when its login screen returns to the standard one. If the login screen does not change automatically, something has failed, and investigation is advised.

<p class="warning"><strong>Beware</strong>: =examlockdown-unlock= will unlock a machine from *any* configured exam and *destroy* data: so if concurrent exams are in progress please be careful to unlock only those rooms required.</p>

The =examlockdown-unlock= script will currently not remove =examrestore= files (though the next, unreleased version will do). Manual removal of =/tmp/examrestore= is advised on machines (though its contents are secure, and =/tmp= will be cleared automatically on reboot).

---++ Run-through / testing

   * Put a machine into examination lockdown mode and:
      * Check that the upcoming exam’s course and exam type are displayed on the “lockdown” login screen.
      * Check that the device mount policy is shown correctly, and that device plug/mount events are handled as you would expect.
      * Run =getpapers= to check that exam paper content placed in the locked-down home directory is as it should be. Test using a dummy file of some sort if papers are not yet available. Test that template files are user-editable.
      * Run =examsubmit FILE=, where =FILE= is the name of each acceptable file name for the upcoming exam (there does not have to be any meaningful content in =FILE=).
      * In addition to the above, check that the examination instructions are consistent with reality.
        Run each command described on the sheet and pay particular attention to documentation paths and template files, checking that they are consistent with the files delivered by =getpapers=.
   * In the =exambackup=, =examsubmit= and =examreadonly= partitions:
      * *Make sure that the backup subdirectory structure has been created appropriately.* You should see =exambackup/<course>/<exam>/= and this should be writable by 'submit'.
      * If the above directories *do* exist it is worth checking that any old backups are archived elsewhere (i.e. another directory of the backup root) to avoid confusion/conflict with backups for the upcoming exam.
      * Check that both the backup and submit filesystems have plenty of free space.
      * Remove any lingering presence files in the =/presence= subdirectory, as stale files might confuse any analysis of the upcoming exam (these should be removed automatically on unlock but it's unlikely to work 100% of the time).
   * Elsewhere on the system:
      * Make sure that the contents of the exam paper / documentation / template directories are correct. These come from the course organiser and it is worth checking everything is in the correct location, and that the exam instructions are correct (see above for run-through checks). You will probably need to liaise with the course organiser and/or remind them to do this.
      * Make an LCFG release snapshot to coincide with the exam. Machines can then be frozen to this release until after the exam; in this way, any software or configuration changes will not affect the examination hosts until the freeze is lifted. To make a separate LCFG release branch for this purpose, see [[ReleaseManagementProcedures#Branches][Release Management Procedures - Branches]]. This also covers applying the branch, but in summary:
         * add to the base =live/studentlabs.h= header the line =!profile.release mSET(NAME)=, where =NAME= is the name of the branch on which to freeze.
         * After the exam, remove the release override to unfreeze.
           *This must be undone after the exam diet to prevent clients missing further updates*. It may be useful to use the LCFG "/* REMINDER: */" comment system when changing release.
      * Between exams, firewall holes can become out-of-date as IP addresses change. Once a few machines have been locked down, check the aggregate syslog (see below for details) to determine if important servers or ports are being blocked in a way which will cause problems (for example, if LCFG or RPM servers are being blocked). Edit these IP addresses, if necessary, in the file =live/exam-ips.h=.

---++ The Log Host

Exam PCs log remotely to the Informatics log host in the usual manner at all times. However when in exam mode they also log to the special log destination on this machine: =/disk/home/LOGHOST/rsyslog/byInName/tcpPORT.log= where =LOGHOST= is the current Informatics loghost machine name and =PORT= is defined in =<dice/options/exam-desktop.h>=. Tailing this log provides an overview of all machines' state.

A good overview of the exam in progress can be had by tailing (remember to use =tail -F= ) this combined log, grepping for the names of each of the applications involved: =getpapers=, =examrestore=, =exambackup= and =submit=.

Other analyses such as device and network logging can prove useful. In general, firewall "block" logs are indicative of a service that should've been stopped but hasn't - we're attempting to eliminate these - but are worth casting an eye over.

---++ Known Issues

   * *students have US keyboard layout applied* <blockquote>This only affects the first student to log in after a locked-down machine has been rebooted; it can be trivially changed from the top of the menu. Students are informed by the information sheet provided.
MPU are investigating the problem.</blockquote>
   * *students can't access home directory via desktop icon* <blockquote>They can access it via the command line, or via the "Places" menu.</blockquote>
   * *machine freezes* <blockquote>Pressing Alt+Tab and switching between applications can sometimes bring it back to life.</blockquote>
   * *no logs appearing on the log host* (Loghost rsyslog failure): <blockquote>we've seen the loghost failing to log some or all lab exam PCs remotely. It usually doesn't affect unlocked machines, but results in a near-empty combined log file. This seems to be associated with an =rsyslog= process stuck at ~100% CPU on =loghost=, so is easy to spot.<br/> *To recover*: (please *Let Infrastructure Unit know* you are planning to do this): use =systemctl= to restart =rsyslogd= on the log host: logs should start to appear automatically but re-locking down / rebooting clients might be necessary in some cases. Inf Unit are investigating this. </blockquote>
   * *examsubmit doesn't work*: on occasion we've seen an (otherwise fully-functioning) exam machine fail to run =examsubmit= citing a missing =/home/submit/master= file. This seems to be a component race on lockdown, and can be confirmed by a failure in the LCFG file log along the lines of " =failed to create link: /disk/scratch/home/submit= ". The fix is straightforward: run =om file configure= on the affected machine, *twice* if it fails first time, and retry.
   * *student has a readonly homedir; references to /afs paths in logs*: This is another systemd race involving autofs / ldap. It's described at LabExamsProc#Your_Machine_Might_Not_Be_Locked - but as with most other problems a *reboot* is all that's required to ensure a fix.
   * The lockdown process only takes a couple of minutes from executing the lockdown command to all machines picking up the new profiles and re-configuring.
     However, the issue above necessitates extra time to allow the rebooting or swapping out of any machines which fail to lock.
   * Watch out for big profile rebuilds - what normally takes just a few minutes could be delayed by half an hour or more depending on what’s rebuilding. Locking down all AT lab machines presently takes about six minutes to complete if otherwise unhindered.
   * Some machines do not see re-configurations and therefore refuse to lock or unlock. On SL7, *don't* press =ctrl-alt-backspace= to restart the display manager: it might _appear_ to work, but will just mask the problem (likely a =systemd= service deadlock, rather than anything exam-specific). It's probably not worth trying to diagnose at first: *reboot* the machine; if it comes back locked, it will be OK.
   * The =getpapers= command is *not installed* on a machine’s first DICE install. It will also not be installed on a reboot, since updaterpms runs before LDAP and this prevents appropriate permissions being set on the package’s files. The solution is to make sure that all machines participating in an exam lockdown have either been in service for 24 hours (therefore completing an overnight updaterpms run), or have had =om updaterpms run= performed manually, prior to an exam. *TODO: use an =examlock.exec_pre= script to schedule an updaterpms run on lockdown*
   * <a name="RestrictingBackups"><b>Restricting Backups</b>:</a> <blockquote> During open-book exams, we have seen students copy the contents from their USB sticks to the local home directory. This can have serious consequences on the backup partitions! You must identify which backups are causing the problem (using =du= in the =exambackup= directory on =examback=). You can then use =less= to look at the backup and identify the likely file or directory that needs to be excluded.
You can then exclude this file from backups on an individual user's machine by editing their profile (or, in extreme situations only, in =live/exam-desktop.h=) with the resource =file.tmpl_exback_x=, for example:
<pre>
/* don't back up bigfile.dat in any directory */
!file.tmpl_exback_x mSET(bigfile.dat)

/* do not back up foo or BAR in any directory */
!file.tmpl_exback_x mSETQ('\
foo \n\
BAR \n\
')
</pre>
This resource forms the contents of an "exclude file", so the syntax for this resource is the same as for the =tar -X= argument; see =man tar= on SL7 for details. You can check that the excision has been carried out correctly by running the _exambackup_ command as root on the affected machine, comparing the output with a previous backup. </blockquote>

---++ In-Exam Recovery

Backups of local home directories on *occupied* exam machines are made every five minutes. Additionally, *every* locked-down machine reports its status via the file =exambackup/presence/<hostname>= which contains the fields =HOSTNAME:course/exam:console-user=. If a host has not refreshed its 'presence' file for more than about five minutes, it should probably be investigated.

If a computer crashes during the exam follow the technical procedure below (invigilators have complementary procedural documentation of which this is a part):

   * Choose another machine (once a machine has crashed it should not be re-used by any other student). There should always be one room set aside with spare machines, in case there are no other machines in the primary exam room(s).
   * Log onto the new machine as yourself and recover the student’s most recent backup files (into =/tmp=) by running =examrestore sMATRIC= (where =MATRIC= is the student's matriculation number). You will need the =submit= role in order to do this; most academic staff will have this role, as do some support staff and any !PhD markers.
   * Log off.
   * The student should now log onto the new machine.
   * The student should now run the command =examrestore= (with no arguments this time); this extracts the recovered backup in =/tmp= into their new local home directory. It will also restore<a href="#gr"><sup>[1]</sup></a> getpapers data on their behalf. You can monitor the activity of this command in the usual system log.
   * Wait while the student confirms their work (and access to exam papers) has been successfully recovered.

<b><a name="gr">[1]</a></b>(=getpapers --restore= restores the local exam paper cache on local disk and will fix any broken symlinks in the restored data, without touching the user's recovered home directory. Students should never need to run this manually but it is an important part of exam state restore.)

---++ When things have gone wrong

Following an exam, the normal procedure is to unlock the entire lab. Given that this erases all transient exam data from the desktop, it is important that any machines or users "in question" should have their lab PCs <strong>quarantined</strong>:

   * *do not unlock the desktop or switch exams* - this will erase local data which could be useful
   * *prevent console login* with =!auth.users mSET(@sysmans)=
   * *prevent file removal* - though it is still inadvisable to execute the unlock command, we can avoid the risk of file removal on unlock by setting =!examlock.lock mSET(hold)= (this *must* be below the =studentlabs= header).

Though it's inappropriate for long-term use, there might arise a situation in which the machine state must be preserved as-is and the above is insufficient. Rather than altering =examlock.lock= above, in extremis you should:

   * *wait* for the =auth.users= change above to take effect, and:
   * *break the profile* - This prevents unlock completely. A clear, unambiguous method is =#error "EXAM QUARANTINE"= in the profile, with an accompanying comment.

---++ At the other end...

Teaching staff may enquire as to how to collect the exam submissions.
There is a handy command to do this called =exam_process_submissions=. The full documentation of =process_submissions= (on which =exam_process_submissions= is based) can be found here: [[http://www.dice.inf.ed.ac.uk/units/rat/documentation/user/submit.html][Using the Practical Submission System]]. Basically, most staff will want to do the following things:

   * Create a number of files with names =sMATRIC= (where =MATRIC= is the student's matriculation number) in the current working directory, each containing the corresponding student's exam submission. In this example, we want the submitted files called 'exam.hs' from course inf1-fp and exam rpe (repeat practical exercise):
     <blockquote><pre>$ exam_process_submissions --comline "cat %f > %u" --file exam.hs inf1-fp rpe</pre></blockquote>
   * Send each student's submissions of 'exam.hs' from course inf1-fp and exam rpe (repeat practical exercise) to printer =if513m0= and mark the top of the page with the student's matric number and some other relevant information:
     <blockquote><pre>$ exam_process_submissions --comline "enscript --header='%u' -G -P if513m0 %f" --file exam.hs inf1-fp rpe</pre></blockquote>

Unfortunately =exam_process_submissions= can only be found at present on exam machines themselves, and it is no longer usable thanks to changes in =submit=. Both of these issues will be addressed ahead of future exams.

---++ Resilience

Rough procedures in case of infrastructure failure (as opposed to individual machine failure):

---+++ Header changes

The suggested way to make header changes during exams is to modify the labs individually.
Instances of =#define= must be made ahead of the studentlabs include, and resource changes should be made after, for example:

<pre>
#ifndef LIVE_STUDENTLABS_FH_3_D08
#define LIVE_STUDENTLABS_FH_3_D08

<b>/* move examsubmit & examreadonly to new server */</b>
<b>#define NFS1 129.215.nnn.nnn</b>

#include <dice/options/studentlabs-at.h>

<b>/* override exam mount policy for this lab */</b>
<b>!examlock.mountpol mSET(log)</b>

[...]
</pre>

---+++ submit / backup servers

In case of individual machine failure, we can re-route the mounts fairly easily.

   1. Add partition to alternative exam mount server
      * This should involve a one-line profile change: the addition of one of =exambackup=, =examreadonly= or =examsubmit= to the =DICE_OPTIONS_EXAM_NFS_EXPORTS= macro.
      * If both have failed, you should install a new machine: this should take less than 10 minutes for a fresh-install VM (assuming it's in DNS already).
         * Make sure its virtual disk is sized such that =vda4= will be sufficiently large to cope.
         * The host should include =exam-nfs-server.h=
         * It should have its EXPORTS macro set to include all three of the mounts
         * The host must also have its export added to the =partitions= rfemap, along the lines of =hostname1 host.ipv4.address /disk/data/=
   2. Alter exam NFS IP address in LCFG
      * You do not need to alter the DNS aliases since the exam machines don't use this to find the servers - and it could become confusing for other machines during switchover - but if the change is to be permanent you must remember to do this.
      * Instead just change the IPs of the live servers by predefining =NFS1= (for =examsubmit=, =examreadonly=) and/or =NFS2= (for =exambackup=) on each affected lab (see example lab header change above).
   3. This should change all required resources, including firewall and restarting =autofs= to pick up the new mounts. But, if individual machines fail to pick up the changes, you might need to forcibly unmount the broken partition and restart autofs.
      You can do this remotely through the loghost.

Following the completion of the exam, if the change is to be permanent you should change the mount definitions for all other DICE machines. This involves changing the =amdmap/group= and =partition= (SL6 hosts), =automap/group= and =automap/partition= (SL7 hosts) and =dns/inf= RFE maps, and ensuring the backup stanzas match the file contents, e.g. =MIRROR_THIS_AND_TAPE(exambackup,<%nfs.fs_exambackup%>)=.

---+++ loghost failure

   1. =#define LOGHOST <new log server>= in the appropriate lab header (see example lab header change above) and await reconfiguration. This changes the firewall and rsyslog configuration allowing the loghost to receive logs, but crucially is also required for remote access to the machines. Ideally the chosen gateway machine should be capable of handling the combined logs (but it should at least be able to reject the connection requests nicely) as we can't presently separate the two functions (we ought to fix this).

---+++ DNS failure

The system should not notice the absence of DNS, though it's possible that some applications could misbehave. No change required.

---+++ LCFG slave failure

Assuming no configuration changes are required, this won't affect a running exam. If configuration changes are required, this is a point of failure and only setting up a new slave will fix this. We can reduce likelihood of failure by adding (local) slaves specifically for exams if we believe this is likely.

---+++ Cross-site failure (LCFG slaves accessible)

   1. Set up a local host for =exambackup= & =examsubmit=.
   2. Proceed as for NFS failure, above.

---+++ Total network failure

When there's no point trying to fix things.

   1. Students should abandon their machines - they should *not* log out - having *attempted* an =examsubmit= and support will have to manually submit, or assume submissions from automated backups, when the network becomes available again.
   2.
      A procedure for determining the students' intentions to submit is required but again we can use backups, bash history and central logs to determine this.

In case of total network failure note that backups will not be taken, so it's possible that individuals' work could be lost in case of concurrent machine failure. We are considering ways of working around this for future system enhancements.

---+++ Fire Alarm / Power Failure

An interrupted exam _could_ proceed following resolution of either a power failure or fire alarm, including loss of individual machines or local data following power problems (though, obviously, unsaved work in RAM will not be saved by the backup process). Standard computing procedures documented here should be followed to power on and/or restore data as appropriate.

How (or if) the exam _should_ proceed following resolution of such events is for Student Services to determine in every case.

---+ Wish list...

It would be good if:

   * Java submissions could be run through JUnit testing as part of the submission process, rejecting those that don't pass with appropriate feedback
      * <em><strong>Response</strong>: Computing Support must not put in place any mechanism which might prevent submission, and we are not in a position to write any sort of code to assess whether submissions are valid. If exam-setters wish to provide a testing script or utility, this is fine; it can be delivered by =getpapers= and students instructed to run it on any code before submission.
        Theoretically =submit= could be modified to call a check script before submission, but the contents of that check would be exam-specific and responsibility of the setter, and students *must not* be prevented from submitting failing code should they choose to do so.</em>
   * James Cheney pointed out that the feedback email pointed him to =/home/submit/submissions/lp/default/jcheney/pe/exam.pl=, but the path should be =/group/examsubmit/submissions/lp=
      * <em><strong>Response</strong>: This is one of many problems with the feedback email, whose continued existence is in question as it can be misleading, particularly if PCs remain locked down for long periods of time. Modification requires touching the =submit= codebase (which I'm reluctant to do) but it's a valid concern.</em> <strong>Update</strong>: As of Nov 2015 feedback should normally be delivered as soon as the exam PC is unlocked, rather than many hours later as was previously the case. <strong>Update</strong>: As of Aug 2016 Director of Computing (Perdita) has requested that the confirmation emails be turned off, pending December diet, RT:78588.
   * examlockdown command should be more automated, and perform more checks (not changing hosts with differing exam data, for instance).
   * Philip Wadler suggests the message users get when they re-submit is incorrect - "The file to copy to (/home/submit/submissions/course/default/user/mpe/Question1.java) already exists. Overwrite it (y/n: n aborts)? y File successfully overwritten." This reads as though the user has overwritten the "file to copy" i.e. the file they want to submit, rather than the file already submitted.
   * implement dialog-driven examsubmit wrapper
   * ACL-based access to permit broader marker access.
   * Backup improvements, including dedicated examback partitions and active checks / notifications.
   * Test script to confirm validity of =exam-ips.h=

Planned:

   * Improved submit logging and feedback: RT:65462
   * Split =submit= and =exam-submit= UIDs so we don't need a local override: RT:75292
   * Change the examlockdown script so that you can't lock down to an exam that doesn't exist. The script could check that the path to the papers exists, e.g. for =inf1-op= / =pe1= it would check for =/disk/data/examsubmit/getpapers/inf1-op/pe1=
   * Media checks:
      * Force-dismounting, or notification, on lock / logout / unlock.
      * Better notification of forgotten devices.
   * remove email confirmation entirely: RT:78588

<br/>-- Main.TimColles - 23 Nov 2007 <br/>-- Main.GrahamDutton - 08 Dec 2015
Topic revision: r84 - 23 Jan 2019 - 13:45:01 -
GrahamDutton