This area is for users to share tips and tricks about using the local Grid Engine setup.

The official local documentation for Grid Engine is at http://www.inf.ed.ac.uk/systems/beowulf/doc/gridengine.html. For cluster-specific information, see hermes, lion, lutzow or townhill.

The basics

  • What clusters are available?
    • In preparation for the move to the Forum, the clusters have been reorganised into two groups:
      • The desktop beowulf machines (GX240/P530) with head nodes lion, lutzow and moselle
      • The rackmount beowulf machines (PE1425) with head nodes hermes, townhill and seville
    • The rackmount machines are expected to move to the Forum server room within a couple of months of the main move being finished. The older desktop machines will take longer to move.

  • Grid Engine is configured so that every core in a cluster has the same amount of memory (1 GB per core on all the clusters above). Other configurations are possible; add a request to the wish list if you want one.

You can monitor the load on these Informatics compute resources at http://bwganglia.inf.ed.ac.uk/ganglia/. From outside Edinburgh, you can still reach this page through an SSH tunnel:

  • ssh -L 8000:bwganglia.inf.ed.ac.uk:80 username@sshREMOVE_THIS.inf.ed.ac.uk
  • Now point your browser at http://localhost:8000/ganglia
  • Note: this has stopped working since the move to FC5 (support ticket 30296)

Misc Tips and tricks

  • A sample script that calls qsub is /group/project/ami3/amiasr_shef/asrcore/tools/submitjob.hermes; a less tidy one is /home/vstrom/software/cstr/scripts/multisyn_build/bin/do_alignment_parallel. Neither uses a "submit script", because changing the file names for stderr and stdout does not work that way: "qsub -o fname" works, but "qsub submit_script" with "#$ -o fname" inside submit_script does not. -- VolkerStrom - 13 Dec 2005
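
The command-line form of the tip above might look something like the following sketch. It only builds and prints a qsub command with the stdout and stderr file names passed as explicit options (-N, -o and -e are standard qsub options); it does not submit anything. The function name qsub_cmd, the job name align01, the log directory and my_job.sh are all hypothetical and are not taken from the scripts mentioned above.

```shell
#!/bin/sh
# Sketch only: qsub_cmd is a hypothetical helper, not part of Grid Engine.
# It prints a qsub command line that names the stdout/stderr files with
# explicit -o and -e options instead of "#$ -o" lines in a submit script.

# qsub_cmd JOB_NAME LOG_DIR COMMAND [ARGS...]
qsub_cmd() {
    job_name=$1
    log_dir=$2
    shift 2
    # stdout goes to LOG_DIR/JOB_NAME.out, stderr to LOG_DIR/JOB_NAME.err
    printf 'qsub -N %s -o %s/%s.out -e %s/%s.err' \
        "$job_name" "$log_dir" "$job_name" "$log_dir" "$job_name"
    # Append the job command and its arguments unchanged.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Dry run with made-up values: print the command rather than running it.
qsub_cmd align01 /tmp/logs ./my_job.sh input.txt
```

Once the printed command looks right, it can be run by hand or through eval. Arguments containing spaces would need extra quoting, which this sketch does not attempt.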

  • Some generic job submission advice
    • Sanity check your environment before you start to run your job. If the output is going to a file in ~/longrunning_jobs/results/temp, check that the directory exists and is writable before starting any computation.
    • Write intermediate and results files to /disk/scratch on the local filesystem, then copy them back to your home directory. If the copy fails, you'll still have a set of results that can be retrieved. -- IainRae - 18 Apr 2006
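
The two pieces of advice above could be combined in a wrapper along these lines. This is a minimal sketch, not a local standard: the function names check_writable_dir and run_job and their argument order are hypothetical, and it assumes the job writes all its output into its current working directory.

```shell
#!/bin/sh
# Sketch only: run_job and check_writable_dir are hypothetical helpers,
# not part of Grid Engine or the local setup.

# Succeed only if $1 is an existing, writable directory.
check_writable_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# run_job RESULTS_DIR SCRATCH_DIR COMMAND [ARGS...]
# Runs COMMAND inside SCRATCH_DIR, then copies everything found there
# into RESULTS_DIR. The scratch copy is removed only after a successful copy.
run_job() {
    results_dir=$1
    scratch_dir=$2
    shift 2

    # Fail fast, before any computation, if the destination is unusable.
    if ! check_writable_dir "$results_dir"; then
        echo "results directory missing or not writable: $results_dir" >&2
        return 1
    fi
    mkdir -p "$scratch_dir" || return 1

    # Run the job in a subshell so the caller's working directory is unchanged.
    ( cd "$scratch_dir" && "$@" ) || return 1

    if cp -R "$scratch_dir"/. "$results_dir"/; then
        rm -rf "$scratch_dir"
    else
        echo "copy failed; results are still in $scratch_dir" >&2
        return 1
    fi
}
```

A call might look like run_job "$HOME/longrunning_jobs/results/temp" "/disk/scratch/$USER/job.$$" /path/to/my_job.sh, where the per-job scratch subdirectory and the job path are made up for illustration. Because the job runs from inside the scratch directory, the command should be given as an absolute path.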
Topic revision: r12 - 08 May 2008 - 09:48:43 - IainRae
 