Distributed Computing

An Informatics Computing Innovation Meeting: BP Conference Suite, Thursday 2nd August 2007, 14:00-17:00

This is the second in an occasional series of meetings within Informatics to look at ways in which we can move forward with new developments in the Informatics Computing Service. The theme of the meeting is Distributed Computing: cluster computing, Gridengine and Condor.

Many researchers within the School already use some form of distributed computing, but there is not a lot of information about the clusters we have and how to make best use of them. There is also now a University cluster provision using Gridengine, in the form of ECDF, and any future developments in Informatics need to be viewed in that context.

The meeting will be chaired by Steve Renals. There will be a few very brief presentations, but it will be mostly question-and-answer and discussion based. Representatives from IS for the ECDF project will also attend, and there will be an opportunity for everyone to present their research requirements.

The aim of the meeting is twofold: first, to disseminate information about distributed computing facilities in the School and University; second, to obtain a good view of what the research requirements for distributed computing will be in the future and what direction we should take to meet them.

Before this meeting takes place we would like to get a rough idea of how people use the existing clusters and what their future requirements might be. This will be used to help focus discussion. To this end there is a web form with a few questions that should only take a few minutes to complete. If everyone who has an interest in distributed computing facilities could complete this form before the end of Wednesday next week, even if you are not able to attend the meeting, it would be very helpful. The URL for the web form is:

http://www.dice.inf.ed.ac.uk/units/research_and_teaching/distcomp/

If you will not be able to make the meeting but would like to have some input, and the web form is not suitable, please feel free to mail me with comments, suggestions and requirements.

Agenda

The meeting will be chaired by Steve Renals.

Brief Presentations + Q&A

  1. Overview of Current Cluster Provision - Tim
  2. ECDF Storage & Compute Services - Orlando
  3. SAN Space - Craig
  4. GPFS - Iain
  5. Gridengine Scheduling - Iain

Break & Coffee

Hot Topics Discussion

  1. Survey Response - Tim
  2. Topics Prioritization
  3. Discuss - Underlying Filesystems (GPFS, AFS, ECDF/Desktop shared space)
  4. Discuss - Scientific Linux 5 on Clusters
  5. Discuss - What do people need that ECDF does not currently provide?
  6. Discuss - Other topics as prioritized

Actions

  1. Prioritize active/pending work

Comments, Suggestions, Requirements

Some potential discussion items, in no particular priority order. It is unlikely there will be an opportunity to discuss all of these at the meeting, so we will prioritize them beforehand and at the meeting itself.

  • purpose of meeting, what we want to achieve
  • brief summary of existing cluster and Condor provision
  • stats - how much are the clusters and Condor used
  • Gridengine stats - plus the work Iain is doing on scheduling
  • underlying filesystems
  • how best to use ECDF/eddie - impact on our own clusters
  • sharing our cluster filespace with ECDF filespace
  • clusters for researching clusters (not something suitable for ECDF)
  • user requirements now and in the future
  • use of Scientific Linux on our own clusters (to match ECDF)
  • use of GPFS
  • usage visualisation tools (Ganglia, Condor View?)
  • prioritized list of things to do
  • considering "transfer queues" for shifting jobs between clusters
  • more submit nodes per cluster
  • merging clusters
  • AFS credentials - issues/solutions
  • should we continue allowing direct access (not via gridengine)
  • why have separate home directory on clusters
  • future requirements - large memory (32GB), 64bit?
  • get Ganglia working again (broke on FC5)
  • GPFS has security, OS support and license issues
  • currently no accounting - might be useful to have some
  • will ECDF meet our needs or does Informatics need to purchase a new cluster
  • what are disk space requirements, and does it need to be backed up (ECDF uses mirrored disks so costs more)
  • external service providers - reasons for not using

Presentations

Notes from the meeting

-- TimColles - 23 Jul 2007

Topic revision: r4 - 31 Jul 2007 - 15:43:05 - TimColles
 