Hadoop Cluster: Care and Feeding

If you just want to use Hadoop, see computing.help instead.
This page covers maintenance and configuration of the Hadoop EXC cluster.
Note that this page is out of date and is currently being revised. Contact cc@infREMOVE_THIS.ed.ac.uk for more information.


The machines are LCFG-maintained DICE servers running the current desktop version of DICE.

| Machine | Role | Account | Keytab | Abbreviation |
| scutter01 | The namenode (the master node for the HDFS filesystem). | hdfs | /etc/hadoop.nn.keytab | nn |
| scutter02 | The resource manager (the master node for the YARN resource allocation system). | | | |
| | The job history server. | | | |
| | A datanode (stores HDFS data). | | | |
| | A node manager (manages YARN and jobs on this node). | | | |

The nodes are in the AT server room.


The cluster uses Kerberos for authentication. Before you can do any maintenance work on the cluster, you'll need to authenticate with the appropriate credentials. To do this, you need to know which machine, account, keytab and keytab abbreviation to use; find them in the table above. Once you have them, follow these general instructions:

  • ssh machine
  • nsu account
  • newgrp hadoop
  • export KRB5CCNAME=/tmp/account.cc
  • kinit -k -t keytab abbreviation/${HOSTNAME}

For example, to get privileged access to the namenode you would do this:

  • ssh scutter01
  • nsu hdfs
  • newgrp hadoop
  • export KRB5CCNAME=/tmp/hdfs.cc
  • kinit -k -t /etc/hadoop.nn.keytab nn/${HOSTNAME}
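
The last two steps could be wrapped in a small shell helper. This is only a sketch: the function names hadoop_ccache, hadoop_principal and hadoop_kinit are hypothetical and not part of the cluster's tooling, and it assumes you have already run nsu and newgrp hadoop on the right machine.

```shell
# Hypothetical helpers, not part of the cluster tooling.

# Print the ticket cache path for an account ($1), e.g. /tmp/hdfs.cc.
hadoop_ccache() {
    printf '/tmp/%s.cc\n' "$1"
}

# Print the principal for a keytab abbreviation ($1) on a host ($2).
# The host defaults to $HOSTNAME, as in the instructions above.
hadoop_principal() {
    printf '%s/%s\n' "$1" "${2:-$HOSTNAME}"
}

# Authenticate: $1 = account, $2 = keytab, $3 = keytab abbreviation.
# Run this only after "nsu account" and "newgrp hadoop".
hadoop_kinit() {
    export KRB5CCNAME="$(hadoop_ccache "$1")"
    # klist shows the ticket that was just acquired, as a quick check.
    kinit -k -t "$2" "$(hadoop_principal "$3")" && klist
}
```

For the namenode this would be called as: hadoop_kinit hdfs /etc/hadoop.nn.keytab nn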

Running a Test Job

... to check that the cluster is working.

First, create user filespace on HDFS. You only need to do this once per user per cluster. Start by logging in to the namenode with ssh and acquiring privileged access to HDFS as per the instructions above. Then, make yourself an HDFS home directory:

 hdfs dfs -mkdir /user/${USER}
 hdfs dfs -chown ${USER} /user/${USER}
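
Since this step is only needed once per user, it may be convenient to make it safe to repeat. The sketch below checks for the directory first with hdfs dfs -test -d. The function name make_hdfs_home is hypothetical, and passing the username as an argument is a choice made for this sketch.

```shell
# Hypothetical helper: create an HDFS home directory only if it is missing.
# $1 = username. Needs privileged HDFS access, as described above.
make_hdfs_home() {
    local home="/user/$1"
    if hdfs dfs -test -d "$home"; then
        echo "$home already exists"
    else
        hdfs dfs -mkdir "$home" && hdfs dfs -chown "$1" "$home"
    fi
}
```

It would be called with the username of the person who needs the filespace, for example: make_hdfs_home someuser
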
Now ssh to the YARN master node:
 ssh scutter02

Put some files into your HDFS dir. These will act as input for the test job:

 hdfs dfs -put $HADOOP_PREFIX/etc/hadoop input
List your files to check that they got there:
 hdfs dfs -ls input
Run the next command only if you have already run the job and want to rerun it. It removes the output directory that the job creates.
 hdfs dfs -rm -r output

Now submit the test job:

 hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar grep input output 'dfs[a-z.]+'
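
The cleanup and submission steps could be combined so that the job can be rerun without thinking about the old output directory. This is a sketch only: run_test_job is a hypothetical name, and it assumes HADOOP_PREFIX is set and that the examples jar is the 2.9.2 one named above.

```shell
# Hypothetical helper: rerun-safe submission of the example grep job.
run_test_job() {
    local jar="$HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar"
    # Remove the output directory left by a previous run, if there is one.
    if hdfs dfs -test -d output; then
        hdfs dfs -rm -r output || return 1
    fi
    hadoop jar "$jar" grep input output 'dfs[a-z.]+'
}
```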

Once it's finished, transfer the job's output from HDFS:

 hdfs dfs -get output
... and take a look at what the job did:
 cd output
You should see two files: an empty file called _SUCCESS and a file called part-r-00000 with a few word counts in it. If you don't see _SUCCESS then the job didn't work.
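
The same check can be scripted against the local copy of the output. This is a minimal sketch, with check_job_output as a hypothetical name. It looks only for the _SUCCESS marker and then prints part-r-00000.

```shell
# Hypothetical helper: check a local copy of the job output.
# $1 = output directory (defaults to ./output).
check_job_output() {
    local dir="${1:-output}"
    if [ -f "$dir/_SUCCESS" ]; then
        # The job completed, so show the counts it produced.
        cat "$dir/part-r-00000"
    else
        echo "no _SUCCESS marker in $dir: the job did not complete" >&2
        return 1
    fi
}
```

If you have already changed into the output directory, call it as: check_job_output .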
Topic revision: r37 - 20 Sep 2019 - 15:01:36 - ChrisCooke