TWiki > DICE Web > PandemicPlanning > HadoopClusters (revision 17)

Hey! This is TOTALLY NOT FINISHED. Comments welcome, to Chris please.

The DICE Hadoop clusters

  • We have Hadoop clusters.
  • They run on DICE.
  • Their configuration is done almost entirely with LCFG.
  • Hadoop is needed by the Extreme Computing (EXC) module.
  • Hadoop provides facilities for processing massive amounts of data in parallel. One of the popular ways is MapReduce, which is when a dataset is split up programmatically into many parts, each of which is then processed in some way, with the results then all being collected back together. (Read more at wikipedia.)
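The MapReduce pattern can be sketched with ordinary shell pipes (a toy illustration only, not how Hadoop actually runs jobs): the "map" stage emits one key per line, the sort acts as the shuffle, and the "reduce" stage aggregates each group of identical keys.

```shell
# Toy word count in the MapReduce style.
# map:     emit one word per line
# shuffle: sort brings identical keys together
# reduce:  count each group
printf 'apple banana apple\nbanana apple\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c
```

Hadoop does the same thing, but with the map and reduce stages running in parallel across many machines and the data living in HDFS rather than a pipe.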

This page

This page sets out a basic picture of what Hadoop is and how to make our Hadoop clusters do their stuff. This necessarily makes it pretty long; sorry.

The LCFG hadoop headers.

Each node in a Hadoop cluster includes two special LCFG headers:
  1. A header to say which cluster this node is in:
    • live/hadoop-exc-cluster.h
    • live/hadoop-exctest-cluster.h
    • live/hadoop-devel-cluster.h
  2. A header to say which role this node plays in the cluster:
    • dice/options/hadoop-cluster-slave-node.h
    • dice/options/hadoop-cluster-master-hdfs-node.h
    • dice/options/hadoop-cluster-master-yarn-node.h
Each node does just one of these three jobs.

There are quite a few more LCFG Hadoop headers. Some are included by the above headers. Others are left over from previous configurations and are now out of use.
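Putting the two together, a slave node in the development cluster would include something like this in its LCFG profile (a hypothetical fragment for illustration; a real profile contains much more than this):

```c
/* Hypothetical LCFG profile fragment for a devel-cluster slave node:
   one cluster-membership header plus one role header. */
#include <live/hadoop-devel-cluster.h>
#include <dice/options/hadoop-cluster-slave-node.h>
```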

Find out which computers are in a cluster.

Just search the LCFG profiles to find out which machines are using the cluster's LCFG header:
$ profmatch hadoop-devel-cluster
(profmatch is in /afs/

Find out which computer does which job.

Again, search the LCFG profiles for the names of hadoop headers:
$ profmatch devel-cluster slave-node
$ profmatch exctest-cluster master-yarn-node

Hadoop's constituent parts.


HDFS

HDFS is the Hadoop Distributed File System. Data files are stored there, distributed across the cluster's nodes. HDFS keeps several copies of each file, and tries to ensure that the copies are spread over multiple nodes and multiple racks.

In our simple cluster configuration, one machine in a Hadoop cluster is an HDFS namenode and most of the others are datanodes. (More complex clusters than ours will have secondary or backup namenodes too, for extra reliability.)

If you want to know more, start with HDFS Architecture and the HDFS User Guide.

HDFS namenode
The namenode stores each file's metadata (its name, its owner, its position in the HDFS directory hierarchy, etc.). It also keeps track of where all the data is stored. Each of our clusters has one namenode.
The header for a namenode is dice/options/hadoop-cluster-master-hdfs-node.h.
To see the namenode daemon's log:
  • Identify the hostname of the namenode for this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/hdfsdata/hadoop/logs/
The namenode process should be owned by hdfs and should look something like this (output from ps axuww):
hdfs      6578  0.9  5.4 3922780 221608 ?      SNl  Feb21 469:17 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_namenode -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Xmx2000m -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode
The namenode's systemd service is called hadoop-namenode.
HDFS datanode
The datanodes store data files. They co-operate to arrange for there to be multiple copies of each file. The cluster's "slave" nodes are all datanodes. (The idea is that the data is, as much as possible, on the local disk of the machine that's analysing that data.)
The header for a datanode is dice/options/hadoop-cluster-slave-node.h.
To see a datanode daemon's log:
  • Identify the hostname of a slave node in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/hdfsdata/hadoop/logs/
The datanode process should be owned by hdfs and should look something like this (output from ps axuww):
hdfs      6616  0.9  6.3 3361284 258472 ?      SNl  Feb21 461:02 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_datanode -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -server -Dhadoop.security.logger=ERROR,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
The datanode's systemd service is called hadoop-datanode.
Each datanode presents information on the web at
Note that the web interfaces on the EXC cluster are not accessible from non-cluster machines because the cluster machines are on a non-routed wire, as per school policy.


YARN

YARN (Yet Another Resource Negotiator) manages the cluster's resources. It decides which jobs will run where, and provides the resources to enable those jobs to run.
YARN resource manager.
Each of our clusters has one of these. It's the ultimate arbiter for all the cluster's resources, deciding what job will run where and monitoring each job's progress.
The resource manager header is dice/options/hadoop-cluster-master-yarn-node.h.
To see the resource manager daemon's log:
  • Identify the hostname of the resource manager in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/yarn/logs/
The resource manager process should be owned by yarn and should look something like this (output from ps axuww):
yarn      6589  4.0  7.2 6203536 294048 ?      Sl   Feb21 1923:10 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_resourcemanager -Xmx4000m -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dyarn.home.dir= -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dyarn.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -classpath /opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/conf/rm-config/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/lib/* org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
The resource manager's systemd service is called hadoop-resourcemanager.
YARN node manager.
Most of each cluster's nodes - what we've named the "slave" nodes - will run a YARN node manager. This monitors the node's resources for the resource manager.
The node manager header is dice/options/hadoop-cluster-slave-node.h.
To see a nodemanager daemon's log:
  • Identify the hostname of a slave node in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/yarn/logs/
The node manager process should be owned by yarn and should look something like this (output from ps axuww):
yarn      6607  1.7  9.2 6092312 374284 ?      Sl   Feb21 816:21 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_nodemanager -Xmx4000m -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dyarn.home.dir= -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dyarn.policy.file=hadoop-policy.xml -server -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dyarn.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -classpath /opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/conf/nm-config/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/lib/* org.apache.hadoop.yarn.server.nodemanager.NodeManager
The node manager's systemd service is called hadoop-nodemanager.

MapReduce

The MapReduce framework is the classic, but not the only, model for distributed processing of big data on Hadoop. It has one permanent daemon, the job history daemon; otherwise its processes are just fired up as needed by jobs.
The MapReduce job history daemon
This runs on the YARN master server alongside the YARN resource manager.
The job history daemon is added in the resource manager header, which is dice/options/hadoop-cluster-master-yarn-node.h.
To see the job history daemon's log:
  • Identify the hostname of the resource manager in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/mapred/logs/
The job history daemon's process should be owned by mapred and should look something like this (output from ps axuww):
mapred    57347  0.2  0.5 3522708 562280 ?      Sl   May07   9:23 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_historyserver -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/disk/scratch/mapred/logs -Dhadoop.root.logger=INFO,console -Dmapred.jobsummary.logger=INFO,JSA,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
The job history daemon's systemd service is called hadoop-mapred.

The configuration files

To see the Hadoop configuration files once they've been made by LCFG, log in to a cluster node then cd /opt/hadoop/conf. Bear in mind that the content of these files depends on the machine's role in the cluster - the YARN master's configuration will differ from the HDFS master's, and a slave's will be different again.

These files are made by the hadoop and file components.

Hadoop daemons are started and stopped by systemd. The systemd service files are created by the LCFG systemd component using cpp macros which are defined in dice/options/hadoop-cluster-node.h and used in live/hadoop-cluster-whatever-node.h headers.
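For orientation, a generated unit file will look roughly like this (a hypothetical sketch only - the real files come from the LCFG systemd component and the cpp macros, so check /etc/systemd/system on a cluster node for the actual content):

```ini
# Hypothetical sketch - the real unit is generated by LCFG, not hand-written.
[Unit]
Description=Hadoop HDFS datanode

[Service]
User=hdfs
ExecStart=...   # the generated start command for the datanode daemon

[Install]
WantedBy=multi-user.target
```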

Privileged or "root" access

To do basic things to a Hadoop daemon, like start, stop or restart it, just use systemctl with the daemon's systemd service name (given above). For example,
systemctl status hadoop-mapred
systemctl restart hadoop-datanode
For anything more involved, you'll need privileged access. You get this simply by using a shell which has the same account, group, and Kerberos context that the daemon uses. For example to have superuser privilege over a datanode daemon, do this on a Hadoop slave node:
export KRB5CCNAME=/tmp/hdfs.${LOGNAME}
nsu hdfs
newgrp hadoop
kinit -k -t /etc/hadoop.dn.keytab dn/${HOSTNAME}
That's for privilege over the datanode daemon. For other daemons, refer to the table below to find the correct account and abbreviation to use.

daemon               account   abbreviation   host
namenode             hdfs      nn             the HDFS master
datanode             hdfs      dn             slaves
resource manager     yarn      rm             the YARN master
node manager         yarn      nm             slaves
job history service  mapred    jhs            the YARN master
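The rows of that table can be collapsed into a small dry-run helper which just prints the commands to run for a given daemon. This is a sketch: hadoop_priv_cmds is a hypothetical name, the keytab pattern /etc/hadoop.<abbreviation>.keytab follows the nn/dn examples on this page, and the credential-cache name /tmp/<account>.${LOGNAME} is assumed to generalise from the hdfs example above.

```shell
# Hypothetical helper: print (not run) the commands needed to gain
# privilege over a Hadoop daemon, given its account and abbreviation
# from the table above.
hadoop_priv_cmds() {
  local account=$1 abbrev=$2    # e.g. "hdfs nn", "yarn rm", "mapred jhs"
  cat <<EOF
export KRB5CCNAME=/tmp/${account}.\${LOGNAME}
nsu ${account}
newgrp hadoop
kinit -k -t /etc/hadoop.${abbrev}.keytab ${abbrev}/\${HOSTNAME}
EOF
}

hadoop_priv_cmds yarn rm    # print the commands for the resource manager
```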

How to run a test job

First create yourself some user filespace on HDFS:

  • ssh onto the cluster's HDFS master node.
  • Get superuser privilege for the HDFS namenode:
      nsu hdfs
      newgrp hadoop
      export KRB5CCNAME=/tmp/hdfs.${LOGNAME}
      kinit -k -t /etc/hadoop.nn.keytab nn/${HOSTNAME}
  • Make your HDFS homedir:
      hdfs dfs -mkdir /user/${USER}
      hdfs dfs -chown ${USER} /user/${USER}
Now run the test job.

  • ssh onto the cluster's YARN master node.
  • Copy a bunch of Hadoop source files to your HDFS dir:
      hdfs dfs -put $HADOOP_PREFIX/etc/hadoop input
      hdfs dfs -ls input
  • Only do this next step if you've already run the job and want to rerun it - it removes the output dir.
      hdfs dfs -rm -r output
  • Fire off your Hadoop job:
      hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar grep input output 'dfs[a-z.]+'
  • Copy the output from HDFS and examine it - you should see counts of the strings which matched the regular expression.
      hdfs dfs -get output
      cd output

Useful commands

  • For commands which are useful throughout Hadoop:
  • For commands specific to HDFS:
  • For YARN commands:

Decommissioning a node

If you need to take a machine down for a period, perhaps because its hardware is faulty, you can remove it from Hadoop by decommissioning it. This is handled separately in HDFS and YARN.


HDFS

First mark your node as excluded by adding this to its LCFG file:
      !hadoop.excluded   mSET(true)
Then tell HDFS to re-read the list of excluded nodes.
      ssh <HDFS master>
      hdfs dfsadmin -refreshNodes
The namenode log should announce "Starting decommission of ...". It then ensures that all data on the node is adequately replicated on the remaining cluster. If the node has a lot of data stored on it this may take a while. When it's complete, the namenode log will announce "Decommissioning complete for node ...".


Still working on this.
Topic revision: r17 - 16 Dec 2019 - 12:04:37 - ChrisCooke
DICE.HadoopClusters moved from DICE.HadoopPandemic on 25 Mar 2019 - 14:33 by ChrisCooke - put it back