Hey! This is TOTALLY NOT FINISHED. Comments welcome, to Chris please.

The DICE Hadoop clusters

  • We have Hadoop clusters.
  • They run on DICE.
  • Their configuration is done almost entirely with LCFG.
  • Hadoop is needed by the Extreme Computing (EXC) module.
  • Hadoop provides facilities for processing massive amounts of data in parallel. One of the most popular approaches is MapReduce, in which a dataset is split programmatically into many parts, each part is processed in some way, and the results are then collected back together. (Read more at wikipedia.)
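
As a very rough illustration of that split-process-collect pattern, here is a word count written as a pipeline of ordinary Unix tools (this is just an analogy, not a Hadoop command, and some-text-file stands for any text file). Hadoop does the same kind of thing, but with the "map" and "reduce" stages spread across many machines and much bigger data:
      # "map": split the text into words, one per line;
      # "shuffle": sort so identical words end up next to each other;
      # "reduce": count each distinct word.
      tr -s '[:space:]' '\n' < some-text-file | sort | uniq -c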

This page

This page sets out a basic picture of what Hadoop is and how to make our Hadoop clusters do their stuff. This necessarily makes it pretty long; sorry.

Software

We use the latest stable version of Hadoop, 2.9.2 at the time of writing. Downloads and project info can be found at https://hadoop.apache.org/. We install it using a tarball of the binary distribution from that site, by means of a home-rolled hadoop RPM.

Documentation

  • The definitive Apache Hadoop documentation is at https://hadoop.apache.org/docs/stable/. Pay attention to the left bar - each link is a manual on a different topic.
  • The University Library has several Hadoop books. Try for instance the most recent edition of the O'Reilly book Hadoop: The Definitive Guide. You can read it online or there's a paper copy.

Our LCFG-managed Hadoop clusters.

There are three of them and they share most of their configuration.
  1. exc is the main Hadoop cluster, the one the students use. It's important to keep this running healthily, at least while it's needed by the students and staff of the Extreme Computing module. It's on physical servers in the AT server room - the scutter machines from the teaching cluster.
  2. exctest is used for testing things before deploying them on the exc cluster. It's used from time to time both by computing staff to test new configurations before putting them on the exc cluster, and by teaching staff to check cluster capabilities or to run through coursework before handing it out to students. Its nodes are virtual machines.
  3. devel is used when developing new configurations. It is only ever used by computing staff and it can be trashed with impunity. Its nodes are virtual machines.

The LCFG hadoop headers.

Each node in a Hadoop cluster includes two special LCFG headers:
  1. A header to say which cluster this node is in:
    • live/hadoop-exc-cluster.h
    • live/hadoop-exctest-cluster.h
    • live/hadoop-devel-cluster.h
  2. A header to say what role this node plays in the cluster:
    • dice/options/hadoop-cluster-slave-node.h
    • dice/options/hadoop-cluster-master-hdfs-node.h
    • dice/options/hadoop-cluster-master-yarn-node.h
Each node does just one of these three jobs.
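
For example, the LCFG source profile of a slave node in the devel cluster would be expected to pull in something like the following (a sketch only - check an existing node's profile for the exact form and ordering used):
      #include <live/hadoop-devel-cluster.h>
      #include <dice/options/hadoop-cluster-slave-node.h>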

There are quite a few more LCFG Hadoop headers. Some are included by the above headers. Others are left over from previous configurations and are now out of use.

Find out which computers are in a cluster.

Just search the LCFG profiles to find out which machines are using the cluster's LCFG header:
$ profmatch hadoop-devel-cluster
cannoli
clafoutis
strudel
teurgoule
tiramisu
trifle
(profmatch is in /afs/inf.ed.ac.uk/group/cos/utils.)

Find out which computer does which job.

Again, search the LCFG profiles for the names of hadoop headers:
$ profmatch devel-cluster slave-node
cannoli
strudel
tiramisu
trifle
$ profmatch exctest-cluster master-yarn-node
rat2

Hadoop's constituent parts.

HDFS.

HDFS is the Hadoop Distributed File System. Data files are stored there, distributed across the cluster's nodes. HDFS keeps several copies of each file, and tries to ensure that the copies are spread over multiple nodes and multiple racks.

In our simple cluster configuration, one machine in a Hadoop cluster is an HDFS namenode and most of the others are datanodes. (More complex clusters than ours will have secondary or backup namenodes too, for extra reliability.)

If you want to know more, start with HDFS Architecture and the HDFS User Guide.
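
For a quick overview of the state of HDFS on a cluster - total capacity, usage, and the list of live datanodes - you can ask the namenode for a report. This needs HDFS superuser privilege, which is explained in the "Privileged or root access" section below:
      hdfs dfsadmin -report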

HDFS namenode
The namenode stores each file's metadata (its name, its owner, its position in the HDFS directory hierarchy, etc.). It also keeps track of where all the data is stored. Each of our clusters has one namenode.
Header
The header for a namenode is dice/options/hadoop-cluster-master-hdfs-node.h.
Log
To see the namenode daemon's log:
  • Identify the hostname of the namenode for this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/hdfsdata/hadoop/logs/hadoop-hdfs-namenode-HOSTNAME.inf.ed.ac.uk.out
Process
The namenode process should be owned by hdfs and should look something like this (output from ps axuww):
hdfs      6578  0.9  5.4 3922780 221608 ?      SNl  Feb21 469:17 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_namenode -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.log.file=hadoop-hdfs-namenode-rat1.inf.ed.ac.uk.log -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx2000m -Dcom.sun.management.jmxremote -Xmx2000m -Xmx2000m -Dcom.sun.management.jmxremote -Xmx2000m -Xmx2000m -Dcom.sun.management.jmxremote -Xmx2000m -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode
systemd
The namenode's systemd service is called hadoop-namenode.
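Quick check
A quick health check is to ask the namenode whether it is in safe mode (this needs HDFS superuser privilege - see the privileged access section below); during normal operation it should report that safe mode is off:
      hdfs dfsadmin -safemode get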
HDFS datanode
The datanodes store data files. They co-operate to arrange for there to be multiple copies of each file. The cluster's "slave" nodes are all datanodes. (The idea is that the data is, as much as possible, on the local disk of the machine that's analysing that data.)
Header
The header for a datanode is dice/options/hadoop-cluster-slave-node.h.
Log
To see a datanode daemon's log:
  • Identify the hostname of a slave node in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/hdfsdata/hadoop/logs/hadoop-hdfs-datanode-HOSTNAME.inf.ed.ac.uk.out
Process
The datanode process should be owned by hdfs and should look something like this (output from ps axuww):
hdfs      6616  0.9  6.3 3361284 258472 ?      SNl  Feb21 461:02 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_datanode -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.log.file=hadoop-hdfs-datanode-rat5.inf.ed.ac.uk.log -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
systemd
The datanode's systemd service is called hadoop-datanode.
Web
Each datanode presents information on the web at
https://HOSTNAME.inf.ed.ac.uk:50475/datanode.html
Note that the web interfaces on the EXC cluster are not accessible from non-cluster machines because the cluster machines are on a non-routed wire, as per school policy.

YARN.

YARN manages the cluster's resources. It decides which jobs will run where, and provides the resources to enable those jobs to run.
YARN resource manager.
Each of our clusters has one of these. It's the ultimate arbiter for all the cluster's resources, deciding what job will run where and monitoring each job's progress.
Header
The resource manager header is dice/options/hadoop-cluster-master-yarn-node.h.
Log
To see the resource manager daemon's log:
  • Identify the hostname of the resource manager in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/yarn/logs/yarn-yarn-resourcemanager-HOSTNAME.inf.ed.ac.uk.out
Process
The resource manager process should be owned by yarn and should look something like this (output from ps axuww):
yarn      6589  4.0  7.2 6203536 294048 ?      Sl   Feb21 1923:10 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_resourcemanager -Xmx4000m -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dhadoop.log.file=yarn-yarn-resourcemanager-rat2.inf.ed.ac.uk.log -Dyarn.log.file=yarn-yarn-resourcemanager-rat2.inf.ed.ac.uk.log -Dyarn.home.dir= -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dhadoop.log.file=yarn-yarn-resourcemanager-rat2.inf.ed.ac.uk.log -Dyarn.log.file=yarn-yarn-resourcemanager-rat2.inf.ed.ac.uk.log -Dyarn.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -classpath /opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/conf/rm-config/log4j.properties:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/lib/* org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
systemd
The resource manager's systemd service is called hadoop-resourcemanager.
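Quick check
A simple check that the resource manager is alive and that the slave nodes have registered with it is to list the cluster's node managers (run on a cluster node with suitable Kerberos credentials - see the privileged access section below):
      yarn node -list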
YARN node manager.
Most of each cluster's nodes - what we've named the "slave" nodes - will run a YARN node manager. This monitors the node's resources for the resource manager.
Header
The node manager header is dice/options/hadoop-cluster-slave-node.h.
Log
To see a nodemanager daemon's log:
  • Identify the hostname of a slave node in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/yarn/logs/yarn-yarn-nodemanager-HOSTNAME.inf.ed.ac.uk.out
Process
The node manager process should be owned by yarn and should look something like this (output from ps axuww):
yarn      6607  1.7  9.2 6092312 374284 ?      Sl   Feb21 816:21 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_nodemanager -Xmx4000m -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dhadoop.log.file=yarn-yarn-nodemanager-rat5.inf.ed.ac.uk.log -Dyarn.log.file=yarn-yarn-nodemanager-rat5.inf.ed.ac.uk.log -Dyarn.home.dir= -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dyarn.policy.file=hadoop-policy.xml -server -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dyarn.log.dir=/disk/scratch/yarn/logs -Dhadoop.log.file=yarn-yarn-nodemanager-rat5.inf.ed.ac.uk.log -Dyarn.log.file=yarn-yarn-nodemanager-rat5.inf.ed.ac.uk.log -Dyarn.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -classpath /opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/conf:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/common/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/hdfs/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/mapreduce/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/lib/*:/opt/hadoop/conf/nm-config/log4j.properties:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/*:/opt/hadoop/hadoop-2.9.2/share/hadoop/yarn/timelineservice/lib/* org.apache.hadoop.yarn.server.nodemanager.NodeManager
systemd
The node manager's systemd service is called hadoop-nodemanager.

Map Reduce

The Map Reduce framework is the classic way, though not the only way, of doing distributed processing of big data on Hadoop. It has one permanent daemon, the job history daemon; otherwise it's just fired up as needed by jobs.
The Map Reduce Job History daemon
This runs on the YARN master server alongside the YARN resource manager.
Header
The job history daemon is added in the resource manager header, which is dice/options/hadoop-cluster-master-yarn-node.h.
Log
To see the job history daemon's log:
  • Identify the hostname of the resource manager in this cluster (see above).
  • ssh HOSTNAME
  • less /disk/scratch/mapred/logs/mapred-mapred-historyserver-HOSTNAME.inf.ed.ac.uk.out
Process
The job history daemon's process should be owned by mapred and should look something like this (output from ps axuww):
mapred    57347  0.2  0.5 3522708 562280 ?      Sl   May07   9:23 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_historyserver -Xmx1500m -server -Dhadoop.log.dir=/disk/scratch/hdfsdata/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/hadoop-2.9.2 -Dhadoop.id.str=mapred -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/hadoop/hadoop-2.9.2/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/disk/scratch/mapred/logs -Dhadoop.log.file=mapred-mapred-historyserver-scutter02.inf.ed.ac.uk.log -Dhadoop.root.logger=INFO,console -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
systemd
The job history daemon's systemd service is called hadoop-mapred.
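Quick check
One rough check that the job history machinery is working is to list jobs, including completed ones (again from a cluster node with suitable Kerberos credentials):
      mapred job -list all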

The configuration files

To see the Hadoop configuration files once they've been made by LCFG, log in to a cluster node then cd /opt/hadoop/conf. Bear in mind that the content of these files depends on the machine's role in the cluster - the YARN master's configuration will be different to the HDFS master's, and a slave's will be different again.
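
For instance, on any cluster node you might do something like this; the usual Hadoop files such as core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml should be among them, though the exact set varies with the node's role:
      ssh HOSTNAME
      cd /opt/hadoop/conf
      ls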

These files are made by the hadoop and file components.

Hadoop daemons are started and stopped by systemd. The systemd service files are created by the LCFG systemd component using cpp macros which are defined in dice/options/hadoop-cluster-node.h and used in live/hadoop-cluster-whatever-node.h headers.

Privileged or "root" access

To do basic things to a Hadoop daemon, like start, stop or restart it, just use systemctl with the daemon's systemd service name (given above). For example,
systemctl status hadoop-mapred
systemctl restart hadoop-datanode
For anything more involved, you'll need privileged access. You get this simply by using a shell which has the same account, group, and Kerberos context that the daemon uses. For example, to have superuser privilege over a datanode daemon, do this on a Hadoop slave node:
export KRB5CCNAME=/tmp/hdfs.${LOGNAME}              # use a private Kerberos credentials cache
nsu hdfs                                            # become the hdfs user
newgrp hadoop                                       # switch primary group to hadoop
kinit -k -t /etc/hadoop.dn.keytab dn/${HOSTNAME}    # get the datanode's credentials from its keytab
That's for privilege over the datanode daemon. For other daemons, refer to the table below to find the correct account and abbreviation to use; a worked example follows the table.

daemon               account  abbreviation  host
datanode             hdfs     dn            slaves
node manager         yarn     nm            slaves
namenode             hdfs     nn            the HDFS master
resource manager     yarn     rm            the YARN master
job history service  mapred   jhs           the YARN master
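
For example, to get privilege over the resource manager daemon, the equivalent on the YARN master would be expected to look like this, substituting the account (yarn) and abbreviation (rm) from the table. This assumes the keytab files follow the same /etc/hadoop.ABBREVIATION.keytab naming pattern as the dn and nn keytabs shown elsewhere on this page - check /etc on the node to confirm:
      export KRB5CCNAME=/tmp/yarn.${LOGNAME}
      nsu yarn
      newgrp hadoop
      kinit -k -t /etc/hadoop.rm.keytab rm/${HOSTNAME}   # assumed keytab name - verify before use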

How to run a test job

First create yourself some user filespace on HDFS:

  • ssh onto the cluster's HDFS master node.
  • Get superuser privilege for the HDFS namenode:
      nsu hdfs
      newgrp hadoop
      export KRB5CCNAME=/tmp/hdfs.${LOGNAME}
      kinit -k -t /etc/hadoop.nn.keytab nn/${HOSTNAME}
  • Make your HDFS homedir:
      hdfs dfs -mkdir /user/${USER}
      hdfs dfs -chown ${USER} /user/${USER}
      exit     # leave the newgrp shell
      exit     # leave the nsu shell
      logout   # log out of the HDFS master node
Now run the test job.

  • ssh onto the cluster's YARN master node.
  • Copy a bunch of Hadoop source files to your HDFS dir:
      hdfs dfs -put $HADOOP_PREFIX/etc/hadoop input
      hdfs dfs -ls input
  • Only do this next step if you've already run the job and want to rerun it - it removes the output dir.
      hdfs dfs -rm -r output
  • Fire off your Hadoop job:
      hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar grep input output 'dfs[a-z.]+'
  • Copy the output from HDFS and examine it - you should see counts of the strings matched by the regular expression (more on the output files below).
      hdfs dfs -get output
      cd output
      ls
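
The output directory should contain a _SUCCESS marker file plus one or more part-r-* files holding the results, so:
      cat part-r-*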

Useful commands

For commands which are useful throughout Hadoop, see the Hadoop Commands Guide in the Apache documentation. For commands specific to HDFS, see the HDFS Commands Guide. For YARN commands, see the YARN Commands guide. (All three are linked from the left bar of the documentation site mentioned above.)

Decommissioning a node

If you need to take a machine down for a period, perhaps because its hardware is faulty, you can remove it from Hadoop by decommissioning it. This is handled separately in HDFS and YARN.

in HDFS

First mark your node as excluded by adding this to its LCFG file:
      !hadoop.excluded   mSET(true)
Then tell HDFS to re-read the list of excluded nodes.
      ssh <HDFS master>
      hdfs dfsadmin -refreshNodes
The namenode log should announce "Starting decommission of ...". It then ensures that all data on the node is adequately replicated on the remaining cluster. If the node has a lot of data stored on it this may take a while. When it's complete, the namenode log will announce "Decommissioning complete for node ...".
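
You can also follow the progress of decommissioning from the dfsadmin report (run with HDFS superuser privilege, as described above); it shows a Decommission Status line for each datanode:
      hdfs dfsadmin -report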

in YARN

Still working on this.