Hadoop Cluster: Care and Feeding

Hadoop users! To learn about Hadoop at Informatics, go to computing.help.inf.ed.ac.uk/hadoop-cluster.

If you're not computing staff, you've reached the wrong page. This page is about how to maintain and configure the Hadoop EXC cluster.

Nodes

The machines are LCFG-maintained DICE servers running the current desktop version of DICE.

Machine Role Account Abbreviation
scutter01 The namenode (the master node for the HDFS filesystem). hdfs nn
scutter02 The resource manager (the master node for the YARN resource allocation system). yarn rm
scutter02 The job history server. mapred jhs
scutter03 to scutter12 The compute nodes: each runs a datanode (stores HDFS data). hdfs dn
scutter03 to scutter12 The compute nodes: each runs a node manager (manages YARN and jobs on this node). yarn nm
The nodes are in the AT server room.

Kerberos and privilege

The cluster uses Kerberos for authentication. To get privileged access to the cluster, you'll need to authenticate. For this you'll need to know the right machine, account and abbreviation to use. Find them in the table above, then do this:

  • ssh machine
  • nsu account
  • newgrp hadoop
  • export KRB5CCNAME=/tmp/account.cc
  • kinit -k -t /etc/hadoop.abbreviation.keytab abbreviation/${HOSTNAME}

So here's how to get privileged access to the HDFS filesystem:

  • ssh scutter01
  • nsu hdfs
  • newgrp hadoop
  • export KRB5CCNAME=/tmp/hdfs.cc
  • kinit -k -t /etc/hadoop.nn.keytab nn/${HOSTNAME}
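
A quick way to confirm that this worked is to list the new credential cache - it should show a default principal of nn/<this host>@INF.ED.AC.UK in the cache named by KRB5CCNAME:

klist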

There's a handy way to check that the mapping between the service abbreviation (e.g. rm) and its account (e.g. yarn) has been configured correctly:

  • ssh to any Hadoop node
  • hadoop org.apache.hadoop.security.HadoopKerberosName abbreviation/${HOSTNAME}@INF.ED.AC.UK
For example,
[scutter04]: hadoop org.apache.hadoop.security.HadoopKerberosName rm/${HOSTNAME}@INF.ED.AC.UK
Name: rm/scutter04.inf.ed.ac.uk@INF.ED.AC.UK to yarn
[scutter04]: 
So rm maps to the yarn account.

Users

On the exc cluster, user access is driven by roles and capabilities, so it's automated. A prospective user of the exc cluster needs to gain the hadoop/exc/user capability. Several roles grant that capability; you can discover them with e.g.
rfe -xf roles/hadoop
Most student users of the cluster will probably gain a suitable role automatically thanks to the Informatics database and Prometheus.

Making HDFS directories

On the exc cluster this is done by a script called mkhdfs which runs nightly. It ensures that each user with hadoop/exc/user has an HDFS directory. It runs on the namenode of the cluster, and it's installed by the hadoop-cluster-master-hdfs-node.h header.

There's a companion script called rmhdfs. It runs weekly, and looks for and lists those HDFS directories which don't have the capability associated with them. You can then consider deleting those directories at your leisure.
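
The nightly job only has to loop over the entitled users and fill in any missing directories. Here's a minimal sketch of that idea - not the real mkhdfs - assuming a hypothetical list-exc-users command which prints one username per line for everyone holding hadoop/exc/user:

# Sketch only; run on the namenode with hdfs privilege. list-exc-users is hypothetical.
for u in $(list-exc-users); do
    if ! hdfs dfs -test -d "/user/${u}"; then
        hdfs dfs -mkdir "/user/${u}"
        hdfs dfs -chown "${u}" "/user/${u}"
    fi
done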

For other clusters, you could either adapt mkhdfs or make an HDFS directory manually. Here's how to do that:

  1. Log in to the namenode with ssh and acquire privileged access to HDFS (see Kerberos and privilege above).
  2. Then make the HDFS home directory:
     hdfs dfs -mkdir /user/${USER}
     hdfs dfs -chown ${USER} /user/${USER}
     exit
     exit
     logout
    

Jobs

How to run a test job

This is how to check that the cluster is working.

  1. If you don't yet have an HDFS directory, make one as described in Making HDFS directories above.
  2. Now ssh to the YARN master node:
     ssh scutter02
  3. Put some files into your HDFS dir. These will act as input for the test job:
     hdfs dfs -put $HADOOP_PREFIX/etc/hadoop input
  4. List your files to check that they got there:
     hdfs dfs -ls input
  5. If you have already run the job and you want to rerun it, remove the output dir which the job makes:
     hdfs dfs -rm -r output
  6. Now submit the test job:
     hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar grep input output 'dfs[a-z.]+'
    You should see lots of messages about the job's progress. The job should finish within a minute or two.
  7. Once it's finished, transfer the job's output from HDFS:
     hdfs dfs -get output
  8. ... and take a look at what the job did:
     cd output
     ls
    You should see two files - an empty file called _SUCCESS and a file called part-r-00000 containing the strings the job matched and their counts. If you don't see _SUCCESS then the job didn't work.
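    To see the matches themselves:
     cat part-r-00000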

What jobs are running?

This command needs privilege (see Kerberos and privilege above). It lists the jobs which are currently running on the cluster.
mapred job -list
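
For example, following the recipe in Kerberos and privilege above with the yarn account and the rm abbreviation (the choice of account here is an assumption rather than something recorded above):

  • ssh scutter02
  • nsu yarn
  • newgrp hadoop
  • export KRB5CCNAME=/tmp/yarn.cc
  • kinit -k -t /etc/hadoop.rm.keytab rm/${HOSTNAME}
  • mapred job -list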

Checking the log files

Hadoop keeps comprehensive and informative log files. They're worth checking when you're doing anything with Hadoop, or when something seems to be wrong, or just to check that things are OK. Here's where to find them:

Component Log directory Host
HDFS namenode /disk/scratch/hdfsdata/hadoop/logs The namenode (the master HDFS host)
HDFS datanode /disk/scratch/hdfsdata/hadoop/logs All the compute nodes
Job History Server /disk/scratch/mapred/logs The job history server host
YARN resource manager /disk/scratch/yarn/logs The resource manager (the master YARN host)
YARN node manager /disk/scratch/yarn/logs All the compute nodes

For hostnames see #Nodes.
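
For example, to watch a node manager's log while a job runs (scutter05 is just an example compute node, and the exact log file names vary, so check the directory listing first):

ssh scutter05
ls -lt /disk/scratch/yarn/logs | head
tail -f /disk/scratch/yarn/logs/*nodemanager*.log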

Nodes

How to list the nodes

There are several configuration files which list the cluster nodes. To find them, first ssh to any cluster node, then go to the Hadoop configuration directory:
cd $HADOOP_CONF_DIR
The nodes are named in these files:
File Contains
masters The cluster's master servers. For a simple cluster this would just be the HDFS namenode and the YARN resource manager.
slaves The slave nodes of the cluster.
exclude Those slaves which are currently excluded from the cluster.
hosts All the nodes (masters + slaves).
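
For example, to compare the configured slaves with the current exclusions:

cd $HADOOP_CONF_DIR
cat slaves
cat exclude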

Which host does HDFS think is the namenode?

hdfs getconf -namenodes
Does YARN know the state of the nodes?
yarn node -list -all

Removing a node from the cluster

Here's how to remove a node from the cluster. You might need to do this if a machine has hardware trouble, or if you want to upgrade its firmware or its software, for example. You can only do this with a slave node - not one of the two master nodes.
  1. Add this to the bottom of the node's LCFG file:
    !hadoop.excluded   mSET(true)
  2. HDFS has to decommission the node (i.e. move its share of the HDFS data to other nodes):
    • Login to the namenode.
    • Acquire hdfs privilege (see above).
    • Tell the namenode to reconsider which nodes it should be using:
      hdfs dfsadmin -refreshNodes
    • Look at the namenode log. Wait for it to announce "Decommissioning complete for node" followed by the node's IP address.
    • You can also check that HDFS reports that the node's Decommission Status has changed from Normal to Decommissioned:
      hdfs dfsadmin -report
      This means that the node's share of the HDFS data has been copied off onto other nodes.
  3. YARN has to decommission the node.
    • This should work from 28 November 2019.
    • Login to the resource manager (the YARN master) node.
    • Acquire privilege over the yarn resource manager.
    • yarn rmadmin -refreshNodes
    • Having done this, a list of the nodes should show your decommissioned host as DECOMMISSIONED:
      yarn node -list -all
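
As a final sanity check, you can confirm the node's state on both sides (scutter07 below is just an example node name):

hdfs dfsadmin -report | grep -A 4 scutter07    # on the namenode, with hdfs privilege
yarn node -list -all | grep scutter07          # on the resource manager, with yarn privilege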

Re-adding an excluded node to the cluster

Remove the hadoop.excluded resource that you added in Removing a node from the cluster, then run through the same procedure as you did there.

Documentation

The manuals for this release are at hadoop.apache.org/docs/r2.9.2/. They're listed down the left hand side of the page. There are a lot of them, but quite a few just document one concept or one optional extension (there are a lot of these too).

systemd services

Hadoop is started and stopped using systemd services (which are configured by LCFG).
Hadoop component systemd service Host
HDFS namenode hadoop-namenode.service The namenode (the master HDFS host)
HDFS datanode hadoop-datanode.service All the compute nodes
YARN resource manager hadoop-resourcemanager.service The resource manager (the master YARN host)
YARN node manager hadoop-nodemanager.service All the compute nodes
Job History Server hadoop-mapred.service The job history server host
These can be queried, started and stopped using systemctl in the usual way. For example:
# systemctl status hadoop-nodemanager
 hadoop-nodemanager.service - The hadoop nodemanager daemon
   Loaded: loaded (/etc/systemd/system/hadoop-nodemanager.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-09-19 10:16:33 BST; 1 weeks 4 days ago
 Main PID: 4573 (java)
   CGroup: /system.slice/hadoop-nodemanager.service
           └─4573 /usr/lib/jvm/java-1.8.0-sun/bin/java -Dproc_nodemanager -Xmx4000m -Dhadoop.log.dir=/disk/scratch/yarn/logs -Dya...
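
Restarting a daemon and checking its recent journal follow the same pattern. For example, on a compute node:
# systemctl restart hadoop-nodemanager
# journalctl -u hadoop-nodemanager --since today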

How to make a new cluster

Mostly, you'll just need to copy the LCFG config which sets up the exc cluster; but there are a few manual steps too. You'll need to make a header, a namenode, a resource manager and a bunch of slave nodes. Once you've made your new cluster, and you've checked that the log files and systemd services look OK, don't forget to run a test job to check that your cluster works.

Hardware resources

Note that the YARN node manager - which runs on each slave, and matches up jobs with hardware resources - automatically determines what hardware resources are available. If you don't give the slaves enough hardware, the node managers will decide that jobs can't run! Even if you're making a little play cluster on VMs, you'll need to give each slave several CPUs. If you don't, not even a wee diddy test job will run. In tests, slaves with 3 VCPUs and 4GB memory were sufficient for the test job.

Make the header

Pick a one-word name for your cluster. These instructions will use the name dana. Make a new header in subversion for your cluster. In our example we'll make live/hadoop-dana-cluster.h.
  1. Check out the live SubversionRepository and cd to the include/live directory.
  2. svn copy hadoop-exc-cluster.h hadoop-dana-cluster.h
  3. Edit it appropriately - perhaps like this:
    #ifndef LIVE_HADOOP_DANA_CLUSTER
    #define LIVE_HADOOP_DANA_CLUSTER
    #define HADOOP_CLUSTER_NAME dana
    #define HADOOP_CLUSTER_HDFS_MASTER dana01.inf.ed.ac.uk
    #define HADOOP_CLUSTER_YARN_MASTER dana02.inf.ed.ac.uk
    #define HADOOP_CLUSTER_KERBEROS
    #endif /* LIVE_HADOOP_DANA_CLUSTER */
  4. Commit it with svn ci -m "Header to configure the dana Hadoop cluster" hadoop-dana-cluster.h

Make a namenode

  1. Add your cluster's header to the profile of the machine that'll be the new cluster's namenode. Our example header is live/hadoop-dana-cluster.h.
  2. Below it, add the node type header, in this case probably dice/options/hadoop-cluster-master-hdfs-node.h
  3. Let LCFG make the machine's new profile and wait for it to reach the machine.
  4. ssh onto the machine.
  5. Acquire hdfs namenode privilege as described above.
  6. hdfs namenode -format
  7. In a separate session, make the namenode's data directory and restart the namenode service:
    $ nsu
    # mkdir /disk/scratch/hdfsdata/hadoop/namenode
    # chown hdfs:hadoop /disk/scratch/hdfsdata/hadoop/namenode
    # systemctl restart hadoop-namenode
  8. Back in the session with hdfs namenode privilege, build the base filesystem in HDFS.
     hdfs dfs -mkdir /user
     hdfs dfs -mkdir /tmp
     hdfs dfs -chmod 1777 /tmp
     hdfs dfs -mkdir /tmp/hadoop-yarn
     hdfs dfs -mkdir /tmp/hadoop-yarn/staging
     hdfs dfs -chmod 1777 /tmp/hadoop-yarn/staging
     hdfs dfs -mkdir /tmp/hadoop-yarn/staging/history
     hdfs dfs -mkdir /tmp/hadoop-yarn/staging/history/done_intermediate
     hdfs dfs -chmod 1777 /tmp/hadoop-yarn/staging/history/done_intermediate
     hdfs dfs -chown -R mapred:hadoop /tmp/hadoop-yarn/staging
     hdfs dfs -ls /
     exit
     exit
    
  9. And to support distcp:
     nsu
     cd /disk/scratch/hdfsdata
     mkdir cache
     chown hdfs:hdfs cache
     chmod go+wt cache
     exit
    
Here are a couple of ways to test that it's successfully up and running:
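Check the hadoop-namenode systemd service and the namenode log (see the systemd services and log file sections above), then, with a Kerberos ticket, ask HDFS itself - it should name this host as the namenode and list the /tmp and /user directories you just made. The log file glob below is an assumption; check the directory for the exact name:

systemctl status hadoop-namenode
tail /disk/scratch/hdfsdata/hadoop/logs/*namenode*.log
hdfs getconf -namenodes
hdfs dfs -ls /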

Make a resource manager

  1. As for the namenode, first add the cluster's LCFG header to the profile of the machine that'll be the resource manager. Our example header is live/hadoop-dana-cluster.h.
  2. Below it, add the node type header, in this case probably dice/options/hadoop-cluster-master-yarn-node.h.
  3. Let LCFG make the machine's new profile and wait for it to reach the machine.
  4. Reboot the machine once or twice.
Here are a couple of ways to test that it's successfully up and running:
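Check the hadoop-resourcemanager systemd service and the resource manager log in /disk/scratch/yarn/logs (see above), and ask YARN to list its nodes - slaves will appear here as you add them:

systemctl status hadoop-resourcemanager
yarn node -list -all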

Make a slave node

The rest of the hosts in the cluster will all be slave nodes. Here's how to make one:
  1. As for the namenode, first add the cluster's LCFG header to the profile of the machine that'll be a slave node. Our example header is live/hadoop-dana-cluster.h.
  2. Below it, add the node type header, in this case probably dice/options/hadoop-cluster-slave-node.h.
  3. Let LCFG make the machine's new profile and wait for it to reach the machine.
  4. Reboot the machine once or twice.
Here are a couple of ways to test that it's successfully up and running:
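Check the hadoop-datanode and hadoop-nodemanager systemd services and their logs (see above). From the masters, the new slave should then show up in the HDFS and YARN node lists:

systemctl status hadoop-datanode hadoop-nodemanager
hdfs dfsadmin -report      # on the namenode, with hdfs privilege
yarn node -list -all       # on the resource manager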
