Wednesday, November 13, 2013

Set up a multi-node Hadoop 2.0 cluster


Installing Hadoop 2.x.x – multi-node cluster configuration
Environment
OS : Debian / BOSS / Ubuntu
Hadoop : Hadoop 2.2.0

Find here common mistakes made while installing Hadoop 1.x.x.

Find here the Hadoop-2.2.0 single-node cluster setup. A multi-node Hadoop cluster can be set up either by 1) doing the single-node cluster setup on every machine and then changing the Hadoop configuration files, or 2) simply following the steps below (except 5.b and 5.c, which are only for the master) on the master and all the slave nodes.

  1. Prerequisites: ( for the master and all the slaves)
    1. Java 6 or above needs to be installed
      Ensure that the JDK is already installed on your machine; otherwise install the JDK.
      Download the jdk1.* archive and extract it.
      root@solaiv[~]#vi /etc/profile
      Add : JAVA_HOME=/usr/local/jdk1.6.0_18
      Append : PATH="...:$JAVA_HOME/bin"
      Add : export JAVA_HOME
      Source /etc/profile so the changes take effect, then check the Java version
      root@solaiv[~]#. /etc/profile (or) source /etc/profile
      root@solaiv[~]# java -version
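      For reference, the complete block appended to /etc/profile looks roughly like this (a minimal sketch; the JDK path assumes the jdk1.6.0_18 location used above):
      # lines appended to /etc/profile
      export JAVA_HOME=/usr/local/jdk1.6.0_18
      export PATH=$PATH:$JAVA_HOME/bin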

    1. Create dedicated user/group for hadoop. (optional)
      Create the group, then create the user and add it to the group.
      root@solaiv[~]#addgroup hadoop
      root@solaiv[~]#adduser --ingroup hadoop hduser
      root@solaiv[~]#su hduser

    1. Passwordless SSH configuration for localhost; later we will do the same for the slaves (optional; if you skip this, you will have to provide a password for each process started by ./start-*.sh)
      Generate an SSH key for the hduser user, then enable passwordless SSH access to your local machine with the newly created key.
      hduser@solaiv[~]#ssh-keygen -t rsa -P ""
      hduser@solaiv[~]#cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys
      hduser@solaiv[~]#ssh localhost

  1. Steps to install Hadoop 2.x.x ( for the master and all the slaves)
    1. Download Hadoop 2.x.x
    2. Extract hadoop-2.2.0 and move it to /opt/hadoop-2.2.0 (a sketch of these commands follows this list)
    3. Add the following lines to the .bashrc file
      hduser@solaiv[~]#cd ~
      hduser@solaiv[~]#vi .bashrc

Copy and paste the following lines at the end of the file
      #copy start here
      export HADOOP_HOME=/opt/hadoop-2.2.0
      export HADOOP_MAPRED_HOME=$HADOOP_HOME 
      export HADOOP_COMMON_HOME=$HADOOP_HOME 
      export HADOOP_HDFS_HOME=$HADOOP_HOME 
      export YARN_HOME=$HADOOP_HOME 
      export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
      export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop 
      #copy end here
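For steps 1 and 2 above, a rough sketch of the download, extract and move commands (the mirror URL is an assumption, use any Apache mirror you prefer, and note that writing to /opt may require root privileges), followed by reloading .bashrc so the new variables take effect:
      hduser@solaiv[~]#wget http://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
      hduser@solaiv[~]#tar -xzf hadoop-2.2.0.tar.gz
      hduser@solaiv[~]#mv hadoop-2.2.0 /opt/hadoop-2.2.0
      hduser@solaiv[~]#source ~/.bashrc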


  1. Modify the Hadoop environment files ( for the master and all the slaves)
    1. Add JAVA_HOME to libexec/hadoop-config.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/libexec/hadoop-config.sh
      ….
      export JAVA_HOME=/usr/local/jdk1.6.0_18
      ….
    2. Add JAVA_HOME to hadoop/hadoop-env.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
      ….
      export JAVA_HOME=/usr/local/jdk1.6.0_18
      ….
    3. Check Hadoop installation
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/bin
      hduser@solaiv[bin]#./hadoop version
      Hadoop 2.2.0
      ..
      At this point, Hadoop is installed on your node.

  1. Create a tmp folder ( for the master and all the slaves)
      hduser@solaiv[~]#mkdir -p $HADOOP_HOME/tmp

  1. Configuration : Multi-node setup
    1. Add the IP addresses of the master and all slaves to /etc/hosts ( for the master and all the slave nodes)
      Add the association between the hostnames and the IP addresses of the master and the slaves to /etc/hosts on all the nodes. Make sure that all the nodes in the cluster are able to ping each other.
      hduser@boss:/opt/hadoop-2.2.0/bin#vi /etc/hosts
      10.184.39.67 master
      10.184.36.134 slave
      In my case there is only one slave; if you have more slave nodes, name them slave1, slave2, etc.
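      To confirm that name resolution works, a quick sanity check from each node (using the hostnames defined above):
      hduser@boss:[~]#ping -c 3 master
      hduser@boss:[~]#ping -c 3 slave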
    2. Passwordless SSH from master to slave (optional, only at the master node)
      hduser@boss:[~]#ssh-keygen -t rsa -P ""
      hduser@boss:[~]#ssh-copy-id -i /home/hduser/.ssh/id_rsa.pub hduser@slave
      hduser@boss:[~]#ssh slave
    [Note : If you skip this step, you will have to provide a password for every slave when the master starts the processes with ./start-*.sh. If you have configured more slaves in /etc/hosts, repeat the second command above for each of them, using hduser@slave1, hduser@slave2, etc. (see the loop sketch below).]
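      If you have several slaves, a small loop saves repeating the command by hand (a sketch assuming the hostnames slave1 and slave2 from /etc/hosts):
      hduser@boss:[~]#for host in slave1 slave2; do ssh-copy-id -i /home/hduser/.ssh/id_rsa.pub hduser@$host; done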
    1. Add the slave entries in $HADOOP_CONF_DIR/slaves ( only at the master node )
      Add all the slave entries to the slaves file on the master node. This tells Hadoop which nodes will run the DataNode and NodeManager daemons. If you don't want the master to act as a DataNode, just omit it from this file.
      hduser@boss:[~]#vi /opt/hadoop-2.2.0/etc/hadoop/slaves
       slave
      Note : In my case there is only one slave; if you have more slave nodes, add each slave hostname on its own line, using the names from /etc/hosts.
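      For example, with two slaves the file would simply contain one hostname per line:
      hduser@boss:[~]#cat /opt/hadoop-2.2.0/etc/hadoop/slaves
      slave1
      slave2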
  2. Hadoop Configuration ( for the master and all the slaves)
    Add the properties below to the following Hadoop configuration files, which are available under $HADOOP_CONF_DIR
    1. core-site.xml
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/etc/hadoop
      hduser@solaiv[hadoop]#vi core-site.xml
    #Paste the following between the <configuration> tags
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.2.0/tmp</value>
      </property>

    1. hdfs-site.xml
      hduser@solaiv[hadoop]#vi hdfs-site.xml
    #Paste the following between the <configuration> tags
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/app/hadoop2/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/app/hadoop2/datanode</value>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>
Note : Here I have only one slave and the master, so I set the replication value to 2. If you have more slaves, set the replication value accordingly.
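Since dfs.namenode.name.dir and dfs.datanode.data.dir point to /app/hadoop2 above, make sure those directories exist and are writable by hduser before the NameNode is formatted; a minimal sketch (run as root, assuming the hduser/hadoop user and group created earlier):
      root@boss[~]#mkdir -p /app/hadoop2/namenode /app/hadoop2/datanode
      root@boss[~]#chown -R hduser:hadoop /app/hadoop2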
    1. mapred-site.xml
      hduser@solaiv[hadoop]#vi mapred-site.xml
    #Paste the following between the <configuration> tags
      <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      </property>
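      Note: the Hadoop 2.2.0 tarball ships only mapred-site.xml.template; if mapred-site.xml does not already exist, create it from the template first:
      hduser@solaiv[hadoop]#cp mapred-site.xml.template mapred-site.xml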

    1. yarn-site.xml
      hduser@solaiv[hadoop]#vi yarn-site.xml
      #Paste the following between the <configuration> tags
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce.shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8025</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8040</value>
      </property>

  1. Format the namenode ( only at Master node )
      hduser@boss:/opt/hadoop-2.2.0/bin#cd /opt/hadoop-2.2.0/bin
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hadoop namenode -format
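      Note: in Hadoop 2.x the "hadoop namenode -format" form is reported as deprecated; the equivalent command through the hdfs script is:
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs namenode -format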

  1. Administering Hadoop - Start & Stop (only at the master node)
    Just start the processes at the master; the slave nodes start up automatically.
    1. start-dfs.sh : to start namenode and datanode
    
    
      hduser@boss:[~]# cd /opt/hadoop-2.2.0/sbin
      hduser@boss:[sbin]# ./start-dfs.sh
    check at Master
      hduser@boss:[sbin]#jps
      17675 Jps
      17578 SecondaryNameNode
      17409 NameNode
    check at Slave
      hduser@boss:[sbin]#jps
      9317 Jps
      9250 DataNode

    1. start-yarn.sh : to start resourcemanager and nodemanager
      hduser@boss:[sbin]# ./start-yarn.sh
    check at Master
      hduser@boss:[sbin]#jps
      17578 SecondaryNameNode
      17917 ResourceManager
      17409 NameNode
      18153 Jps
    check at Slave
      hduser@boss:[sbin]#jps
      9317 Jps
      9250 DataNode
      9357 NodeManager
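    Besides jps, you can confirm from the master that the slave has actually joined the cluster, either with the dfsadmin report or through the web UIs (NameNode at master:50070, ResourceManager at master:8088):
      hduser@boss:[sbin]# ../bin/hdfs dfsadmin -report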

  1. Working on Hadoop multi-node environment
    1. Execute these commands at the master
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -mkdir -p /user/hadoop2
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -put /root/Desktop/test.html /user/hadoop2
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -ls
      Found 1 items
      -rw-r--r-- 2 root supergroup 225 2013-11-11 20:19 /user/hadoop2/test.html
    2. check at slave node

      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -ls /user/hadoop2/
      Found 1 items
      -rw-r--r-- 2 root supergroup 225 2013-11-11 20:19 /user/hadoop2/test.html
      hduser@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -cat /user/hadoop2/test.html
      test file. Welcome to Hadoop2.2.0 Installation. !!!!!!!!!!!
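    As a quick end-to-end check that YARN is working as well, you can run one of the bundled example jobs, e.g. the pi estimator from the examples jar (the same jar used in the comments below):
      hduser@boss:/opt/hadoop-2.2.0# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5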


36 comments:

Karthic said...

Thanks Solai, for the step by step instructions.

Anonymous said...

Awesome post solai.. Please post all the administration activities and how to put large files in hadoop and how it is getting balanced.

Anonymous said...

awesome Karthic ! You helped me a lot and it worked for hadoop 2.3 too.

There were some minor issues; I googled and could fix them. I am posting them here for the benefit of others.


#1 - It is related to yarn-site.xml

Here is my new yarn-site.xml:



<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8025</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8040</value>
</property>



#3 - Add this to yarn-env.sh and hadoop-env.sh

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.library.path=$HADOOP_HOME/lib"

solaimurugan.v said...

thanks to all for your comments

Chandran Narasimhan said...

Thanks Solai. I meant "awesome solai!" in my previous comment.

Btw, for the password less SSH setup here are the directory permissions:

chmod 750 ~
chmod 700 ~/.ssh
chmod 600 ~/.ssh/*

Where ~ is your hadoop user (hduser) home.

We had the home directory permission set to 777 and spent hours debugging that issue.

Chandran Narasimhan said...

Instructions to start the mapreduce history server http://canjur.blogspot.com/2014/04/configure-and-run-hadoop-2-mapreduce.html

vishal said...

Hello,
I have set up a cluster, but my DataNode processes are not listed by the jps command. However, hdfs dfsadmin -report shows me all my datanodes.

I have also tried deleting the datanode directory and reformatting HDFS; it doesn't work.

Do you have any solution to this?

solaimurugan.v said...

Hi Vishal,
In your case, the DataNode did not start gracefully. Kindly check your DataNode logs.

vishal said...

Hey, thanks for the reply.
This is my DataNode log:

java.net.BindException: Problem binding to [0.0.0.0:50010] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:


Looks like I am not able to bind the port. I looked up the link; it says someone else might be listening on the port. However, no other process is using it. When I run start-dfs.sh, the Java processes start using the ports, and when I run stop-dfs.sh, the Java processes are still listening on the ports.

vishal said...

Also, I tried changing dfs.datanode.address 0.0.0.0:50010, dfs.datanode.http.address 0.0.0.0:50075 and dfs.datanode.ipc.address 0.0.0.0:50020 to 50011, 50012 and 50013 respectively. The new Java processes bind to the new ports, but I can't see the DataNode process in the jps command.

solaimurugan.v said...

Sorry for the late reply. Try this in core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://0.0.0.0:9000</value>
</property>

vishal said...

Hey, thank you for the reply.
Well, I tried changing core-site.xml as you said, with no success. I can also run MapReduce, but I can't see the DataNode and NodeManager processes.

vishal said...

Hey, it looks like start-dfs.sh and stop-dfs.sh are useless here. I wrote my own script to do it using hadoop-daemon.sh. Works like a charm!!!

Thanks a lot for the help.

J Shantz said...

Thanks for the instructions. I had one issue with the setup running on AWS. After originally getting it running, I was getting a "Connection Refused" error between the slave and master and the slave was not doing anything. I ultimately determined the cause of this to be in my /etc/hosts file. On the master, make sure the "master" entry in the hosts file is the local IP address and not 127.0.0.1. Otherwise the yarn service is only available locally and the slave cannot connect to it.

solaimurugan.v said...

Hi Shantz,

*) Issue the "ping slave" command from the master. If it works well, then check the hostname of the slave machine by issuing the "hostname" command.

*) Now start only the NodeManager on the DataNode/slave.

*) If you are still getting the error, share your NodeManager log from the slave machine.

adil usman said...

Hi, I have followed the tutorial and completed the cluster setup exactly as described. Now I am trying to run the pi example, but it is stuck at the point shown below. Can you guide me on what I am missing here?



hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
Number of Maps = 2
Samples per Map = 5
14/08/03 06:35:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
14/08/03 06:35:31 INFO client.RMProxy: Connecting to ResourceManager at VM-002/10.0.1.2:8040
14/08/03 06:35:32 INFO input.FileInputFormat: Total input paths to process : 2
14/08/03 06:35:32 INFO mapreduce.JobSubmitter: number of splits:2
14/08/03 06:35:32 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/08/03 06:35:32 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/08/03 06:35:32 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/08/03 06:35:32 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/08/03 06:35:32 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/08/03 06:35:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1407043822091_0001
14/08/03 06:35:33 INFO impl.YarnClientImpl: Submitted application application_1407043822091_0001 to ResourceManager at VM-002/10.0.1.2:8040
14/08/03 06:35:33 INFO mapreduce.Job: The url to track the job: http://VM-002:8088/proxy/application_1407043822091_0001/
14/08/03 06:35:33 INFO mapreduce.Job: Running job: job_1407043822091_0001

sureddy kesav said...

I want to know how to run the wordcount program.

solaimurugan v said...


try this

$HADOOP_PREFIX/bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /INPUT_PATH_LOCAL /OUTPUT_DIR_HDFS


note : the name hadoop-mapreduce-examples-2.5.1.jar may vary based on the version you have installed.

sureddy kesav said...

I'm using Hadoop 2.2.0.
I just followed the steps you published.

solaimurugan v said...

This will work. Run the command below in a terminal:

$HADOOP_HOME/bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /INPUT_PATH_LOCAL /OUTPUT_DIR_HDFS

sureddy kesav said...


hduser@ubuntu:/opt/hadoop-2.2.0$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/hadoop2/input /user/hadoop2/output

15/03/06 21:30:54 INFO mapreduce.Job: Job job_1425635780130_0009 running in uber mode : false
15/03/06 21:30:54 INFO mapreduce.Job: map 0% reduce 0%
15/03/06 21:30:54 INFO mapreduce.Job: Job job_1425635780130_0009 failed with state FAILED due to: Application application_1425635780130_0009 failed 2 times due to Error launching appattempt_1425635780130_0009_000002. Got exception: java.net.ConnectException: Call From ubuntu/127.0.1.1 to ubuntu:50507 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:
at sun.reflect.GeneratedConstructorAccessor47.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy22.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
at org.apache.hadoop.ipc.Client.call(Client.java:1318)
... 9 more
. Failing the application.
15/03/06 21:30:54 INFO mapreduce.Job: Counters: 0

solaimurugan v said...

1) make sure all the daemons are up and running

2) kindly post your /etc/hosts files on both NN and DN

sureddy kesav said...


All started on master and slaves

sureddy kesav said...

127.0.0.1 localhost
127.0.1.1 ubuntu
172.168.2.178 master
172.168.0.142 slave1
172.168.2.36 slave2
172.168.1.71 slave3
172.168.1.103 slave4

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

solai said...

Remove/comment out the first two lines from /etc/hosts:

127.0.0.1 localhost
127.0.1.1 ubuntu
Once done, restart the namenode daemon by executing the commands below. (To know more, refer to

http://solaimurugan.blogspot.in/2014/05/step-by-step-instruction-how-startstop.html?m=1
)

sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh stop namenode

As well as the resource manager, by:

sbin/yarn-deamon.sh stop resourcemanager

sbun/yarn-deamon.sh start resourcemanager


Then execute the MR wordcount.

sureddy kesav said...

sbin/yarn-deamon.sh stop resourcemanager

sbun/yarn-deamon.sh start resourcemanager
these are not working
And
127.0.0.1 localhost
127.0.1.1 ubuntu

After removing these on master and slaves resourcemanager, nodemanager, datanodes

solai said...

Sorry, typo error. It's:

sbin/yarn-daemon.sh stop resourcemanager

sbin/yarn-daemon.sh start resourcemanager


sureddy kesav said...

sbin/yarn-daemon.sh stop resourcemanager

sbin/yarn-daemon.sh start resourcemanager


After doing this, the resourcemanager is not starting.

solai said...

Send me the log file.

sureddy kesav said...

It is not accepting my log file

solai said...

Boss, copy and paste the last 50 lines, or from wherever you find the error, at the bottom of the file.

Haddad Riadh said...

Hello, thanks for this post. Please, how can I access the Hadoop multi-node cluster remotely, and how can I configure HTTP access on the nodes and the master?
Thanks

solai said...

If you set up multi-node, then you can access the NAME NODE at "masterIP:50070" from any node, even another system which doesn't have the Hadoop distro. Likewise, you can access the RESOURCE MANAGER at "masterIP:8088".