Wednesday, November 13, 2013

Installing Hadoop 2.x.x – Single-node cluster configuration

Environment
OS : Debian / BOSS / Ubuntu
Hadoop : Hadoop 2.2.0

Find here the common mistakes made while installing Hadoop 1.x.x.

  1. Prerequisites:
    1. Java 6 or above needs to be installed
      Ensure that a JDK is already installed on your machine. If not, download the JDK tarball and extract it (here to /opt/jdk1.6.0_18).
      root@solaiv[~]#vi /etc/profile
      Add : JAVA_HOME=/opt/jdk1.6.0_18
      Append : PATH="...:$JAVA_HOME/bin"
      Add : export JAVA_HOME
      Source /etc/profile so the changes take effect, then check the Java version
      root@solaiv[~]#. /etc/profile (or) source /etc/profile
      root@solaiv[~]# java -version
    1. Create a dedicated user/group for Hadoop (optional)
      Create the group, then create the user inside that group.
      root@solaiv[~]#addgroup hadoop
      root@solaiv[~]#adduser --ingroup hadoop hduser
      root@solaiv[~]#su hduser
    1. Passwordless SSH configuration for localhost; later we will do the same for the slave (optional: if skipped, you have to provide a password for each process started by ./start-*.sh)
      Generate an SSH key for the hduser user, then enable passwordless SSH access to your local machine with the newly created key.
      hduser@solaiv[~]#ssh-keygen -t rsa -P ""
      hduser@solaiv[~]#cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys
      hduser@solaiv[~]#ssh localhost
      If ssh localhost still prompts for a password, see the permissions fix below.
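      A common culprit when ssh localhost keeps asking for a password is overly permissive .ssh permissions, which sshd silently rejects. A minimal fix, assuming the hduser home directory used above:
      hduser@solaiv[~]#chmod 700 /home/hduser/.ssh
      hduser@solaiv[~]#chmod 600 /home/hduser/.ssh/authorized_keys
      hduser@solaiv[~]#ssh localhost   (should now log in without a password)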
  1. Steps to install Hadoop 2.x.x
    1. Download Hadoop 2.x.x
    2. Extract the hadoop-2.2.0 tarball and move it to /opt/hadoop-2.2.0
    3. Add the following lines to the .bashrc file
      hduser@solaiv[~]#cd ~
      hduser@solaiv[~]#vi .bashrc
                   Copy and paste the following lines at the end of the file
      #copy start here
      export HADOOP_HOME=/opt/hadoop-2.2.0
      export HADOOP_MAPRED_HOME=$HADOOP_HOME 
      export HADOOP_COMMON_HOME=$HADOOP_HOME 
      export HADOOP_HDFS_HOME=$HADOOP_HOME 
      export YARN_HOME=$HADOOP_HOME 
      export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
      export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop 
      #copy end here
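      Source .bashrc so the new variables take effect in the current shell, then verify; a quick sanity check, assuming the paths above:
      hduser@solaiv[~]#source ~/.bashrc
      hduser@solaiv[~]#echo $HADOOP_HOME
      /opt/hadoop-2.2.0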
  1. Modify the Hadoop environment files
    1. Add JAVA_HOME to libexec/hadoop-config.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/libexec/hadoop-config.sh
      ….
      export JAVA_HOME=/opt/jdk1.6.0_18
      ….
    2. Add JAVA_HOME to etc/hadoop/hadoop-env.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
      ….
      export JAVA_HOME=/opt/jdk1.6.0_18
      ….
    3. Check Hadoop installation
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/bin
      hduser@solaiv[bin]#./hadoop version
      Hadoop 2.2.0
      ..
      At this point, Hadoop is installed on your node.
  1. Create folders for tmp, namenode and datanode
      hduser@solaiv[~]#mkdir -p $HADOOP_HOME/tmp
      The namenode and datanode folders referenced by hdfs-site.xml below must exist too; see the sketch that follows.
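      Since /app is normally root-owned, a minimal sketch, assuming the optional hduser/hadoop account created earlier, is to create the directories as root and hand ownership to hduser:
      root@solaiv[~]#mkdir -p /app/hadoop2/namenode /app/hadoop2/datanode
      root@solaiv[~]#chown -R hduser:hadoop /app/hadoop2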

  1. Hadoop Configuration
    Add the properties below to the following Hadoop configuration files, which are available under $HADOOP_CONF_DIR
    1. core-site.xml
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/etc/hadoop
      hduser@solaiv[hadoop]#vi core-site.xml
            #Paste the following between the <configuration> tags
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/opt/hadoop-2.2.0/tmp</value>
        </property>
    1. hdfs-site.xml
      hduser@solaiv[hadoop]#vi hdfs-site.xml
             #Paste the following between the <configuration> tags
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/app/hadoop2/namenode</value>
        </property>
        <property>
          <name>dfs.datanode.data.dir</name>
          <value>file:/app/hadoop2/datanode</value>
        </property>
        <property>
          <name>dfs.permissions</name>
          <value>false</value>
        </property>
    1. mapred-site.xml
      Hadoop 2.2.0 ships only mapred-site.xml.template, so copy it first:
      hduser@solaiv[hadoop]#cp mapred-site.xml.template mapred-site.xml
      hduser@solaiv[hadoop]#vi mapred-site.xml
            #Paste the following between the <configuration> tags
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
    1. yarn-site.xml
      hduser@solaiv[hadoop]#vi yarn-site.xml
      #Paste the following between the <configuration> tags
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
        <property>
          <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>localhost:8025</value>
        </property>
        <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>localhost:8030</value>
        </property>
        <property>
          <name>yarn.resourcemanager.address</name>
          <value>localhost:8040</value>
        </property>
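    Before formatting the namenode, it is worth checking that the XML files parse and the keys are picked up; a quick sanity check using the stock hdfs getconf tool, querying the key set in core-site.xml above:
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/bin
      hduser@solaiv[bin]#./hdfs getconf -confKey fs.default.name
      hdfs://localhost:9000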

  1. Format the namenode
      root@boss:~# cd /opt/hadoop-2.2.0/bin
      root@boss:/opt/hadoop-2.2.0/bin# ./hadoop namenode -format
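      In 2.x the hadoop namenode command still works but prints a deprecation warning; the preferred equivalent is:
      root@boss:/opt/hadoop-2.2.0/bin# ./hdfs namenode -format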

  2. Start Hadoop services
      root@boss:/opt/hadoop-2.2.0/bin# cd /opt/hadoop-2.2.0/sbin/
      root@boss:/opt/hadoop-2.2.0/sbin# ./start-dfs.sh
      root@boss:/opt/hadoop-2.2.0/sbin# jps
      21422 Jps
      21154 DataNode
      21070 NameNode
      21322 SecondaryNameNode
      root@boss:/opt/hadoop-2.2.0/sbin# ./start-yarn.sh
      root@boss:/opt/hadoop-2.2.0/sbin# jps
      21563 NodeManager
      21888 Jps
      21154 DataNode
      21070 NameNode
      21322 SecondaryNameNode
      21475 ResourceManager 
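      With all six daemons running, a short smoke test confirms the cluster works end to end; a hedged sketch, assuming the stock 2.2.0 layout (the examples jar name and web UI ports are the defaults):
      root@boss:/opt/hadoop-2.2.0/sbin# cd ../bin
      root@boss:/opt/hadoop-2.2.0/bin# ./hdfs dfs -mkdir -p /user/hduser
      root@boss:/opt/hadoop-2.2.0/bin# ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
      The NameNode web UI is at http://localhost:50070 and the ResourceManager UI at http://localhost:8088. To stop the cluster, run ./stop-yarn.sh and ./stop-dfs.sh from the sbin directory.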
