Wednesday, November 13, 2013

Installing Hadoop 2.x.x – Single-node cluster configuration

Environment
OS : Debian / BOSS / Ubuntu
Hadoop : Hadoop 2.2.0

Find here the common mistakes made while installing Hadoop 1.x.x.

  1. Prerequisites:
    1. Java 6 or above needs to be installed
      Ensure that a JDK is already installed on your machine; otherwise install one.
      Download the jdk1.* archive and extract it.
      root@solaiv[~]#vi /etc/profile
      Add : JAVA_HOME=/opt/jdk1.6.0_18
      Append : PATH="...:$JAVA_HOME/bin"
      Add : export JAVA_HOME
      Source /etc/profile to reflect the changes, then check the Java version
      root@solaiv[~]#. /etc/profile (or) source /etc/profile
      root@solaiv[~]# java -version
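      For reference, here is a minimal sketch of what the /etc/profile additions could look like, assuming the JDK was extracted to /opt/jdk1.6.0_18 (adjust the path to your machine):
      # --- Java environment, appended to /etc/profile ---
      JAVA_HOME=/opt/jdk1.6.0_18
      PATH="$PATH:$JAVA_HOME/bin"
      export JAVA_HOME PATH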
    2. Create a dedicated user/group for Hadoop. (optional)
      Create the group, then create the user inside that group and switch to it.
      root@solaiv[~]#addgroup hadoop
      root@solaiv[~]#adduser --ingroup hadoop hduser
      root@solaiv[~]#su hduser
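      As a quick sanity check (not part of the original steps), id hduser should list hadoop among its groups:
      hduser@solaiv[~]#id hduser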
    3. Passwordless SSH configuration for localhost; later we will do the same for the slave nodes. (optional: without it, you will have to enter a password for each process started by ./start-*.sh)
      Generate an SSH key for the hduser user, then enable passwordless SSH access to your local machine with this newly created key.
      hduser@solaiv[~]#ssh-keygen -t rsa -P ""
      hduser@solaiv[~]#cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys
      hduser@solaiv[~]#ssh localhost
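      If ssh localhost still prompts for a password, the usual cause is overly permissive file modes on the .ssh directory; a commonly needed fix (an extra step, not in the original post):
      hduser@solaiv[~]#chmod 700 /home/hduser/.ssh
      hduser@solaiv[~]#chmod 600 /home/hduser/.ssh/authorized_keys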
  2. Steps to install Hadoop 2.x.x
    1. Download Hadoop 2.x.x
    2. Extract hadoop-2.2.0 and move it to /opt/hadoop-2.2.0
    3. Add the following lines to the .bashrc file
      hduser@solaiv[~]#cd ~
      hduser@solaiv[~]#vi .bashrc
                   copy and paste the following lines at the end of the file
      #copy start here
      export HADOOP_HOME=/opt/hadoop-2.2.0
      export HADOOP_MAPRED_HOME=$HADOOP_HOME 
      export HADOOP_COMMON_HOME=$HADOOP_HOME 
      export HADOOP_HDFS_HOME=$HADOOP_HOME 
      export YARN_HOME=$HADOOP_HOME 
      export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
      export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop 
      #copy end here
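      Reload .bashrc so the new variables take effect in the current shell, and verify:
      hduser@solaiv[~]#source ~/.bashrc
      hduser@solaiv[~]#echo $HADOOP_HOME
      /opt/hadoop-2.2.0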
  3. Modify the Hadoop environment files
    1. Add JAVA_HOME to libexec/hadoop-config.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/libexec/hadoop-config.sh
      ….
      export JAVA_HOME=/opt/jdk1.6.0_18
      ….
    2. Add JAVA_HOME to hadoop/hadoop-env.sh at the beginning of the file
      hduser@solaiv[~]#vi /opt/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
      ….
      export JAVA_HOME=/opt/jdk1.6.0_18
      ….
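      To confirm that both files picked up the setting, an optional quick check:
      hduser@solaiv[~]#grep "export JAVA_HOME" /opt/hadoop-2.2.0/libexec/hadoop-config.sh /opt/hadoop-2.2.0/etc/hadoop/hadoop-env.sh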
    3. Check Hadoop installation
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/bin
      hduser@solaiv[bin]#./hadoop version
      Hadoop 2.2.0
      ..
      At this point, Hadoop is installed on your node.
  4. Create folders for tmp, namenode and datanode
      hduser@solaiv[~]#mkdir -p $HADOOP_HOME/tmp
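      The namenode and datanode folders referenced in hdfs-site.xml below can be created the same way. A sketch, using the /app/hadoop2 paths from that file; the chown is an assumption so that the hduser account can write to them:
      root@solaiv[~]#mkdir -p /app/hadoop2/namenode /app/hadoop2/datanode
      root@solaiv[~]#chown -R hduser:hadoop /app/hadoop2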

  5. Hadoop Configuration
    Add the properties to the following Hadoop configuration files, which are available under $HADOOP_CONF_DIR
    1. core-site.xml
      hduser@solaiv[~]#cd /opt/hadoop-2.2.0/etc/hadoop
      hduser@solaiv[hadoop]#vi core-site.xml
            #Paste the following between the <configuration> tags
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/opt/hadoop-2.2.0/tmp</value>
        </property>
    2. hdfs-site.xml
      hduser@solaiv[hadoop]#vi hdfs-site.xml
             #Paste the following between the <configuration> tags
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/app/hadoop2/namenode</value>
        </property>
        <property>
          <name>dfs.datanode.data.dir</name>
          <value>file:/app/hadoop2/datanode</value>
        </property>
        <property>
          <name>dfs.permissions</name>
          <value>false</value>
        </property>
    3. mapred-site.xml
      If mapred-site.xml does not exist yet, it can be copied from the template that ships with Hadoop 2.2.0:
      hduser@solaiv[hadoop]#cp mapred-site.xml.template mapred-site.xml
      hduser@solaiv[hadoop]#vi mapred-site.xml
            #Paste the following between the <configuration> tags
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
    4. yarn-site.xml
      hduser@solaiv[hadoop]#vi yarn-site.xml
      #Paste the following between the <configuration> tags
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
        <property>
          <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>localhost:8025</value>
        </property>
        <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>localhost:8030</value>
        </property>
        <property>
          <name>yarn.resourcemanager.address</name>
          <value>localhost:8040</value>
        </property>
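    Once the four files are edited, the effective values can be sanity-checked with hdfs getconf, which only reads the configuration (an optional step, not in the original post):
      hduser@solaiv[hadoop]#cd /opt/hadoop-2.2.0/bin
      hduser@solaiv[bin]#./hdfs getconf -confKey dfs.replication
      1
      hduser@solaiv[bin]#./hdfs getconf -confKey hadoop.tmp.dir
      /opt/hadoop-2.2.0/tmp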

  6. Format the namenode
      root@boss:/opt/hadoop-2.2.0/bin#cd /opt/hadoop-2.2.0/bin
      root@boss:/opt/hadoop-2.2.0/bin# ./hadoop namenode -format
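      After a successful format, the configured namenode folder should contain a freshly created "current" directory (a quick check, assuming the /app/hadoop2/namenode path from hdfs-site.xml):
      root@boss:/opt/hadoop-2.2.0/bin# ls /app/hadoop2/namenode
      current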

  7. Start Hadoop services
      root@boss:/opt/hadoop-2.2.0/bin# cd /opt/hadoop-2.2.0/sbin/
      root@boss:/opt/hadoop-2.2.0/sbin# ./start-dfs.sh
      root@boss:/opt/hadoop-2.2.0/sbin# jps
      21422 Jps
      21154 DataNode
      21070 NameNode
      21322 SecondaryNameNode
      root@boss:/opt/hadoop-2.2.0/sbin# ./start-yarn.sh
      root@boss:/opt/hadoop-2.2.0/sbin# jps
      21563 NodeManager
      21888 Jps
      21154 DataNode
      21070 NameNode
      21322 SecondaryNameNode
      21475 ResourceManager 
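      With all five daemons running, the NameNode web UI is available at http://localhost:50070 and the ResourceManager UI at http://localhost:8088 (the default ports, which this configuration does not override). As a quick smoke test, one of the bundled example jobs can be run; the jar path below is the standard location inside the 2.2.0 tarball:
      root@boss:/opt/hadoop-2.2.0/sbin# cd /opt/hadoop-2.2.0
      root@boss:/opt/hadoop-2.2.0# ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
      To stop the cluster later, run ./sbin/stop-yarn.sh followed by ./sbin/stop-dfs.sh from the same directory.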