Friday, October 25, 2013

Replication in MySQL : From slave : unable to connect Master 10.184.0.10

Recently one of my colleagues was setting up MySQL replication through phpMyAdmin. When connecting to the Master from the Slave, they got an error like "Unable to connect Master : 10.84.1.20".


Slave : Unable to connect Master : 10.84.1.20


SOLUTION
  I went through the my.cnf files on both the Master and the Slave. On both, bind-address was pointing to localhost. Changing bind-address from localhost (127.0.0.1) to 0.0.0.0 solved the issue.

from
 bind-address = 127.0.0.1
to
 bind-address = 0.0.0.0
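
After editing my.cnf on both servers, MySQL has to be restarted for the new bind-address to take effect. A minimal sketch, assuming a Debian/Ubuntu-style layout where the config lives in /etc/mysql/my.cnf and the service is named mysql:

root@master[~]# sed -i 's/^bind-address.*/bind-address = 0.0.0.0/' /etc/mysql/my.cnf
root@master[~]# service mysql restart

Note that 0.0.0.0 makes MySQL listen on all interfaces; binding to the server's own LAN IP instead is a tighter option if only the Slave needs to reach it.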





Wednesday, October 23, 2013

Hadoop multi-node cluster configuration : Error & Solution - part-II

        This post is a continuation of part-I; look here for the part-I errors & solutions on Hadoop multi-node cluster setup.

ERROR 11)

         Hadoop Master cannot start a Slave that has a different HADOOP_PATH (or) a Slave with a different HADOOP_PATH fails to be started by the Hadoop Master node.

This issue also covers the case where the Master and Slave nodes run different operating systems (OS).
For example, the Master runs on Ubuntu and a Slave runs on a different OS, say Windows. In that case the Windows Slave does not have the same HADOOP_PATH as the Master running on Ubuntu.

 In my case
    Master node HADOOP_PATH : /opt/hadoop-1.1.0
    Slave Node HADOOP_PATH : /opt/softwares/hadoop-1.1.0
While starting the Master node with,
    hduser@solaiv[bin]$start-dfs.sh
it threw the errors
    slave: bash: line 0: cd: /opt/hadoop-1.1.0/libexec/..: No such file or directory
    slave: bash: /opt/hadoop-1.1.0/bin/hadoop-daemon.sh: No such file or directory

SOLUTION 11)
         In this case, the Master looks for the same path on the Slave as on itself, but the Slave has a different path. Create the same user on both the Master and Slave nodes, create the same path (directory) on the Slave as on the Master, and make it a symbolic link to the actual path.
This issue can be solved in two ways.

1) Create the same path on the Slave and make it a symbolic link to the actual path, as shown below. (OR)
2) Start the Slave daemons manually on the Slave node instead of relying on start-dfs.sh from the Master:
    root@solaiv[bin]#./hadoop-daemon.sh start datanode
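
For example, on the Slave the path the Master expects can be mapped onto the real installation with a symbolic link (the paths below are the ones from this setup; adjust them to yours):

root@slave[~]# ln -s /opt/softwares/hadoop-1.1.0 /opt/hadoop-1.1.0

Now when the Master invokes /opt/hadoop-1.1.0/bin/hadoop-daemon.sh over SSH, it resolves to the Slave's actual HADOOP_PATH.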



ERROR 12)
        Hadoop Master cannot start a Slave that runs under a different user (or) a Slave with a different user fails to be started by the Hadoop Master node.
        In another scenario, I tried setting up a Hadoop multi-node cluster with different users for the Master and Slaves.
        For example, the Master running on Ubuntu with hduser, one Slave running on Debian with hduser, and another Slave running on BOSS 5.0 with the root user.
 In my case
    Master : hduser
    Slave1 : hduser
    Slave2 : root
While starting the Master node with,
    hduser@solaiv[bin]$start-dfs.sh
both Master and Slave1 start successfully, but it prompts for the Slave2 password (even though passwordless SSH is configured).

It asks for the password of hduser@slave2 instead of root@slave2.

SOLUTION 12)

Start the Slave2 daemons manually on that node instead of relying on start-dfs.sh from the Master:
    root@solaiv[bin]#./hadoop-daemon.sh start datanode

For both ERROR 11 & 12)
Note : the Hadoop framework does not strictly require SSH; the DataNode and TaskTracker daemons can be started manually on each node. So each Slave need not be configured with the same path or user. Simply ignore the error and start the Slave daemons manually instead of using start-dfs.sh and start-mapred.sh from the Master. However, make sure to include all the Slaves (IP/DNS) in the Master's conf/slaves file.
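
A minimal sketch of the manual start, assuming Hadoop 1.x where hadoop-daemon.sh lives under each node's own HADOOP_PATH/bin (run on each Slave as whichever user owns that installation):

root@slave2[bin]#./hadoop-daemon.sh start datanode
root@slave2[bin]#./hadoop-daemon.sh start tasktracker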
                 

Friday, October 18, 2013

Hadoop multi-node cluster configuration : Error & Solution



Here I've listed a few errors, with solutions, that I hit while configuring a multi-node Hadoop cluster.

Environment
OS : Debian 7 (wheezy)  / BOSS 5.0 / Ubuntu 12.0
Hadoop Version : 1.1.0
ERROR 1)
hduser is not in the sudoers file. This incident will be reported.
hduser@solaiv[softwares]$sudo vi /etc/profile 
[sudo] password for hduser: 
hduser is not in the sudoers file.  This incident will be reported.  
SOLUTION 1)
In a root user terminal:
root@solaiv[~]#echo 'hduser ALL=(ALL) ALL' >> /etc/sudoers
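
An alternative, assuming a Debian/Ubuntu-style system where the sudo group is already listed in /etc/sudoers, is to add hduser to that group instead of editing the file directly:

root@solaiv[~]#adduser hduser sudo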

ERROR 2) 
jps command not found

hduser@solaiv[softwares]$sudo jps
jps command not found
SOLUTION 2)
jps is available under the JAVA_HOME/bin directory.
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ alias jps='/opt/softwares/jdk1.6.0_18/bin/jps'
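
To make jps available in every new shell, the JDK bin directory can be appended to hduser's PATH in ~/.bashrc (the JDK path below is the one from this setup):

hduser@solaiv[~]$echo 'export PATH=$PATH:/opt/softwares/jdk1.6.0_18/bin' >> ~/.bashrc
hduser@solaiv[~]$source ~/.bashrc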

ERROR 3)
Once I had finished configuring Hadoop, I checked the running process status with the jps command. Only NameNode, SecondaryNameNode and DataNode were running, not JobTracker and TaskTracker. I then opened the JobTracker log file; the error was

JOB TRACKER : Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 127.0.0.1:9001

ERROR 4)
In the TaskTracker log file, the error was

TASK TRACKER : FATAL org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 127.0.0.1:9001


SOLUTION 3 & 4)
I was confident that I had made a mistake in mapred-site.xml; after going through it line by line I spotted a silly mistake: a space between the value and the XML closing tag.
mapred-site.xml file
<property> <name>mapred.job.tracker</name><value>127.0.0.1:9001 </value> </property>
 
# space between 9001 and closing value tag
I just modified the mapred-site.xml file to:

<property><name>mapred.job.tracker</name><value>127.0.0.1:9001</value> </property>
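
For reference, the complete property block inside mapred-site.xml's <configuration> element then looks like this (in a real multi-node setup the value would normally be the Master's hostname rather than 127.0.0.1):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>127.0.0.1:9001</value>
  </property>
</configuration>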

ERROR 5)

Cannot lock storage /app/hadoop/tmp/dfs/name. The directory is already locked.
SOLUTION 5)
A)
Check your dfs.name.dir and dfs.data.dir paths in hdfs-site.xml (an example is shown after option C below).
or B)
Stop all the running Hadoop processes and start them again:
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ ./stop-all.sh
..
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ ./start-all.sh
or C)
If Error 5) still exists even after the above steps, stop all the processes, format the NameNode and start again.

hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ ./stop-all.sh
..
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ ./hadoop namenode -format
...
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ ./start-all.sh
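
For option A), the relevant hdfs-site.xml properties look something like this (the paths are illustrative and must point to directories that exist and are writable by the Hadoop user):

<property>
  <name>dfs.name.dir</name>
  <value>/app/hadoop/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/app/hadoop/tmp/dfs/data</value>
</property>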

ERROR 6) 

ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed. java.io.FileNotFoundException: /app/hadoop/tmp/dfs/name/in_use.lock (Permission denied) NameNode not started
SOLUTION 6)
Change the owner to hduser:hadoop and give full permissions to dfs.data.dir, i.e. /app/hadoop/tmp

hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ sudo chown hduser:hadoop -R /app/hadoop/tmp
hduser@solaiv:/opt/softwares/hadoop-1.1.0/bin$ sudo chmod 777 -R /app/hadoop/tmp

ERROR 7)
jps showed only the NameNode and SecondaryNameNode. Then I started the DataNode with

hduser@solaiv[bin]$./hadoop datanode
ERROR datanode.DataNode: All directories in dfs.data.dir are invalid.
SOLUTION 7)
First delete all contents of the temporary folder:


rm -rf /app/hadoop/tmp/*
Make sure that the directory /app/hadoop/tmp/ has the right owner and permissions:
hduser@solaiv[bin]$sudo chown hduser:hadoop -R /app/hadoop/tmp/
hduser@solaiv[bin]$sudo chmod 777 -R /app/hadoop/tmp/
format the namenode:

hduser@solaiv[bin]$./hadoop namenode -format
Start all processes again: 

hduser@solaiv[bin]$./start-all.sh

ERROR 8) 

hduser@solaiv[bin]$./hadoop fs -mkdir solaiv
mkdir: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/hduser/solaiv. Name node is in safe mode.
SOLUTION 8)
hduser@solaiv[bin]$./hadoop dfsadmin -safemode leave
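
The NameNode normally leaves safe mode on its own once enough blocks have been reported by the DataNodes; before forcing it off as above, the current state can be checked with:

hduser@solaiv[bin]$./hadoop dfsadmin -safemode get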

ERROR 9)

...............
...............
13/10/08 17:15:24 ERROR datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 1580775695; datanode namespaceID = 1494801914
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:397)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:307)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1644)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1583)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1601)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1727)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1744)
13/10/08 17:15:24 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG:
Shutting down DataNode at boss/127.0.0.1
************************************************************/
SOLUTION 9) 
The NameNode generates a new namespaceID every time HDFS is formatted, so the DataNode's stored namespaceID no longer matches it.

root@boss[bin]#vi /app/hadoop/tmp/dfs/data/current/VERSION
Manually update the namespaceID from 1494801914 to 1580775695 (the NameNode's current namespaceID from the error message) and save the file:

#Mon Oct 14 14:52:19 IST 2013
namespaceID=1580775695
storageID=DS-469635027-127.0.0.1-50010-1376974011263
cTime= 1377582559943
storageType=DATA_NODE
layoutVersion=-32
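
The same edit can also be done in one line with GNU sed (the IDs below are the ones from this error; substitute your own):

root@boss[bin]#sed -i 's/namespaceID=1494801914/namespaceID=1580775695/' /app/hadoop/tmp/dfs/data/current/VERSION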

ERROR 10)
I got this error after resolving Error 9).

13/10/08 17:24:33 ERROR datanode.DataNode: java.io.IOException: Datanode state: LV = -32 CTime = 1377582559943 is newer than the namespace state: LV = -32 CTime = 0
SOLUTION 10)
Open the same VERSION file and change the value of cTime to 0.

root@boss[bin]#vi /app/hadoop/tmp/dfs/data/current/VERSION
#Mon Oct 14 14:52:19 IST 2013
namespaceID=1580775695
storageID=DS-469635027-127.0.0.1-50010-1376974011263
cTime= 0
storageType=DATA_NODE
layoutVersion=-32