Why Configure Eclipse for Apache Hadoop?
- Work with the Hadoop Distributed File System (HDFS) directly from Eclipse
- Create new directories in HDFS
- Upload files to HDFS
- Upload directories to HDFS
- Download files from HDFS
- Write & execute MapReduce programs that run on the Hadoop cluster (a minimal example follows below)
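To give an idea of the kind of MapReduce program you can write and run from Eclipse, here is a minimal word-count sketch against the Hadoop 1.x org.apache.hadoop.mapreduce API; the class name and the input/output paths passed as arguments are placeholders, so adjust them to your own project and cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emit (word, 1) for every word in each input line
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            protected void map(LongWritable key, Text value, Context context)
                    throws java.io.IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sum the counts emitted for each word
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws java.io.IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path (must not exist yet)
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }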
Steps to Integrate Eclipse with a Hadoop Cluster
- Make sure you have Eclipse installed; if not, download & install it
- Make sure your Hadoop cluster is running; if it isn't, set up and start Hadoop first
- Download hadoop-eclipse-plugin-1.2.1.jar and place the jar in the Eclipse plugins directory ($ECLIPSE_HOME/plugins); instead of downloading it, you can also build the plugin jar yourself using "ant"
- Start Eclipse
- $ECLIPSE_HOME/eclipse
- In the Eclipse menu, click Window --> Open Perspective --> Other --> MapReduce
- In the MapReduce Locations view at the bottom, click the icon to add a new Hadoop location
- Enter the ports on which MapReduce (the JobTracker) and HDFS (the NameNode) are running
- as a reminder, the MapReduce port (9001) is the one set by mapred.job.tracker in $HADOOP_HOME/conf/mapred-site.xml
- as a reminder, the HDFS port (9000) is the one set by fs.default.name in $HADOOP_HOME/conf/core-site.xml (a sample configuration appears after this list)
- Enter the Hadoop user name
- Once the Hadoop location is added, DFS Locations will be displayed in the Eclipse Project Explorer window (Window --> Show View --> Project Explorer)
- Right-click the DFS location and click Connect
- Once connected successfully, it will display all the DFS folders.
- You can create a directory, upload files to an HDFS location, or download files to the local file system by right-clicking any of the listed directories. The same operations can also be done from code; see the sketch below.
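For reference, here is roughly what the relevant configuration entries from the steps above look like for a pseudo-distributed setup on localhost; the host name and port numbers are assumptions and should match your own cluster.

    <!-- $HADOOP_HOME/conf/core-site.xml : HDFS (NameNode) address -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    <!-- $HADOOP_HOME/conf/mapred-site.xml : MapReduce (JobTracker) address -->
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
    </configuration>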
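If you prefer to perform these HDFS operations from code rather than through the Eclipse GUI, a minimal sketch using the org.apache.hadoop.fs.FileSystem API could look like this; the NameNode address (localhost:9000) and the local/HDFS paths are assumptions, so adjust them to your cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsOps {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumption: the NameNode runs on localhost:9000 (fs.default.name in core-site.xml)
            conf.set("fs.default.name", "hdfs://localhost:9000");
            FileSystem fs = FileSystem.get(conf);

            // Create a new directory in HDFS
            fs.mkdirs(new Path("/user/hadoop/demo"));

            // Upload a local file to HDFS
            fs.copyFromLocalFile(new Path("/tmp/input.txt"),
                                 new Path("/user/hadoop/demo/input.txt"));

            // Download a file from HDFS to the local file system
            fs.copyToLocalFile(new Path("/user/hadoop/demo/input.txt"),
                               new Path("/tmp/downloaded-input.txt"));

            fs.close();
        }
    }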
Possible Error you may get
ERROR
Error: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException
SOLUTION
Make sure all the Hadoop daemons are up and running; you can check this as shown below.
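A quick way to verify this (assuming a Hadoop 1.x setup started with the bundled scripts) is to list the running Java daemons with jps and, if any are missing, start them again:

    # Expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker
    # in a pseudo-distributed setup
    jps

    # If they are not running, start them (Hadoop 1.x)
    $HADOOP_HOME/bin/start-all.sh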