Thursday, August 29, 2013

Executing Apache Pig script from Node.js

My previous post shows how to create the MongoDB-Hadoop connector.

Once you have done that:
  •  Put the mongo-hadoop-core_1.1.2-1.1.0 connector into $HADOOP_HOME/lib
  •  Download the latest version of the MongoDB Java driver and put it into $HADOOP_HOME/lib
For both methods below, the Node script (the runPig.js file) should be in $PIG_HOME/bin.

method 1
root@boss:/opt/bigdata/pig-0.11.1/bin>vi runPig.js

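// Start Pig as a child process; its output is streamed as it is produced.
var spawn = require('child_process').spawn;
var runPig = spawn('pig', ['cntWthr.pig']);
// Print everything the script writes to stdout.
runPig.stdout.on('data', function (data) {
  console.log('stdout : ' + data);
});
// Print everything written to stderr, plus the HOME directory of the Node process.
runPig.stderr.on('data', function (data) {
  console.log('stderr : ' + data + ' process Home : ' + process.env.HOME);
});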

root@boss:/opt/bigdata/pig-0.11.1/bin>node runPig.js
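
The script in method 1 prints Pig's output but never reports when Pig has finished or whether it failed. A minimal sketch of one way to add that, assuming a Node.js version with built-in Promise support; the helper name `runScript` is hypothetical and not part of the original script:

```javascript
var spawn = require('child_process').spawn;

// Hypothetical helper (not from the original post): runs a command,
// echoes its output as it arrives, and resolves with the exit code.
function runScript(command, args) {
  return new Promise(function (resolve, reject) {
    var child = spawn(command, args);
    child.stdout.on('data', function (data) {
      process.stdout.write('stdout : ' + data);
    });
    child.stderr.on('data', function (data) {
      process.stderr.write('stderr : ' + data);
    });
    // 'error' fires when the process cannot be started (e.g. command not found).
    child.on('error', reject);
    // 'close' fires after the process has exited and its streams have closed.
    child.on('close', function (code) {
      resolve(code);
    });
  });
}
```

For the Pig script above it could be called as runScript('pig', ['cntWthr.pig']).then(function (code) { console.log('pig exited with code ' + code); }); a non-zero code would indicate that the run failed.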

method 2

root@boss:/opt/bigdata/pig-0.11.1/bin>vi runPig.js
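var exec = require('child_process').exec;
// exec buffers the output and passes it to the callback once Pig exits.
// Only stdout is printed here; error and stderr are ignored.
// console.log replaces sys.puts, which newer Node.js versions no longer provide.
function puts(error, stdout, stderr) { console.log(stdout); }
exec("pig -f cntWthr.pig", puts);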

root@boss:/opt/bigdata/pig-0.11.1/bin>node runPig.js

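Method 2 does not show Pig's log in the console, whereas method 1 does. The likely reason is that the callback in method 2 prints only stdout, while method 1 also listens on stderr, which is where Pig normally writes its log messages.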
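
If the log is also wanted with exec, the callback has to use its stderr and error arguments as well. A minimal sketch, assuming Pig sends its log messages to stderr (its usual console logging setup) and a Node.js version with built-in Promise support; the helper name `execAndCollect` is hypothetical:

```javascript
var exec = require('child_process').exec;

// Hypothetical helper (not from the original post): runs a shell command
// with exec and hands everything the callback receives back to the caller.
function execAndCollect(command) {
  return new Promise(function (resolve) {
    exec(command, function (error, stdout, stderr) {
      // error is null on success; on failure error.code holds the exit code.
      resolve({ error: error, stdout: stdout, stderr: stderr });
    });
  });
}
```

It could be used as execAndCollect('pig -f cntWthr.pig').then(function (r) { console.log(r.stdout); console.error(r.stderr); }); note that exec holds all output in memory until the command exits and stops the command if the output exceeds its maxBuffer option, so spawn is the safer choice for long-running jobs with a lot of log output.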

1 comment:

rafig said...

Hello Solai,

Were you able to solve the Sqoop issue? Also, could you give me sample code for storing the results back to an RDBMS, i.e. NoSQL -> Hadoop -> RDBMS?