Sunday, August 30, 2015

Installing Mahout on Spark 1.4.1

Installing Mahout and Spark

In this post I describe the steps to install Mahout with Apache Spark 1.4.1 (the latest version at the time of writing). I will also list the possible errors and their remedies.

Installing Mahout & Spark on your local machine

1) Download Apache Spark 1.4.1 and unpack the archive file

2) Change to the directory where you unpacked Spark and type sbt/sbt assembly to build it

3) Make sure the right version of Maven (3.3) is installed on your system. If it is not, install Maven before building Mahout

4) Create a directory for Mahout somewhere on your machine, change to it, and check out the master branch of Apache Mahout from GitHub: git clone mahout

5) Change to the mahout directory and build mahout using mvn -DskipTests clean install
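Before starting the build in step 5, it can help to confirm that the Maven requirement from step 3 is met. The following is only a minimal sketch, not part of the official instructions: the helper names version_ge and check_maven are hypothetical, and it assumes a sort that supports the -V (version sort) option and that mvn -version prints a line starting with "Apache Maven".

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Mahout or Spark.

# version_ge A B: succeeds when version A is greater than or equal to B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_maven MIN: reports whether the installed Maven is at least MIN.
check_maven() {
    min="$1"
    if ! command -v mvn >/dev/null 2>&1; then
        echo "mvn not found on PATH"
        return 1
    fi
    # Assumes the version line looks like "Apache Maven 3.3.3 (...)".
    found="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
    if [ -z "$found" ]; then
        echo "could not read the Maven version"
        return 1
    fi
    if version_ge "$found" "$min"; then
        echo "Maven $found is new enough (need $min)"
    else
        echo "Maven $found is older than $min"
        return 1
    fi
}
```

With these functions loaded in your shell, running check_maven 3.3 before the mvn -DskipTests clean install command tells you whether Maven needs to be upgraded first.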

Starting Mahout's Spark shell

1) Go to the directory where you unpacked Spark and type sbin/ to start Spark locally

2) Open a browser and point it to http://localhost:8080/ to check whether Spark started successfully. Copy the URL of the Spark master shown at the top of the page (it starts with spark://)

3) Define the following environment variables:

export MAHOUT_HOME=[directory into which you checked out Mahout]

export SPARK_HOME=[directory where you unpacked Spark]

export MASTER=[url of the Spark master]

4) Finally, change to the directory where you checked out Mahout and type bin/mahout spark-shell. You should see the shell start and get the prompt mahout>
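A wrong or missing environment variable is an easy mistake to make before launching the shell. As a rough sketch, the three variables from step 3 can be checked first. The function name check_mahout_env is hypothetical, and the checks are my own assumptions about what a sane setup looks like: both home variables point to existing directories, and MASTER starts with spark:// as described in step 2.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of Mahout or Spark.

# check_mahout_env: verifies MAHOUT_HOME, SPARK_HOME and MASTER.
# Prints one message per problem and returns non-zero if any is found.
check_mahout_env() {
    status=0
    for var in MAHOUT_HOME SPARK_HOME; do
        # Read the value of the variable whose name is stored in $var.
        eval "dir=\${$var:-}"
        if [ -z "$dir" ] || [ ! -d "$dir" ]; then
            echo "$var is unset or not a directory: '$dir'"
            status=1
        fi
    done
    # The master URL copied from the Spark web page starts with spark://
    case "${MASTER:-}" in
        spark://*) ;;
        *) echo "MASTER should start with spark://, got: '${MASTER:-}'"
           status=1 ;;
    esac
    return $status
}
```

After exporting the three variables, calling check_mahout_env should print nothing when everything looks right; any message it prints points at the variable to fix before running bin/mahout spark-shell.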

In the next post I will discuss the errors you may run into while installing Mahout, along with their solutions.

Next: Resolved issues - Installing Mahout 0.11.0 with Spark 1.4.1