Monday, August 31, 2015

Installing Mahout with Apache Spark 1.4.1 : Issues and Solution

In this post I discuss the errors you may run into while installing Mahout with Apache Spark 1.4.1, and how to resolve each of them.

The errors are listed in the order in which I encountered them during my own installation.

Cannot find Spark class path. Is 'SPARK_HOME' set?

cd $MAHOUT_HOME

bin/mahout spark-shell

This gave the error: Cannot find Spark class path. Is 'SPARK_HOME' set?

Solution
The issue is in the bin/mahout file: it points to compute-classpath.sh under the $SPARK_HOME/bin directory, but my $SPARK_HOME/bin did not contain any such file.

Add compute-classpath.sh under the $SPARK_HOME/bin directory.

In my case, I just copied it from an older version, i.e. Spark 1.1.
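
Before copying the file, it may be worth confirming that it is really missing. Here is a small sketch; has_compute_classpath is a hypothetical helper name of my own, and the only assumption is that bin/mahout looks for the script at $SPARK_HOME/bin/compute-classpath.sh, as described above.

```shell
# Hypothetical helper: reports whether the script that bin/mahout expects
# is present under the given Spark installation directory.
has_compute_classpath() {
    spark_home="$1"
    if [ -f "$spark_home/bin/compute-classpath.sh" ]; then
        echo "found"
    else
        echo "missing"
    fi
}

# Example usage:
# has_compute_classpath "$SPARK_HOME"
```

If it prints "missing", copy the script in as described above and run the check again.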


ERROR: Could not find mahout-examples-*.job

cd $MAHOUT_HOME

bin/mahout spark-shell

ERROR: Could not find mahout-examples-*.job in /media/bdalab/bdalab/sw/mahout or /media/bdalab/bdalab/sw/mahout/examples/target, please run 'mvn install' to create the .job file

Solution
Set the MAHOUT_LOCAL variable to true to avoid the error.

export MAHOUT_LOCAL=true


Error: Could not find or load main class org.apache.mahout.driver.MahoutDriver

cd $MAHOUT_HOME

bin/mahout spark-shell

MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.

MAHOUT_LOCAL is set, running locally

Error: Could not find or load main class org.apache.mahout.driver.MahoutDriver

Solution
This indicates that the Mahout driver needs to be built and installed with Maven.

root@solai[bin]# mvn -DskipTests -X clean install

[INFO] Scanning for projects...

[INFO] ------------------------------

[ERROR] BUILD FAILURE

[INFO] ---------------------------------

[INFO] Unable to build project '/media/bdalab/bdalab/sw/mahout/pom.xml; it requires Maven version 3.3.3
I downloaded Maven 3.3.3 from the repository, unpacked it, and ran the previous command from the new Maven's bin directory:

root@solai[bin]# $MAVEN_HOME/bin/mvn -DskipTests -X clean install

org.apache.maven.enforcer.rule.api.EnforcerRuleException: Detected JDK Version: 1.8.0-60 is not in the allowed range [1.7,1.8).
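
The range [1.7,1.8) in this message is Maven's version range notation: 1.7 inclusive, up to but not including 1.8. So any 1.7.x JDK passes, and 1.8.0-60 does not. A rough way to check a version string before starting a long build might look like the sketch below. jdk_allowed is a hypothetical helper of my own, and it does a simplified prefix match, not the enforcer plugin's actual comparison.

```shell
# Hypothetical helper: prints "allowed" for JDK version strings in [1.7,1.8),
# i.e. 1.7 or 1.7.x, and "rejected" for anything else.
# Simplified prefix match, not the Maven enforcer's real version comparison.
jdk_allowed() {
    case "$1" in
        1.7|1.7.*) echo "allowed" ;;
        *)         echo "rejected" ;;
    esac
}

# Example usage (assumes `java` is on the PATH and prints a quoted version string):
# jdk_allowed "$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')"
```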
I then switched from Java 1.8.0-60 to Java 1.7 and ran the build again. This time it failed with a different error:
root@solai[bin]# $MAVEN_HOME/bin/mvn -DskipTests -X clean install

[INFO] Mahout Build Tools ..... SUCCESS [02:42 min]

[INFO] Apache Mahout ..... SUCCESS [ 0.041 s]

[INFO] Mahout Math ......FAILURE [01:45 min]

[INFO] Mahout HDFS ........ SKIPPED

[INFO] Mahout Map-Reduce ..... SKIPPED

[INFO] Mahout Integration ..... SKIPPED

[INFO] Mahout Examples .........SKIPPED

[INFO] Mahout Math Scala bindings ..... SKIPPED

[INFO] Mahout H2O backend ...... SKIPPED

[INFO] Mahout Spark bindings ..... SKIPPED

[INFO] Mahout Spark bindings shell ..... SKIPPED

[INFO] Mahout Release Package ..... SKIPPED

Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact org.apache.maven:maven-core:jar:2.0.6 from/to central (https://repo.maven.apache.org/maven2): GET request of: org/apache/maven/maven-core/2.0.6/maven-core-2.0.6.jar from central failed
I suspected this error was caused by a network issue, so I ran the same command again.
As I guessed, the installation then completed successfully.
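
Since the fix here was simply to run the build again, a transient download failure like this one can also be handled with a small retry wrapper. The sketch below is my own illustration, not part of Mahout or Maven; retry is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 if every
# attempt fails.
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Only pause if another attempt is coming.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example usage:
# retry 3 30 "$MAVEN_HOME/bin/mvn" -DskipTests clean install
```

A retry only helps with transient failures such as a dropped download. A real build error will fail the same way on every attempt.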

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
After the successful installation, I tried to get to the mahout> prompt:

cd $MAHOUT_HOME

bin/mahout spark-shell

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"

Solution
export JAVA_TOOL_OPTIONS="-Xmx2048m -XX:MaxPermSize=1024m -Xms1024m"
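
These options set the maximum heap to 2048 MB, the PermGen limit to 1024 MB, and the initial heap to 1024 MB. JAVA_TOOL_OPTIONS is read by every JVM started from that shell, so the export affects more than Mahout. As a sketch, the same values can be applied to a single invocation instead. jvm_opts is a hypothetical helper of my own that only assembles the string.

```shell
# Hypothetical helper: builds the JVM option string from sizes in megabytes.
# Arguments: max heap, max PermGen, initial heap.
jvm_opts() {
    echo "-Xmx${1}m -XX:MaxPermSize=${2}m -Xms${3}m"
}

# Example usage, limited to one command instead of a global export:
# JAVA_TOOL_OPTIONS="$(jvm_opts 2048 1024 1024)" bin/mahout spark-shell
```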
