Details
Type: Question
Status: Resolved
Priority: Major
Resolution: Won't Fix
Environment: Spark version 2.3.0.2.6.5.10-2
Description
I have a Spark cluster already set up. The environment is not under my direct control, but the administrators do allow fat JARs with bundled dependencies to be deployed. I packaged my Spark application with some Mahout code for SimilarityAnalysis, added the Mahout libraries to the POM file, and everything packages successfully.
The problem, however, is that I get the following error when using the existing SparkContext to build a distributed Spark context for Mahout.
[EDIT]AP:
pom.xml {...}

<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-math</artifactId>
  <version>0.13.0</version>
</dependency>
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-math-scala_2.10</artifactId>
  <version>0.13.0</version>
</dependency>
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-spark_2.10</artifactId>
  <version>0.13.0</version>
</dependency>
<dependency>
  <groupId>com.esotericsoftware</groupId>
  <artifactId>kryo</artifactId>
  <version>5.0.0-RC5</version>
</dependency>
Code:
import org.apache.mahout.sparkbindings._  // needed for sc2sdc and SparkDistributedContext

implicit val sc: SparkContext = sparkSession.sparkContext
implicit val msc: SparkDistributedContext = sc2sdc(sc)

Error:

ERROR TaskSetManager: Task 7.0 in stage 10.0 (TID 58) had a not serializable result: org.apache.mahout.math.DenseVector

And if I try to build the context using mahoutSparkContext() instead, it gives me an error that MAHOUT_HOME is not found.

Code:

implicit val msc = mahoutSparkContext(masterUrl = "local", appName = "CooccurrenceDriver")

Error:

MAHOUT_HOME is required to spawn mahout-based spark jobs
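For completeness, this is roughly how I would expect to configure the session myself if the serialization has to be set up by hand, based on my reading of the Mahout 0.13.0 Spark bindings. It is only a sketch, and the Kryo registrator class name is my assumption; I have not been able to verify it on this cluster.

// Sketch only: build the SparkSession with the Kryo serializer and (what I
// believe is) Mahout's Kryo registrator, then wrap the context with sc2sdc.
// The registrator class name is an assumption taken from the Mahout 0.13.0 sources.
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession
import org.apache.mahout.sparkbindings._

val sparkSession = SparkSession.builder()
  .appName("CooccurrenceDriver")
  .master("local")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.kryo.registrator",
          "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator")
  .getOrCreate()

implicit val sc: SparkContext = sparkSession.sparkContext
implicit val msc: SparkDistributedContext = sc2sdc(sc)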
My question is: how do I proceed in this situation? Do I have to ask the administrators of the Spark environment to install the Mahout library, or is there any way I can keep packaging my application as a fat JAR?