Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.1.0
- Fix Version/s: None
- Environment: Linux
  spark-1.1.0-bin-hadoop2.4.tgz
  java version "1.7.0_72"
  Java(TM) SE Runtime Environment (build 1.7.0_72-b14)
  Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode)
Description
Consider a trivial test.scala:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object Main {
  def main(args: Array[String]) {
    val sc = new SparkContext()
    sc.stop()
  }
}
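For reference, a minimal build.sbt of the kind that produces such a jar — the project name, version numbers, and dependency coordinates here are assumptions for illustration, not taken from the report:

```scala
// Hypothetical minimal build.sbt; names and versions are assumptions.
// sbt auto-detects the single main class (Main) and records it in the
// jar's MANIFEST.MF as Main-Class when packaging.
name := "test"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
```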
When built with sbt and executed using spark-submit target/scala-2.10/test_2.10-1.0.jar, I get the following error:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Error: Cannot load main class from JAR: file:/ha/home/straka/s/target/scala-2.10/test_2.10-1.0.jar
Run with --help for usage help or --verbose for debug output
When executed using spark-submit --class Main target/scala-2.10/test_2.10-1.0.jar, it works.
The jar file has a correct MANIFEST.MF:
Manifest-Version: 1.0
Implementation-Vendor: test
Implementation-Title: test
Implementation-Version: 1.0
Implementation-Vendor-Id: test
Specification-Vendor: test
Specification-Title: test
Specification-Version: 1.0
Main-Class: Main
The problem is that in org.apache.spark.deploy.SparkSubmitArguments, line 127:
val jar = new JarFile(primaryResource)
the primaryResource has the String value "file:/ha/home/straka/s/target/scala-2.10/test_2.10-1.0.jar", which is a URI, but the JarFile constructor accepts only a filesystem path. One way to fix this would be:

val uri = new URI(primaryResource)
val jar = new JarFile(uri.getPath)
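A minimal sketch of the distinction: handing JarFile the raw URI string makes it look for a file literally named "file:/ha/...", whereas java.net.URI.getPath strips the scheme and yields the plain filesystem path that JarFile expects:

```scala
import java.net.URI

object UriPathDemo {
  def main(args: Array[String]): Unit = {
    // The URI-form string that spark-submit receives as primaryResource.
    val primaryResource = "file:/ha/home/straka/s/target/scala-2.10/test_2.10-1.0.jar"

    // new JarFile(primaryResource) fails: no file named "file:/ha/..." exists.
    // getPath removes the "file:" scheme, leaving the actual path.
    val uri = new URI(primaryResource)
    println(uri.getPath) // /ha/home/straka/s/target/scala-2.10/test_2.10-1.0.jar
  }
}
```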