Details
- Type: New Feature
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
Description
When using Mesos with Docker and Marathon, it would be nice to be able to deploy spark-submit on Marathon and have it download the application jar from HDFS, instead of having to package the jar into the Docker image. Currently, spark-submit in client mode skips remote jars rather than downloading them:
$ docker run -it docker.example.com/spark:latest /usr/local/spark/bin/spark-submit --class com.example.spark.streaming.EventHandler hdfs://hdfs/tmp/application.jar
Warning: Skip remote jar hdfs://hdfs/tmp/application.jar.
java.lang.ClassNotFoundException: com.example.spark.streaming.EventHandler
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Although I'm aware that we can run in cluster mode with Mesos, we've already built some nice tooling around Marathon for logging and monitoring.
Code in question:
https://github.com/apache/spark/blob/132718ad7f387e1002b708b19e471d9cd907e105/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L723-L736
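For illustration, a minimal sketch of the requested behavior: before resolving the driver classpath in client mode, spark-submit could copy a remote jar to a local temporary directory via the Hadoop FileSystem API. This is not Spark's actual code; the object and helper names (RemoteJarFetch, downloadJar) are hypothetical, and the real fix landed separately via SPARK-20860.

import java.io.File
import java.net.URI
import java.nio.file.Files

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object RemoteJarFetch {
  // Hypothetical helper: copy a remote jar (e.g. hdfs://...) to a local temp
  // directory and return a file: URI that could go on the driver classpath.
  def downloadJar(jarUri: String, hadoopConf: Configuration): String = {
    val uri = new URI(jarUri)
    uri.getScheme match {
      case null | "file" | "local" =>
        jarUri // already local, nothing to do
      case _ =>
        val fs = FileSystem.get(uri, hadoopConf) // e.g. an HDFS client
        val tmpDir = Files.createTempDirectory("spark-jar-").toFile
        val dest = new File(tmpDir, new Path(uri).getName)
        // copyToLocalFile pulls the remote file down to the local filesystem
        fs.copyToLocalFile(new Path(uri), new Path(dest.getAbsolutePath))
        dest.toURI.toString
    }
  }
}

Usage would amount to rewriting each remote entry before classpath resolution, e.g. val localJar = RemoteJarFetch.downloadJar("hdfs://hdfs/tmp/application.jar", new Configuration()).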
Issue Links
- breaks:
  - SPARK-21714 SparkSubmit in Yarn Client mode downloads remote files and then reuploads them again (Resolved)
- is duplicated by:
  - SPARK-20860 Make spark-submit download remote files to local in client mode (Resolved)
- is related to:
  - SPARK-16627 --jars doesn't work in Mesos mode (Closed)