Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.0.0
- Labels: None
Description
One of the questions we run into rather commonly is "how do I start a Spark application from my Java/Scala program?". There currently isn't a good answer to that:
- Instantiating SparkContext has limitations (e.g., you can only have one active context at a time, and you lose the ability to submit apps in cluster mode).
- Calling SparkSubmit directly is doable, but you lose a lot of the logic handled by the shell scripts.
- Calling the shell scripts directly is doable, but rather ugly from an API point of view.
I think it would be nice to have a small library that handles this for users (see the sketch below). On top of that, Spark itself could use this library to replace much of the code in the current shell scripts, which contain a lot of duplication.
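For illustration, a minimal sketch of how such a launcher library could be used from Java. It is modeled on the SparkLauncher API in the org.apache.spark.launcher package that Spark eventually shipped as the resolution of this issue; the jar path, main class, and master below are placeholders, not values from this issue.

```java
import org.apache.spark.launcher.SparkLauncher;

public class LaunchExample {
    public static void main(String[] args) throws Exception {
        // Build a spark-submit invocation programmatically instead of
        // shelling out to the scripts by hand; all values are placeholders.
        Process spark = new SparkLauncher()
                .setAppResource("/path/to/my-app.jar")  // application jar (placeholder)
                .setMainClass("com.example.MyApp")      // entry point (placeholder)
                .setMaster("yarn")                      // cluster mode is possible, unlike an in-process SparkContext
                .setDeployMode("cluster")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();                              // spawns spark-submit as a child process

        // Wait for the child process and report its exit code.
        int exitCode = spark.waitFor();
        System.out.println("Spark application finished with exit code " + exitCode);
    }
}
```

Because the launcher spawns a regular spark-submit child process, it sidesteps the single-active-SparkContext limitation and keeps the deploy-mode and classpath handling that the shell scripts provide.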
Attachments
Issue Links
- is duplicated by SPARK-3733 Support for programmatically submitting Spark jobs (Resolved)
- relates to SPARK-6047 pyspark - class loading on driver failing with --jars and --packages (Resolved)
- links to