One question we run into fairly often is "How do I start a Spark application from my Java/Scala program?" There currently isn't a good answer:
- Instantiating SparkContext has limitations (e.g., you can currently have only one active context, and you lose the ability to submit apps in cluster mode).
- Calling SparkSubmit directly is doable, but you lose a lot of the logic handled by the shell scripts.
- Calling the shell script directly is doable, but sort of ugly from an API point of view.
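To make the last option concrete, here is a minimal sketch of what calling the shell script from Java tends to look like today. The class and method names (SparkSubmitCommand, build, launch) are hypothetical and not part of Spark. The sketch assumes a Spark installation with bin/spark-submit under the given Spark home, and uses only the standard --master, --deploy-mode and --class options.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: wraps the spark-submit shell script.
public class SparkSubmitCommand {

    // Builds the argument list for bin/spark-submit without running anything.
    static List<String> build(String sparkHome, String master, String deployMode,
                              String mainClass, String appJar, String... appArgs) {
        List<String> cmd = new ArrayList<>();
        // Assumes the script lives at <sparkHome>/bin/spark-submit.
        cmd.add(new File(new File(sparkHome, "bin"), "spark-submit").getPath());
        cmd.add("--master");
        cmd.add(master);
        cmd.add("--deploy-mode");
        cmd.add(deployMode);
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add(appJar);
        // Everything after the application jar is passed to the application itself.
        cmd.addAll(Arrays.asList(appArgs));
        return cmd;
    }

    // Runs the command as a child process and returns its exit code.
    static int launch(List<String> cmd) throws Exception {
        Process process = new ProcessBuilder(cmd)
                .inheritIO() // forward the child's stdout/stderr to this process
                .start();
        return process.waitFor();
    }
}
```

The caller has to know the script location, the option names, and their ordering, and only gets an exit code back. That is the kind of detail a small library could hide.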
I think it would be nice to have a small library that handles this for users. On top of that, Spark itself could use this library to replace much of the code in the current shell scripts, which contain a lot of duplication.