Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.0.0
- Fix Version/s: None
- Environment: Ubuntu precise, on YARN (CDH 5.1.0)
Description
Running on YARN, the only way to specify the number of executors seems to be the --num-executors switch on the spark-submit command line.
In many cases this is too early. Our Spark app receives command-line arguments that determine how much work needs to be done, and that in turn affects how many executors it ideally requires. Ideally, the Spark context configuration would support specifying this like any other config param, as sketched below.
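For illustration, a minimal PySpark sketch of the requested behaviour, assuming the executor count were exposed as the configuration key spark.executor.instances (which later Spark releases provide on YARN, but which is not available in 1.0.0); the sizing heuristic and app name are placeholders:
{code:python}
import sys

from pyspark import SparkConf, SparkContext

# Decide the executor count at runtime from the job's own arguments,
# then pass it through SparkConf like any other setting.
work_items = int(sys.argv[1])
num_executors = max(2, work_items // 100)  # assumed sizing heuristic

conf = (SparkConf()
        .setAppName("OurSparkApp")
        .set("spark.executor.instances", str(num_executors)))
sc = SparkContext(conf=conf)
{code}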
Our current workaround is a wrapper script that determines how much work is needed and then itself launches spark-submit, passing the computed number to --num-executors. It's a shame to have to do this; a sketch of the wrapper follows.
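A hypothetical version of that wrapper, with the sizing rule, master, and application path as placeholders:
{code:python}
#!/usr/bin/env python
# Wrapper around spark-submit: compute the amount of work, derive an
# executor count, and forward it via --num-executors.
import subprocess
import sys

work_items = int(sys.argv[1])
num_executors = max(2, work_items // 100)  # assumed sizing heuristic

subprocess.check_call([
    "spark-submit",
    "--master", "yarn",
    "--num-executors", str(num_executors),
    "our_spark_app.py",  # hypothetical application entry point
    str(work_items),
])
{code}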