Speculated copies of tasks do not get launched in some cases:
- All the running executors have no CPU slots left to accommodate a speculated copy of the task(s). If all the running executors reside on a set of slow or bad hosts, they will keep the job running for a long time.
- `spark.task.cpus` > 1 and the running executor has not filled up all its CPU slots. The free slots cannot be used, because a speculated copy of a task should run on a different host, not the host where the first copy was launched.
In both these cases, `ExecutorAllocationManager` does not know about pending speculation task attempts and thinks that all the resource demands are well taken care of. (relevant code)
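
To make the first case concrete, here is a minimal sketch of the arithmetic, assuming executor demand is derived from a task count by rounding up to whole executors. `ExecutorDemandSketch`, `executorsNeeded`, and the task counts are hypothetical names and numbers chosen for illustration; this is not the actual `ExecutorAllocationManager` code.

```scala
// Hypothetical illustration only; not Spark's real allocation logic.
object ExecutorDemandSketch {
  // Executors needed to run `tasks` concurrently, rounded up to whole executors.
  def executorsNeeded(tasks: Int, executorCores: Int, taskCpus: Int): Int = {
    val slotsPerExecutor = executorCores / taskCpus
    (tasks + slotsPerExecutor - 1) / slotsPerExecutor
  }

  def main(args: Array[String]): Unit = {
    val executorCores = 4        // spark.executor.cores
    val taskCpus = 1             // spark.task.cpus
    val runningTasks = 4         // one executor, all 4 slots busy
    val pendingSpeculative = 1   // one straggler is waiting for a speculated copy

    // Demand when pending speculative attempts are not counted.
    val ignoring = executorsNeeded(runningTasks, executorCores, taskCpus)
    // Demand when the pending speculative attempt is counted.
    val counting = executorsNeeded(runningTasks + pendingSpeculative, executorCores, taskCpus)

    println(s"ignoring speculation: $ignoring, counting speculation: $counting")
  }
}
```

With these numbers, the demand that ignores the speculative attempt stays at one executor, so no new executor is requested and the speculated copy has nowhere to run. The sketch covers only the first case; the second case also depends on which host the original attempt runs on, which a plain task count does not capture.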
This adds variation in the job completion times and, more importantly, causes SLA misses. In prod, with a large number of jobs, I see this happening more often than one would think. Chasing the bad hosts or the reason for slowness doesn't scale.
Here is a tiny repro. Note that you need to launch it on a cluster (Mesos, YARN, or standalone deploy mode) with `--conf spark.speculation=true --conf spark.executor.cores=4 --conf spark.dynamicAllocation.maxExecutors=100`