@Robert, I think mapreduce.job.reduce.slowstart.completedmaps is related but different. The issue here is not about waiting for a percentage of mappers to complete before allocating containers to reducers; it is about preventing reducers from occupying containers that mappers still need.
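To illustrate the distinction, here is a toy model, not Hadoop's actual allocator. The function `simulate` and its parameters are hypothetical, and it assumes a worst-case policy where eligible reducers grab free containers before pending maps do and hold them until all maps finish.

```python
def simulate(containers, maps, reduces, slowstart):
    """Toy model (assumption, not Hadoop code).

    Returns True if all maps complete, False if the pending maps starve
    because reducers hold every container.
    """
    pending_maps, done_maps = maps, 0
    running_reduces = 0
    free = containers
    while done_maps < maps:
        # Reducers become eligible once the slowstart fraction of maps is done.
        if done_maps >= slowstart * maps:
            grab = min(free, reduces - running_reduces)
            running_reduces += grab
            free -= grab  # reducers keep their containers while waiting for maps
        # Whatever is still free goes to pending maps.
        wave = min(free, pending_maps)
        if wave == 0:
            return False  # no container left for the remaining maps
        pending_maps -= wave
        done_maps += wave  # the wave completes and releases its containers
    return True
```

In this model, simulate(4, 8, 4, 0.5) returns False: the slowstart threshold is met after the first wave of 4 maps, the reducers then take all 4 containers, and the remaining 4 maps can never run. The threshold only controls when reducers become eligible, not whether they take containers that maps still need.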
@Jason, watching headroom and preempting reducers should be sufficient to address this issue, but this doesn't seem to work in our case. The cluster is using the FIFO scheduler.
MAPREDUCE-4228 seems to address a bug in the behavior of mapreduce.job.reduce.slowstart.completedmaps, which, as I mentioned above, is a different issue.
@Sharad, yes, we are seeing this on a customer cluster.