Just to give some background, MAPREDUCE-6514 came up while analyzing the issue in MAPREDUCE-6513.
During our analysis, we found two areas susceptible to problems.
One was decrementing and updating container requests when we clear pending reduce requests, which has already been handled.
The other was whether we should ramp up reducers at all when maps have been hanging for a certain period of time.
We wanted to take feedback from community members who have worked extensively on MapReduce regarding these potential issues.
Refer to the comments below: comment1, comment2.
Basically this point got lost along the way as we went ahead with rescheduling map requests at priority 5, but I just thought to put it out there again.
In RMContainerAllocator#scheduleReduces we ramp up reducers. The calculations in this method are such that, with default configurations, this is done too aggressively.
The configuration yarn.app.mapreduce.am.job.reduce.rampup.limit has a default value of 0.5. If headroom is limited (as in the MAPREDUCE-6513 scenario), i.e. barely enough to launch one mapper or reducer, this config value makes the allocator think there is sufficient room to launch one more mapper, so there is no need to ramp down and reducers are ramped up instead. However, if this continues forever, that does not seem correct.
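To make the arithmetic concrete, here is a simplified, hypothetical sketch of the kind of calculation scheduleReduces does; the names and numbers are illustrative only, not the actual Hadoop code.

{code:java}
// Simplified, hypothetical sketch of the ramp-up decision; not the actual
// RMContainerAllocator code, just an illustration of the arithmetic.
public class ReduceRampUpSketch {

  // yarn.app.mapreduce.am.job.reduce.rampup.limit, default 0.5
  static final float REDUCE_RAMPUP_LIMIT = 0.5f;

  static void scheduleReduces(int completedMaps, int totalMaps,
                              int scheduledMapMem, int headroomMem) {
    // Memory the AM believes is at its disposal: already scheduled maps plus headroom.
    int totalMemLimit = scheduledMapMem + headroomMem;

    float completedMapPercent = (float) completedMaps / totalMaps;

    // The reduce share is capped by the rampup limit (0.5 by default).
    int idealReduceMemLimit = Math.min((int) (completedMapPercent * totalMemLimit),
                                       (int) (REDUCE_RAMPUP_LIMIT * totalMemLimit));
    int idealMapMemLimit = totalMemLimit - idealReduceMemLimit;

    // If the scheduled maps appear to fit in the map share, the allocator sees
    // no reason to ramp down and goes ahead with ramping reducers up, even when
    // the real headroom is barely enough for a single container.
    if (scheduledMapMem <= idealMapMemLimit) {
      System.out.println("Maps appear to fit -> ramp up reducers");
    } else {
      System.out.println("Maps do not fit -> ramp down reducers");
    }
  }

  public static void main(String[] args) {
    // Scenario resembling MAPREDUCE-6513: one pending 2048 MB map, headroom of
    // only 2048 MB, 9 of 10 maps already completed. Prints "ramp up reducers".
    scheduleReduces(9, 10, 2048, 2048);
  }
}
{code}

In this sketch the map share is derived from the rampup fraction rather than from whether the pending map can actually be placed, which is why the check keeps passing and reducers keep getting ramped up.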
Should we really be ramping up when we have hanging map requests, irrespective of the configured reduce rampup limit?
We can probably use the configuration introduced in MAPREDUCE-6302 to determine whether maps are hanging (i.e. stuck in the scheduled state) and not ramp up reducers if maps have been hanging for a while. The config value of how long to wait, however, would depend on the kind of job being run.
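As a rough sketch of the idea (the class, method and threshold names here are hypothetical, not the actual MAPREDUCE-6302 property or any existing Hadoop code):

{code:java}
import java.util.concurrent.TimeUnit;

// Hypothetical guard that would stop reduce ramp-up while maps are hanging.
public class HangingMapGuard {

  // Assumed knob: how long a map request may sit in the scheduled state
  // before we treat it as hanging (likely job-dependent, as noted above).
  private final long maxMapScheduleWaitMs;

  public HangingMapGuard(long maxMapScheduleWaitSec) {
    this.maxMapScheduleWaitMs = TimeUnit.SECONDS.toMillis(maxMapScheduleWaitSec);
  }

  /** True if any map request has waited in the scheduled state beyond the threshold. */
  public boolean mapsAreHanging(Iterable<Long> mapScheduledTimesMs, long nowMs) {
    for (long scheduledAt : mapScheduledTimesMs) {
      if (nowMs - scheduledAt > maxMapScheduleWaitMs) {
        return true;
      }
    }
    return false;
  }

  /** Ramp up reducers only when there is room AND no map appears to be hanging. */
  public boolean shouldRampUpReduces(boolean roomForReduces,
                                     Iterable<Long> mapScheduledTimesMs) {
    return roomForReduces
        && !mapsAreHanging(mapScheduledTimesMs, System.currentTimeMillis());
  }
}
{code}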
There is also MAPREDUCE-6541, which adjusts the headroom received from the RM to find out whether we have enough resources for a map task to run.
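For completeness, a minimal sketch of what such a headroom adjustment could look like (the method and parameter names are assumptions for illustration, not the actual MAPREDUCE-6541 patch):

{code:java}
// Hypothetical headroom adjustment before deciding whether a map can be started.
public final class HeadroomCheck {

  private HeadroomCheck() {
  }

  /**
   * Subtract memory already promised to scheduled (but not yet running) reducers
   * from the headroom reported by the RM, then check whether a map container
   * would still fit in what is genuinely left.
   */
  public static boolean mapFitsInAdjustedHeadroom(long rmHeadroomMb,
                                                  int scheduledReduces,
                                                  long reduceContainerMb,
                                                  long mapContainerMb) {
    long adjustedHeadroomMb = rmHeadroomMb - (long) scheduledReduces * reduceContainerMb;
    return adjustedHeadroomMb >= mapContainerMb;
  }

  public static void main(String[] args) {
    // RM reports 4096 MB headroom, one 2048 MB reducer already scheduled,
    // so only 2048 MB is truly free; a 2048 MB map still just fits.
    System.out.println(mapFitsInAdjustedHeadroom(4096, 1, 2048, 2048)); // true
  }
}
{code}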