Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Reviewed
Description
It makes no sense to schedule reduces for a job before its maps have started running. As an initial fix, we will hold off scheduling reduces until a certain percentage of the job's maps have run (likely 10%). In the future it would be good to also base the wait on the amount of map output data, since launching reducers that will sit mostly idle is not helpful. The average number of map output bytes per mapper is easy to compute from the counters in JobInProgress.
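A rough sketch of the check described above, for discussion only. The class and method names (ReduceSlowStart, shouldScheduleReduces, averageMapOutputBytes) are hypothetical and are not existing Hadoop code. The sketch assumes the scheduler can read the finished-map count and the total map output bytes from the job's counters, and it uses the 10% figure only as the tentative value mentioned in the description.

```java
/** Hypothetical helper, not part of Hadoop: gates reduce scheduling on map progress. */
final class ReduceSlowStart {
    private ReduceSlowStart() {}

    /**
     * Returns true once at least minPercent of the job's maps have finished.
     * Integer arithmetic avoids floating-point rounding at the threshold.
     */
    static boolean shouldScheduleReduces(int finishedMaps, int totalMaps, int minPercent) {
        if (totalMaps <= 0) {
            // Assumption: a job with no maps has nothing to wait for.
            return true;
        }
        return finishedMaps * 100L >= (long) minPercent * totalMaps;
    }

    /**
     * Average map output bytes per finished map, as a possible future input
     * for deciding how long to delay reduces. Returns 0 when no map has finished.
     */
    static long averageMapOutputBytes(long totalMapOutputBytes, int finishedMaps) {
        return finishedMaps == 0 ? 0L : totalMapOutputBytes / finishedMaps;
    }
}
```

With a 10% threshold, a job with 100 maps would get its first reduce scheduled after the 10th map finishes. A job with 3 maps would need only one finished map, because the comparison is against the exact fraction rather than a rounded count.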