Details
- Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
Description
In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled reduces map and move those reducers back to pending. The ask sent to the RM is not updated to reflect this, so the RM keeps assigning reduce containers that the AM cannot use, because no reducer is scheduled any more (see the logs below the code).
If the ask were updated immediately, the RM could schedule mappers right away, which is the intention when we ramp down reducers anyway.
The scheduler need not allocate containers for ramped-down reducers.
If this is not handled, it can lead to map starvation, as pointed out in MAPREDUCE-6513.
LOG.info("Ramping down all scheduled reduces:" + scheduledRequests.reduces.size());
for (ContainerRequest req : scheduledRequests.reduces.values()) {
  pendingReduces.add(req);
}
scheduledRequests.reduces.clear();
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not assigned : container_1437451211867_1485_01_000215
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign container Container: [ContainerId: container_1437451211867_1485_01_000216, NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress: hdszzdcxdat6g06u04p:26010, Resource: <memory:4096, vCores:1>, Priority: 10, Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a reduce as either container memory less than required 4096 or no pending reduce tasks - reduces.isEmpty=true
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not assigned : container_1437451211867_1485_01_000216
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign container Container: [ContainerId: container_1437451211867_1485_01_000217, NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress: hdszzdcxdat6g06u06p:26010, Resource: <memory:4096, vCores:1>, Priority: 10, Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a reduce as either container memory less than required 4096 or no pending reduce tasks - reduces.isEmpty=true
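To make the mismatch concrete, here is a minimal sketch in Java of the two ramp-down behaviours discussed in this report. It is a simplified model, not the Hadoop code or the committed patch: the class ReducerRampDownSketch, the reduceAsk counter, and all method names are hypothetical and stand in for the allocator's scheduled/pending collections and the outstanding request sent to the RM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified model of the AM-side allocator state.
// None of these names are taken from RMContainerAllocator.
class ReducerRampDownSketch {
    // Reduce requests that have been sent to the RM.
    private final Set<String> scheduledReduces = new LinkedHashSet<>();
    // Reduce requests held back by the AM until it ramps up again.
    private final Deque<String> pendingReduces = new ArrayDeque<>();
    // Number of reduce containers the AM is currently asking the RM for.
    private int reduceAsk = 0;

    void scheduleReduce(String attemptId) {
        if (scheduledReduces.add(attemptId)) {
            reduceAsk++;
        }
    }

    // Behaviour described in the report: reducers move to pending,
    // but the outstanding ask is left untouched.
    void rampDownWithoutAskUpdate() {
        pendingReduces.addAll(scheduledReduces);
        scheduledReduces.clear();
    }

    // Assumed alternative: withdraw one container from the ask for every
    // reducer that is moved back to pending.
    void rampDownWithAskUpdate() {
        for (String attemptId : scheduledReduces) {
            pendingReduces.add(attemptId);
            reduceAsk--;
        }
        scheduledReduces.clear();
    }

    int reduceAsk() { return reduceAsk; }
    int scheduledCount() { return scheduledReduces.size(); }
    int pendingCount() { return pendingReduces.size(); }

    public static void main(String[] args) {
        ReducerRampDownSketch stale = new ReducerRampDownSketch();
        ReducerRampDownSketch updated = new ReducerRampDownSketch();
        for (String id : new String[] {"r_0", "r_1", "r_2"}) {
            stale.scheduleReduce(id);
            updated.scheduleReduce(id);
        }
        stale.rampDownWithoutAskUpdate();
        updated.rampDownWithAskUpdate();
        System.out.println("ask without update: " + stale.reduceAsk());
        System.out.println("ask with update:    " + updated.reduceAsk());
    }
}
```

In the first variant the ask still covers reducers that are no longer scheduled, which matches the "reduces.isEmpty=true" messages in the logs above: containers arrive for reduces that the AM has nothing to assign to. In the second variant the ask drops to zero, so in this model the RM has no outstanding reduce request left to serve.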
Attachments
Issue Links
- is broken by
  - MAPREDUCE-6302 Preempt reducers after a configurable timeout irrespective of headroom (Closed)
- is related to
  - MAPREDUCE-6513 MR job got hanged forever when one NM unstable for some time (Closed)