Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
Due to the asynchronous nature of QueryInfo$FinishableState notification, we usually end up receiving finishable state updates across tasks/queryfragments with some time difference.
Imagine vertex Map8 with dependency on Map7.
If Map8 vertex is already running with some tasks still pending, we can end-up in a situation where on Map7 completion, some of the pending Map8 tasks are getting the finishable state update BEFORE the already running Map8 tasks, ending up preempting tasks for no reason!
2021-04-22T15:30:45.124Z source:Map 7 updated, notifying: [Map 8] 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000014_0, state=true 2021-04-22T15:30:45.125Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8 2021-04-22T15:30:45.125Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to execute TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Task TaskWrapper{task=attempt_1619105382691_0001_1_05_000014_0, Vertex=Map 8, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in queue)=true, isGuaranteed=false, firstAttemptStartTime=1619105437749, dagStartTime=1619105422608, withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism= 256, selfAndUpstreamComplete= 17} managed to preempt task TaskWrapper{task=attempt_1619105382691_0001_1_05_000006_0, Vertex=Map 8, inWaitQueue=false, inPreemptionQueue=true, registeredForNotifications=true, canFinish=true, canFinish(in queue)=false, isGuaranteed=false, firstAttemptStartTime=1619105437737, dagStartTime=1619105422608, withinDagPriority=74, vertexParallelism= 232, selfAndUpstreamParallelism= 256, selfAndUpstreamComplete= 15} 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="Wait-Queue-Scheduler-0"] Invoking kill task for attempt_1619105382691_0001_1_05_000006_0 due to pre-emption to run attempt_1619105382691_0001_1_05_000014_0 2021-04-22T15:30:45.126Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8 2021-04-22T15:30:45.126Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000011_0, state=true 2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Kill task requested for id=attempt_1619105382691_0001_1_05_000006_0, taskRunnerSetup=true 2021-04-22T15:30:45.127Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Issuing kill to task attempt_1619105382691_0001_1_05_000006_0 2021-04-22T15:30:45.127Z query-executor-0 class="impl.QueryInfo$FinishableStateTracker" level="INFO" thread="IPC Server handler 3 on 25000"] Now notifying: Map 8 2021-04-22T15:30:45.127Z query-executor-0 class="task.TezTaskRunner2" level="INFO" thread="Wait-Queue-Scheduler-0"] Attempting to abort attempt_1619105382691_0001_1_05_000006_0 due to an invocation of killTask 2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort 2021-04-22T15:30:45.128Z query-executor-0 class="tez.TezProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to RecordProcessor 2021-04-22T15:30:45.128Z query-executor-0 class="runtime.LogicalIOProcessorRuntimeTask" dagId="dag_1619105382691_0001_1" fragmentId="1619105382691_0001_1_05_000011_0" level="INFO" queryId="hive_20210422153013_397b96bf-d5a6-493a-9c51-9446f64eeed4" thread="TezTR-382691_1_1_5_11_0"] Waiting for 1 initializers to finish 2021-04-22T15:30:45.128Z query-executor-0 class="tez.MapRecordProcessor" level="INFO" thread="Wait-Queue-Scheduler-0"] Forwarding abort to mapOp: {} MAP 2021-04-22T15:30:45.128Z query-executor-0 class="vector.VectorMapOperator" level="INFO" thread="Wait-Queue-Scheduler-0"] Received abort in operator: MAP 2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskRunnerCallable" level="INFO" thread="Wait-Queue-Scheduler-0"] Kill request for task attempt_1619105382691_0001_1_05_000006_0 completed. Informing AM 2021-04-22T15:30:45.128Z query-executor-0 class="impl.TaskExecutorService" level="INFO" thread="IPC Server handler 3 on 25000"] vertex: Map 8 Received finishable state update for attempt_1619105382691_0001_1_05_000006_0, state=true
Attachments
Attachments
Issue Links
- links to