Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 3.1.0
Description
SPARK-33099 added support for respecting "spark.dynamicAllocation.executorIdleTimeout" in ExecutorPodsAllocator. However, when it checks whether a pending executor pod has timed out, it checks against the pod's "startTime". A pending pod's "startTime" is empty, which causes the function "isExecutorIdleTimedOut()" to always return true for pending pods.
As a result, pending pods are deleted immediately when a stage finishes, and several new pods have to be recreated for the next stage.
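The failure mode can be sketched as follows. This is a minimal illustration in Java with hypothetical names, not Spark's actual Scala code: an empty "startTime" effectively becomes the epoch, so the elapsed time always exceeds the idle timeout. One plausible remedy, shown alongside, is to fall back to a timestamp that is always set (such as the pod's creation time):

```java
import java.time.Instant;
import java.util.Optional;

public class IdleTimeoutSketch {
    static final long IDLE_TIMEOUT_MS = 60_000; // illustrative idle timeout

    // Buggy variant: an empty startTime falls back to the epoch, so the
    // elapsed time is huge and every pending pod looks idle-timed-out.
    static boolean isExecutorIdleTimedOutBuggy(Optional<Instant> startTime, Instant now) {
        Instant start = startTime.orElse(Instant.EPOCH);
        return now.toEpochMilli() - start.toEpochMilli() > IDLE_TIMEOUT_MS;
    }

    // Safer variant: fall back to the pod's creation timestamp, which the
    // API server always populates, so a freshly created pending pod is not
    // considered timed out.
    static boolean isExecutorIdleTimedOutFixed(Optional<Instant> startTime,
                                               Instant creationTime, Instant now) {
        Instant start = startTime.orElse(creationTime);
        return now.toEpochMilli() - start.toEpochMilli() > IDLE_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant created = now.minusSeconds(5);           // pod created 5s ago, still pending
        Optional<Instant> pendingStart = Optional.empty(); // pending pods have no startTime

        System.out.println(isExecutorIdleTimedOutBuggy(pendingStart, now));          // true
        System.out.println(isExecutorIdleTimedOutFixed(pendingStart, created, now)); // false
    }
}
```

The key point is that the timeout check must distinguish "never started" from "started long ago"; using a fallback timestamp that is always present avoids conflating the two.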