Details
Sub-task
Status: Resolved
Major
Resolution: Fixed
3.1.0, 3.2.0
None
Description
Some versions of Kubernetes may set a pod's deletion timestamp before changing the pod's status to terminating, so a pod on a decommissioning node may have a deletion timestamp while still reporting a phase of Running. Depending on when the K8s snapshot arrives, this can cause a race condition in which Spark believes the pod has been deleted before it actually has been.
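The race can be illustrated with a minimal sketch. This is not Spark's actual code: `PodView`, `naive_is_deleted`, and `careful_is_deleted` are hypothetical names, and the rule that a pod still reporting "running" or "terminating" should not yet count as deleted is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodView:
    """Hypothetical, simplified view of a pod as seen in one snapshot."""
    deletion_timestamp: Optional[str]  # None if deletion has not been requested
    phase: Optional[str]               # e.g. "Running", "Failed", or None


def naive_is_deleted(pod: PodView) -> bool:
    # Treats any pod with a deletion timestamp as already gone.
    # This is the check that is vulnerable to the race described above.
    return pod.deletion_timestamp is not None


def careful_is_deleted(pod: PodView) -> bool:
    # Assumption: a pod that has a deletion timestamp but still reports
    # a running or terminating phase has not finished shutting down yet.
    if pod.deletion_timestamp is None:
        return False
    phase = (pod.phase or "").lower()
    return phase not in ("running", "terminating")


# A snapshot taken after the timestamp is set but before the phase changes.
racing_pod = PodView(deletion_timestamp="2020-08-19T00:00:00Z", phase="Running")
print(naive_is_deleted(racing_pod))
print(careful_is_deleted(racing_pod))
```

With the naive check, the snapshot above would be read as a deleted pod even though it is still running; the more careful check waits until the phase also reflects the shutdown.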