Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version: 3.0.1
None
None
Description
If the cluster does not have sufficient resources (CPU/memory) for the minimum number of executors, the executors stay in the Pending state indefinitely until resources are freed.
Suppose the cluster configuration is:
total memory = 204Gi
used memory = 200Gi
free memory = 4Gi
spark.executor.memory=10G
spark.dynamicAllocation.minExecutors=4
spark.dynamicAllocation.maxExecutors=8
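To make the shortfall concrete, here is a minimal sketch of the arithmetic behind the figures above. It assumes that the G and Gi values are the same unit and that no per-executor memory overhead is requested, which understates the real pod request. The function names are illustrative only and are not part of Spark.

```python
def executors_that_fit(free_memory_gib: float, executor_memory_gib: float) -> int:
    """Number of whole executors that fit into the free memory (overhead ignored)."""
    return int(free_memory_gib // executor_memory_gib)


def min_executors_satisfiable(free_memory_gib: float,
                              executor_memory_gib: float,
                              min_executors: int) -> bool:
    """True if the cluster can host at least min_executors executors right now."""
    return executors_that_fit(free_memory_gib, executor_memory_gib) >= min_executors


# Figures from the description above.
total_gib, used_gib = 204, 200
free_gib = total_gib - used_gib          # 4
executor_memory_gib = 10                 # executor memory setting
min_executors = 4                        # dynamic allocation minimum

required_gib = executor_memory_gib * min_executors
print(required_gib)                                                   # 40
print(executors_that_fit(free_gib, executor_memory_gib))              # 0
print(min_executors_satisfiable(free_gib, executor_memory_gib,
                                min_executors))                       # False
```

With 4Gi free and 40Gi required, not even one executor can be scheduled, so all requested executors remain Pending.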
Instead, the job should be cancelled if the requested minimum number of executors cannot be allocated at that point in time because of resource unavailability.
Currently, it does partial scheduling or no scheduling and waits indefinitely, so the job gets stuck.
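The requested fail-fast behaviour could be approximated on the application side with a bounded wait. The sketch below is only an illustration of that idea, not how Spark behaves or is implemented: `get_executor_count` is a hypothetical callable that the application would have to supply (for example, backed by whatever executor listing it has access to), and the timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_for_min_executors(get_executor_count: Callable[[], int],
                           min_executors: int,
                           timeout_s: float,
                           poll_interval_s: float = 1.0) -> bool:
    """Poll until at least min_executors are registered or the timeout expires.

    get_executor_count is a hypothetical hook supplied by the caller.
    Returns True if the minimum was reached in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_executor_count() >= min_executors:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Example: a cluster that never provides any executor.
ready = wait_for_min_executors(lambda: 0, min_executors=4,
                               timeout_s=0.05, poll_interval_s=0.01)
if not ready:
    # The caller decides what "cancel" means, e.g. stopping the application.
    print("minimum executors not available, cancelling job")
```

Returning a boolean rather than cancelling inside the helper keeps the decision with the caller, which can then stop the application or retry later.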