Details
- Type: New Feature
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Versions: 2.4.0, 3.0.0
Description
Spark's BlacklistTracker maintains a list of "bad nodes" that it will not use for running tasks (e.g., because of bad hardware). When running on YARN, this blacklist is used to avoid ever allocating resources on blacklisted nodes: https://github.com/apache/spark/blob/e836c27ce011ca9aef822bef6320b4a7059ec343/resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala#L128
I'm just beginning to poke around the Kubernetes code, so apologies if this is incorrect, but I didn't see any references to scheduler.nodeBlacklist() in KubernetesClusterSchedulerBackend, so it seems this is missing. I thought of this while looking at SPARK-19755, a similar issue on Mesos.
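To illustrate, the YARN backend linked above essentially filters the hosts it requests containers on through scheduler.nodeBlacklist(). A minimal sketch of the same idea for a Kubernetes backend is below; this is not actual Spark code, and the object and method names (NodeBlacklistFilter, selectSchedulableNodes) are hypothetical, standing in for wherever the Kubernetes backend decides which nodes to place executor pods on:

```scala
// Hypothetical sketch, not Spark source: mirror what YarnSchedulerBackend
// does by excluding blacklisted hosts before requesting executors on them.
object NodeBlacklistFilter {
  // nodeBlacklist stands in for the result of scheduler.nodeBlacklist():
  // the set of hosts currently considered bad (e.g., failing hardware).
  def selectSchedulableNodes(
      candidateNodes: Seq[String],
      nodeBlacklist: Set[String]): Seq[String] =
    candidateNodes.filterNot(nodeBlacklist.contains)
}
```

In a real implementation this filtering would presumably be expressed through Kubernetes scheduling primitives (e.g., node affinity on the executor pod spec) rather than a plain list filter, but the decision point is the same: consult the blacklist before placing executors.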
Issue Links
- is related to:
  - SPARK-19755 Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As result - scheduler cannot create an executor after some time. (Resolved)
  - SPARK-16630 Blacklist a node if executors won't launch on it. (Resolved)