Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
- If node “a” has a higher rate of task failures than the rest of the cluster, node “a” should be blacklisted.
- Only a specific set of failures should count against a node. Query-specific failures, such as reading corrupted files or exceeding the memory limit, should not count.
- This is similar to how Spark executor blacklisting works.
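
The rule described above could be sketched roughly as follows. This is only an illustration in Python, not Impala's implementation: the class `NodeFailureTracker`, the failure-kind strings, and the `rate_multiplier` and `min_tasks` parameters are all hypothetical names chosen for the example. The issue does not specify how much higher a node's failure rate must be, so the multiplier is an assumption.

```python
from collections import defaultdict

# Hypothetical labels for query-specific failures that should NOT count
# against a node (the issue names corrupted files and mem limit exceeded).
QUERY_SPECIFIC = {"corrupt_file", "mem_limit_exceeded"}


class NodeFailureTracker:
    """Tracks per-node task outcomes and suggests nodes to blacklist."""

    def __init__(self, rate_multiplier=2.0, min_tasks=10):
        # Assumed policy knobs: a node is a candidate when its failure rate
        # exceeds rate_multiplier times the rate of all other nodes combined,
        # and only once it has run at least min_tasks tasks.
        self.rate_multiplier = rate_multiplier
        self.min_tasks = min_tasks
        self.tasks = defaultdict(int)
        self.failures = defaultdict(int)

    def record(self, node, failure_kind=None):
        """Record one task; failure_kind is None when the task succeeded."""
        self.tasks[node] += 1
        if failure_kind is not None and failure_kind not in QUERY_SPECIFIC:
            self.failures[node] += 1

    def blacklist_candidates(self):
        """Return nodes whose countable failure rate stands out."""
        candidates = []
        for node, node_tasks in self.tasks.items():
            if node_tasks < self.min_tasks:
                continue
            other_tasks = sum(t for n, t in self.tasks.items() if n != node)
            if other_tasks == 0:
                continue  # nothing to compare against
            other_failures = sum(
                f for n, f in self.failures.items() if n != node
            )
            node_failures = self.failures.get(node, 0)
            node_rate = node_failures / node_tasks
            rest_rate = other_failures / other_tasks
            if node_failures > 0 and node_rate > self.rate_multiplier * rest_rate:
                candidates.append(node)
        return sorted(candidates)


tracker = NodeFailureTracker()
for i in range(10):
    # Node "a" fails half its tasks with a node-level error (hypothetical kind).
    tracker.record("a", "rpc_error" if i % 2 == 0 else None)
    # Node "b" always succeeds.
    tracker.record("b")
    # Node "c" fails only for query-specific reasons, which are ignored.
    tracker.record("c", "mem_limit_exceeded" if i % 2 == 0 else None)

print(tracker.blacklist_candidates())
```

Comparing each node against the combined rate of the other nodes keeps a single bad node from inflating the baseline it is measured against. A cluster-wide problem raises every node's rate together, so it does not single out any one node.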
Issue Links
- Blocked: IMPALA-10477 Mark executor node as down if it repeatedly failed to startup fragment instance (Open)