SPARK-16554: Spark should kill executors when they are blacklisted

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: Scheduler
    • Labels: None

      Description

      SPARK-8425 will allow blacklisting faulty executors and nodes. However, blacklisted executors will continue to run. This is bad for a few reasons:

      (1) Even with faulty hardware, if the cluster is under-utilized Spark may be able to request another executor on a different node.

      (2) If there is a faulty disk (the most common kind of hardware fault), the cluster manager may be able to allocate another executor on the same node, provided it can exclude the bad disk. (YARN will do this with its disk-health checker.)

      With dynamic allocation this may seem less critical, since a blacklisted executor stops running new tasks and is eventually reclaimed. However, if an executor holds cached data, it will not be killed until spark.dynamicAllocation.cachedExecutorIdleTimeout expires, which is effectively infinite by default.

      Users may not always want to kill bad executors, so this must be configurable to some extent. At a minimum, it should be possible to enable or disable the behavior; perhaps an executor should only be killed after it has been blacklisted a configurable N times.
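
      As a minimal sketch of the knobs involved (the spark.blacklist.* property names below match the configuration that shipped in Spark 2.2.0; the threshold and timeout values are purely illustrative):

        import org.apache.spark.SparkConf

        // Opt in to blacklisting and to killing blacklisted executors.
        // Both settings are off by default, so existing behavior is unchanged.
        val conf = new SparkConf()
          .setAppName("blacklist-kill-example")
          // Enable the blacklisting mechanism from SPARK-8425.
          .set("spark.blacklist.enabled", "true")
          // Kill executors once they are blacklisted for the whole application
          // (for node-level blacklisting, all executors on the node).
          .set("spark.blacklist.killBlacklistedExecutors", "true")
          // Illustrative threshold: failed tasks before an executor is
          // blacklisted for the entire application.
          .set("spark.blacklist.application.maxFailedTasksPerExecutor", "2")
          // With dynamic allocation, a finite timeout here keeps cached data
          // from pinning a bad executor forever (the default is effectively
          // infinite).
          .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "10m")

      Keeping the kill switch separate from spark.blacklist.enabled addresses the requirement above that users may not always want bad executors killed.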


          Activity

          mridulm80 Mridul Muralidharan added a comment -

          It would also be good if we could move currently hosted blocks off the node before it is killed, as a best-effort measure.
          This would reduce recomputation (if there is only a single copy of an RDD block) and/or prevent under-replication (if block replication > 1).
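
          For context on the two replication cases, replication is chosen per-persist via the StorageLevel; a minimal sketch (rdd and replicatedRdd are hypothetical RDDs standing in for any cached data):

            import org.apache.spark.storage.StorageLevel

            // Single copy: losing the hosting executor forces recomputation
            // of the block from lineage.
            rdd.persist(StorageLevel.MEMORY_ONLY)

            // Two copies (the "_2" levels): losing one executor leaves the
            // block under-replicated but still available, so migrating blocks
            // off a doomed executor would restore the replication factor
            // instead of recomputing.
            replicatedRdd.persist(StorageLevel.MEMORY_AND_DISK_2)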

          jsoltren Jose Soltren added a comment -

          This is a nice idea but I'm not sure how feasible it is.

          If we're going to kill an executor, it is because there has already been a task failure that makes the executor suspect. There could be a network problem, or lost or corrupt local storage. Even if we could recover hosted blocks off the executor, I'm not certain how trustworthy they would be.

          There would need to be some mechanism in place to determine if recovered blocks are valid, and perhaps a maximum number of tries or a timeout for this step as well. I'm not certain how often this would provide a benefit as compared to simply recomputing using lineage.

          I'll continue to think through this as I look further at the code.

          jsoltren Jose Soltren added a comment -

          I have some changes ready, but I'm going to wait for https://github.com/apache/spark/pull/16346 to land before sending a pull request, to avoid churn. Hopefully this happens in the next day or two.

          jsoltren Jose Soltren added a comment -

          Builds on some BlacklistTracker changes that should land first.

          apachespark Apache Spark added a comment -

          User 'jsoltren' has created a pull request for this issue:
          https://github.com/apache/spark/pull/16650

          irashid Imran Rashid added a comment -

          Issue resolved by pull request 16650
          https://github.com/apache/spark/pull/16650


            People

            • Assignee: jsoltren Jose Soltren
            • Reporter: irashid Imran Rashid
            • Shepherd: Imran Rashid
            • Votes: 0
            • Watchers: 6
