SPARK-19755: Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As a result, the scheduler cannot create an executor after some time.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Mesos, Scheduler, Spark Core
    • Environment: mesos, marathon, docker - driver and executors are dockerized.

    Description

      When a task fails for some reason, MesosCoarseGrainedSchedulerBackend increases the failure counter for the slave where that task was running.
      When the counter reaches MAX_SLAVE_FAILURES (2), that Mesos slave is excluded.
      Over time the scheduler cannot create a new executor because every slave ends up in the blacklist. A task failure is not necessarily related to host health, especially for long-running streaming apps.
      If accepted as a bug: a possible solution is to use spark.blacklist.enabled to make that functionality optional and, if it makes sense, also make MAX_SLAVE_FAILURES configurable.
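      For illustration, a minimal Scala sketch (not the actual MesosCoarseGrainedSchedulerBackend source) of the failure accounting described above; MAX_SLAVE_FAILURES mirrors the constant named in this report, everything else is made up for the example:

      {code:scala}
      import scala.collection.mutable

      // Hedged sketch of the behaviour described in this issue; illustration only.
      object SlaveBlacklistSketch {
        // Hard-coded threshold, as described above.
        val MAX_SLAVE_FAILURES = 2

        // Per-slave failure counters, keyed by slave/agent id.
        private val slaveFailures = mutable.Map.empty[String, Int].withDefaultValue(0)

        // Called whenever a task running on the given slave fails,
        // regardless of whether the host itself is healthy.
        def recordFailure(slaveId: String): Unit =
          slaveFailures(slaveId) += 1

        // Once a slave accumulates MAX_SLAVE_FAILURES failures it is treated as
        // blacklisted and its offers are declined, so no new executor can ever
        // be placed on it for the lifetime of the application.
        def isBlacklisted(slaveId: String): Boolean =
          slaveFailures(slaveId) >= MAX_SLAVE_FAILURES
      }
      {code}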

      Attachments

        Issue Links

          Activity

            kayousterhout Kay Ousterhout added a comment -

            I'm closing this because the configs you're proposing adding already exist: spark.blacklist.enabled already exists to turn off all blacklisting (this is false by default, so the fact that you're seeing blacklisting behavior means that your configuration enables blacklisting), and spark.blacklist.maxFailedTaskPerExecutor is the other thing you proposed adding. All of the blacklisting parameters are listed here: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/config/package.scala#L101

            Feel free to re-open this if I've misunderstood and the existing configs don't address the issues you're seeing!
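            For context, a minimal example of how the switch mentioned above is set via SparkConf; spark.blacklist.enabled is the documented key (false by default in Spark 2.1), while the exact names of the per-executor threshold keys vary between Spark versions, so they are omitted here:

            {code:scala}
            import org.apache.spark.SparkConf

            // Turn off the task/executor blacklisting controlled by this key;
            // it is false by default in Spark 2.1.
            val conf = new SparkConf()
              .setAppName("blacklist-example")
              .set("spark.blacklist.enabled", "false")
            {code}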

            timout Timur Abakumov added a comment -

            You are right - the configuration parameter exists.
            But from what I can see, MesosCoarseGrainedSchedulerBackend.scala does not use it.
            It uses a hard-coded MAX_SLAVE_FAILURES = 2.
            If I missed something, please explain it.
            I have fixed it for my company and can create a pull request and assign it to you.
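            A sketch of the kind of change described here: reading the threshold from the Spark configuration instead of a hard-coded constant. The key name "spark.mesos.maxSlaveFailures" is hypothetical and only used for illustration; the actual pull request may use a different name:

            {code:scala}
            import org.apache.spark.SparkConf

            // Hypothetical sketch: the config key below is invented for
            // illustration and is not a released Spark setting.
            class SlaveFailurePolicy(conf: SparkConf) {
              // Fall back to the historical hard-coded default of 2.
              private val maxSlaveFailures: Int =
                conf.getInt("spark.mesos.maxSlaveFailures", 2)

              def shouldExclude(failureCount: Int): Boolean =
                failureCount >= maxSlaveFailures
            }
            {code}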

            apachespark Apache Spark added a comment -

            User 'timout' has created a pull request for this issue:
            https://github.com/apache/spark/pull/17619

            igor.berman Igor Berman added a comment -

            This Jira is very relevant when running with dynamic allocation turned on, where starting and stopping executors is part of the natural lifecycle of the driver. The chances of failing when starting an executor increase (e.g. due to transient port collisions).

            The threshold of 2 seems too low and artificial for these use cases. I've observed a situation where at some point almost 1/3 of the mesos-slave nodes were marked as blacklisted (but they were fine). This creates a situation where the cluster has free resources, but frameworks can't use them since they actively decline offers from the master.
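            For reference, a minimal example of the dynamic-allocation settings under which this failure mode tends to appear (both keys are standard Spark configuration; the values are only examples, and on Mesos dynamic allocation also requires the external shuffle service):

            {code:scala}
            import org.apache.spark.SparkConf

            // With dynamic allocation, executors are started and stopped as load
            // changes, so transient launch failures are more likely to push a
            // slave past the failure threshold.
            val conf = new SparkConf()
              .set("spark.dynamicAllocation.enabled", "true")
              .set("spark.shuffle.service.enabled", "true") // needed for dynamic allocation
              .set("spark.dynamicAllocation.minExecutors", "1")
              .set("spark.dynamicAllocation.maxExecutors", "20")
            {code}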

            apachespark Apache Spark added a comment -

            User 'IgorBerman' has created a pull request for this issue:
            https://github.com/apache/spark/pull/20640


            People

              Assignee: Unassigned
              Reporter: Timur Abakumov (timout)
              Votes: 2
              Watchers: 9
