SPARK-16630: Blacklist a node if executors won't launch on it.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.2
    • Fix Version/s: 2.4.0
    • Component/s: Spark Core, YARN
    • Labels: None

    Description

      On YARN, it's possible for a node to be broken or misconfigured such that containers won't launch on it. For instance, the Spark external shuffle handler may not have been loaded on it, or there may be some other hardware or Hadoop configuration issue.

      It would be nice if we could recognize when this happens and stop trying to launch executors on that node, since repeated attempts could cause us to hit our maximum number of executor failures and then kill the job.
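
      To make the request concrete, here is a minimal sketch of the idea in Python. It is not the change that resolved this issue: the class name, method names, and the threshold parameter are all hypothetical. It only illustrates counting launch failures per node and excluding a node once it crosses a limit.

```python
class NodeLaunchFailureTracker:
    """Hypothetical helper: counts executor launch failures per node."""

    def __init__(self, max_failures_per_node=2):
        # Assumed threshold; the issue does not specify a value.
        self.max_failures_per_node = max_failures_per_node
        self._failures = {}

    def record_launch_failure(self, node):
        """Note that a container failed to launch on `node`."""
        self._failures[node] = self._failures.get(node, 0) + 1

    def is_blacklisted(self, node):
        """A node is excluded once it reaches the failure threshold."""
        return self._failures.get(node, 0) >= self.max_failures_per_node

    def usable_nodes(self, nodes):
        """Filter out blacklisted nodes, preserving input order."""
        return [n for n in nodes if not self.is_blacklisted(n)]
```

      An allocator could call `record_launch_failure` whenever a container fails to start and pass `usable_nodes(...)` when requesting new containers, so a bad node stops consuming the job's executor failure budget.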

            People

              Assignee: Attila Zsolt Piros (attilapiros)
              Reporter: Thomas Graves (tgraves)
              Votes: 1
              Watchers: 11
