SPARK-30689

Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core, YARN
    • Labels: None

    Description

      Many people/companies will not be moving soon to Hadoop 3.1 or greater, which supports custom resource scheduling for things like GPUs, and have requested support for it in older Hadoop 2.x versions. This also means that they may not have isolation enabled, which is what the default behavior relies on.

      Right now the only option is to write a custom discovery script and handle it on their own. This is OK but has some limitations because the script runs as a separate process. It is also just a shell script.

      I think we can make this a lot more flexible by making the entire resource discovery class pluggable. The default one would stay as is and call the discovery script, but an advanced user who wanted to replace the entire thing could implement a pluggable class containing custom code for how to discover resource addresses.
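
      As a rough illustration of the idea (not the Spark implementation, which would live on the JVM side), the sketch below models a pluggable discovery class in Python. Every name in it (ResourceInfo, ResourceDiscovery, ScriptDiscovery, EnvVarDiscovery, discover_with_plugins, the MY_RESOURCE_ prefix) is hypothetical. It assumes the discovery script prints JSON of the form {"name": "gpu", "addresses": ["0", "1"]}, and the "try plugins in order, script last" chaining is just one possible design.

```python
import json
import subprocess
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Mapping, Optional, Sequence


@dataclass(frozen=True)
class ResourceInfo:
    """A discovered resource and its addresses (hypothetical type)."""
    name: str
    addresses: List[str]


class ResourceDiscovery(ABC):
    """Hypothetical plugin interface: return None to decline a resource."""

    @abstractmethod
    def discover(self, resource_name: str) -> Optional[ResourceInfo]:
        ...


class ScriptDiscovery(ResourceDiscovery):
    """Default behavior: run an external script as a separate process.

    Assumes the script prints JSON like {"name": "gpu", "addresses": ["0"]}.
    """

    def __init__(self, argv: Sequence[str]):
        self.argv = list(argv)

    def discover(self, resource_name: str) -> Optional[ResourceInfo]:
        result = subprocess.run(
            self.argv, capture_output=True, text=True, check=True
        )
        data = json.loads(result.stdout)
        if data.get("name") != resource_name:
            return None
        return ResourceInfo(data["name"], [str(a) for a in data["addresses"]])


class EnvVarDiscovery(ResourceDiscovery):
    """Custom in-process plugin: read addresses from a comma-separated
    variable in the given environment mapping (illustrative only)."""

    def __init__(self, env: Mapping[str, str], prefix: str = "MY_RESOURCE_"):
        self.env = env
        self.prefix = prefix

    def discover(self, resource_name: str) -> Optional[ResourceInfo]:
        raw = self.env.get(self.prefix + resource_name.upper())
        if not raw:
            return None
        addresses = [a.strip() for a in raw.split(",") if a.strip()]
        return ResourceInfo(resource_name, addresses)


def discover_with_plugins(
    plugins: Sequence[ResourceDiscovery], resource_name: str
) -> ResourceInfo:
    """Return the first non-None result, trying plugins in order."""
    for plugin in plugins:
        info = plugin.discover(resource_name)
        if info is not None:
            return info
    raise LookupError(f"no plugin could discover resource {resource_name!r}")
```

      A caller could list a custom plugin first and keep the script-based one as a fallback, for example discover_with_plugins([EnvVarDiscovery(os.environ), ScriptDiscovery(["/path/to/getGpus.sh"])], "gpu"), where the script path is a placeholder. Because the custom plugin runs in-process, it is not limited to what a shell script in a separate process can do.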

      This will also help users who are running Hadoop 3.1.x or greater but don't have the resources configured or aren't running in an isolated environment.

          People

            Assignee: Thomas Graves (tgraves)
            Reporter: Thomas Graves (tgraves)