SPARK-13723

YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Spark Core, YARN
    • Labels: None

    Description

      I think we should change the behavior when --num-executors is specified and dynamic allocation is enabled. Currently, if --num-executors is specified, dynamic allocation is disabled and a static number of executors is used.

      I would rather see the default behavior changed in the 2.x line: if the dynamic allocation config is on, then --num-executors sets both the maximum and the initial number of executors. This would let users easily cap their usage while still allowing idle executors to be freed. It would also let users doing ML start out with a fixed number of executors; if they are actually caching the data, those executors wouldn't be freed up, so the behavior would be very similar to having dynamic allocation off. A sketch of this mapping is below.
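
      As a rough illustration (not the actual patch), the proposed resolution could look something like the sketch below. The config keys are the real ones; the object and method names are hypothetical:

          import org.apache.spark.SparkConf

          object ExecutorCountResolution {
            // Proposed behavior: with dynamic allocation on, --num-executors
            // (spark.executor.instances) becomes both the initial and maximum
            // executor count instead of disabling dynamic allocation outright.
            def resolve(conf: SparkConf): (Int, Int) = {
              val dynamicAllocation =
                conf.getBoolean("spark.dynamicAllocation.enabled", false)
              val numExecutors =
                conf.getOption("spark.executor.instances").map(_.toInt)

              (dynamicAllocation, numExecutors) match {
                case (true, Some(n)) =>
                  // Proposed: cap usage at n but still free idle executors.
                  (n, n)
                case (true, None) =>
                  // Unchanged: fall back to the dynamic allocation settings.
                  val initial = conf.getInt("spark.dynamicAllocation.initialExecutors",
                    conf.getInt("spark.dynamicAllocation.minExecutors", 0))
                  val max = conf.getInt("spark.dynamicAllocation.maxExecutors", Int.MaxValue)
                  (initial, max)
                case (false, n) =>
                  // Static allocation: a fixed executor count for the whole app
                  // (the YARN default is 2).
                  val fixed = n.getOrElse(2)
                  (fixed, fixed)
              }
            }
          }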

      Part of the reason for this is that using a static number generally wastes resources, especially with people doing ad hoc things in spark-shell. It also has a big effect when people are doing MapReduce/ETL-type workloads. The problem is that people are used to specifying --num-executors, so if we turn dynamic allocation on by default in a cluster config, it just gets overridden.

      We should also update the spark-submit --help description for --num-executors.
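
      For example, under the proposed behavior the help entry might read something like this (illustrative wording only, not the final patch):

          --num-executors NUM    Number of executors to launch (Default: 2). If dynamic
                                 allocation is enabled, this sets the initial and maximum
                                 number of executors.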


            People

              Assignee: rdblue Ryan Blue
              Reporter: tgraves Thomas Graves
              Votes: 0
              Watchers: 11
