SPARK-13723

YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Spark Core, YARN
    • Labels: None

    Description

      I think we should change the behavior when --num-executors is specified and dynamic allocation is enabled. Currently, if --num-executors is specified, dynamic allocation is disabled and a static number of executors is used.

      I would rather see the default behavior changed in the 2.x line: if the dynamic allocation config is on, then --num-executors sets both the maximum and the initial number of executors. This would let users easily cap their usage while still allowing idle executors to be freed. It would also let users doing ML start out with a fixed number of executors; if they are actually caching the data, those executors wouldn't be freed up, so the behavior would be very similar to having dynamic allocation off. A sketch of this mapping is below.
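
      As a rough illustration (not the actual patch), the proposed resolution could look something like the sketch below. The config keys are the real ones; the object and method names are hypothetical:

          import org.apache.spark.SparkConf

          object ExecutorCountResolution {
            // Proposed behavior: with dynamic allocation on, --num-executors
            // (spark.executor.instances) becomes both the initial and maximum
            // executor count instead of disabling dynamic allocation outright.
            def resolve(conf: SparkConf): (Int, Int) = {
              val dynamicAllocation =
                conf.getBoolean("spark.dynamicAllocation.enabled", false)
              val numExecutors =
                conf.getOption("spark.executor.instances").map(_.toInt)

              (dynamicAllocation, numExecutors) match {
                case (true, Some(n)) =>
                  // Proposed: cap usage at n but still free idle executors.
                  (n, n)
                case (true, None) =>
                  // Unchanged: fall back to the dynamic allocation settings.
                  val initial = conf.getInt("spark.dynamicAllocation.initialExecutors",
                    conf.getInt("spark.dynamicAllocation.minExecutors", 0))
                  val max = conf.getInt("spark.dynamicAllocation.maxExecutors", Int.MaxValue)
                  (initial, max)
                case (false, n) =>
                  // Static allocation: a fixed executor count for the whole app
                  // (the YARN default is 2).
                  val fixed = n.getOrElse(2)
                  (fixed, fixed)
              }
            }
          }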

      Part of the reason for this is that using a static number generally wastes resources, especially with people doing ad hoc things in spark-shell. It also has a big effect when people are doing MapReduce/ETL-type workloads. The problem is that people are used to specifying --num-executors, so if we turn dynamic allocation on by default in a cluster config, it just gets overridden.

      We should also update the spark-submit --help description for --num-executors.
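
      For example, under the proposed behavior the help entry might read something like this (illustrative wording only, not the final patch):

          --num-executors NUM    Number of executors to launch (Default: 2). If dynamic
                                 allocation is enabled, this sets the initial and maximum
                                 number of executors.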


            People

              Assignee: rdblue Ryan Blue
              Reporter: tgraves Thomas Graves
              Votes: 0
              Watchers: 11
