Spark / SPARK-22765

Create a new executor allocation scheme based on that of MR


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: Scheduler, Spark Core

    Description

      Many users migrating their workloads from MR to Spark find a significant hike in resource consumption (e.g., SPARK-22683). While this might not be a concern for users who are more performance-centric, for those conscious about cost, such a hike creates a migration obstacle. This situation can get worse as more users move to the cloud.

      Dynamic allocation makes it possible for Spark to be deployed in a multi-tenant environment. However, because of its performance-centric design, its inefficiency has unfortunately also shown up, especially when compared with MR. Thus, it's believed that an MR-style scheduler still has merit. Based on our research, the inefficiency associated with dynamic allocation comes from many sources, such as executors idling out, bigger executors, and many stages in a Spark job (rather than only the 2 stages in MR).
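For context, the idle-out and executor-sizing costs mentioned above correspond to existing dynamic allocation knobs. A minimal spark-defaults.conf sketch follows; the property names are from the Spark configuration documentation, while the values shown are purely illustrative, not recommendations:

```properties
# Enable dynamic allocation; in this era of Spark it also requires
# the external shuffle service so executors can be released safely.
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true

# How long an executor may sit idle before being released --
# the "idling out" cost discussed above (illustrative value).
spark.dynamicAllocation.executorIdleTimeout  60s

# Bounds on the executor pool (illustrative values).
spark.dynamicAllocation.minExecutors         0
spark.dynamicAllocation.maxExecutors         50
```

Tuning these knobs can narrow the efficiency gap, but the proposal below argues for a separate scheduling scheme rather than further fine-tuning.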

      Rather than fine-tuning dynamic allocation for efficiency, the proposal here is to add a new, efficiency-centric scheduling scheme based on that of MR. Such an MR-based scheme can be further enhanced and better adapted to Spark's execution model. This alternative is expected to offer a good performance improvement over MR while achieving efficiency similar to or even better than MR's.

      Inputs are greatly welcome!


              People

                Assignee: Unassigned
                Reporter: Xuefu Zhang (xuefuz)
                Votes: 0
                Watchers: 6
