Spark / SPARK-44951

Improve Spark Dynamic Allocation


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core, YARN
    • Labels: None

Description

For Spark 4 we should aim to improve Spark's dynamic allocation. Some potential ideas include the following:

    • Pluggable DEA (dynamic executor allocation) algorithms; a possible interface is sketched after this list.
    • Reduce waste on the resource manager (RM) side: sometimes the driver requests resources, but by the time the RM provisions them the driver no longer needs them and cancels the request.
    • Support for "warm" executor pools that are not tied to a particular driver, but instead start up and wait for a driver to connect and "claim" them.
    • More explicit cost vs. application-runtime configuration: a good DEA algorithm should let the developer choose between cost and runtime, since some developers are willing to pay more for faster execution; a possible configuration surface is sketched below.
    • Use information from previous runs to inform future runs.
    • Better selection of which executors to scale down.
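
To make the pluggable-algorithms idea concrete, a policy interface could look roughly like the sketch below. This is only a sketch: Spark exposes no such SPI today, and the trait name, method signatures, and the BacklogPolicy example are all hypothetical.

{code:scala}
// Hypothetical sketch only: Spark does not currently expose a pluggable
// allocation-policy interface; every name and signature here is invented
// for illustration.
trait ExecutorAllocationPolicy {

  /** Return the desired total number of executors for the current load. */
  def targetExecutors(
      pendingTasks: Int,
      runningTasks: Int,
      currentExecutors: Int): Int

  /** Choose which idle executors to release when scaling down. */
  def selectExecutorsToRemove(idleExecutorIds: Seq[String], count: Int): Seq[String]
}

/** An example policy mirroring today's backlog-based behavior. */
class BacklogPolicy(tasksPerExecutor: Int) extends ExecutorAllocationPolicy {

  override def targetExecutors(
      pendingTasks: Int,
      runningTasks: Int,
      currentExecutors: Int): Int = {
    val needed =
      math.ceil((pendingTasks + runningTasks).toDouble / tasksPerExecutor).toInt
    math.max(needed, 1) // never request fewer than one executor
  }

  // Naive choice: release the first idle executors. A smarter policy could
  // prefer executors holding no cached blocks, addressing the last idea above.
  override def selectExecutorsToRemove(
      idleExecutorIds: Seq[String],
      count: Int): Seq[String] =
    idleExecutorIds.take(count)
}
{code}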
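The cost vs. runtime trade-off could similarly be surfaced through configuration. In the sketch below the dynamic allocation settings are real Spark configs, while spark.dynamicAllocation.runtimePreference is purely hypothetical, shown only to illustrate the shape such a knob might take.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("dea-cost-vs-runtime")
  // Existing dynamic allocation knobs (these configs exist in Spark today).
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "2")
  .config("spark.dynamicAllocation.maxExecutors", "100")
  .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
  // HYPOTHETICAL knob, not a real Spark config: 0.0 would mean
  // "minimize cost", 1.0 "minimize runtime".
  .config("spark.dynamicAllocation.runtimePreference", "0.7")
  .getOrCreate()
{code}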


People

    Assignee: Unassigned
    Reporter: Holden Karau (holden)

Dates

    Created:
    Updated:
