Details
- Type: Improvement
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.0.0
- Fix Version/s: None
- Component/s: None
Description
While we've done some good work on handling the case where Spark itself chooses to decommission nodes (SPARK-7955), in environments where we get preempted without our own choice (e.g. YARN over-commit, EC2 spot instances, GCE preemptible instances, etc.) it would make sense to try to migrate the data off the node, or at least stop scheduling any new tasks on it.
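For context, the decommissioning work that grew out of this ticket is exposed through the spark.decommission.* / spark.storage.decommission.* configuration namespace (see sub-task 4 below). A minimal sketch of how a job running on spot/preemptible capacity might opt in, assuming a Spark 3.1+ release where these configs are available:
{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch only: opting in to decommission-aware behavior on preemptible
// capacity. Config names come from the spark.decommission.* /
// spark.storage.decommission.* namespace (Spark 3.1+); check your
// release's docs before relying on any of them.
val spark = SparkSession.builder()
  .appName("spot-friendly-job")
  // React to decommission notices instead of just losing the node.
  .config("spark.decommission.enabled", "true")
  // Migrate data off the decommissioning executor...
  .config("spark.storage.decommission.enabled", "true")
  // ...including cached RDD blocks and shuffle blocks.
  .config("spark.storage.decommission.rddBlocks.enabled", "true")
  .config("spark.storage.decommission.shuffleBlocks.enabled", "true")
  .getOrCreate()
{code}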
Issue Links
- is related to:
  - SPARK-7955 Dynamic allocation: longer timeout for executors with cached blocks (Closed)
  - SPARK-3174 Provide elastic scaling within a Spark application (Closed)
  - SPARK-33005 Kubernetes GA Preparation (Resolved)
  - SPARK-41550 Dynamic Allocation on K8S GA (Resolved)
Sub-Tasks
1. Improve cache block migration (Open, Unassigned)
2. Add an option to reject block migrations when under disk pressure (Open, Unassigned)
3. Improve ExecutorDecommissionInfo and ExecutorDecommissionState for different use cases (In Progress, Unassigned)
4. Rename all decommission configurations to use the same namespace "spark.decommission.*" (In Progress, Unassigned)
5. Do not drop cached RDD blocks to accommodate blocks from decommissioned block manager if enough memory is not available (In Progress, Unassigned)
6. Decommission executors in batches to avoid overloading network by block migrations (In Progress, Unassigned)
7. Put blocks only on disk while migrating RDD cached data (In Progress, Unassigned)
8. Decommission logs too frequent when waiting migration to finish (In Progress, Apache Spark)
9. Add support for YARN decommissioning when ESS is Enabled (In Progress, Unassigned)