Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.0.0, 3.1.0
- Labels: None
Description
Dynamic scaling on Kubernetes (introduced in Spark 3) can only shut down executors that hold no shuffle files. However, Spark does not aggressively clean up shuffle files (see SPARK-5836) and instead depends on JVM GC on the driver to trigger deletes. We already have a mechanism to explicitly clean up shuffle files, used by the ALS algorithm, where we create a lot of quickly orphaned shuffle files. We should expose this as an advanced developer feature so that users can better clean up shuffle files and improve dynamic scaling of their jobs on Kubernetes.
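As a rough illustration of how such a developer API could be used, here is a minimal sketch. It assumes the method is exposed as `RDD.cleanShuffleDependencies` (the name referenced in the linked SPARK-38417); the example job and the explicit `blocking` flag are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

object ShuffleCleanupSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-cleanup-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // A shuffle-producing job: reduceByKey writes shuffle files on the executors.
    val counts = sc.parallelize(1 to 1000000)
      .map(i => (i % 100, 1L))
      .reduceByKey(_ + _)

    // Materialize and cache the result so the shuffle output is no longer needed
    // to recompute it cheaply.
    counts.persist().count()

    // Advanced developer API (experimental, see SPARK-38417): asynchronously
    // remove this RDD's shuffle dependencies so their files can be deleted and
    // the executors holding them become candidates for scale-down.
    counts.cleanShuffleDependencies(blocking = false)

    spark.stop()
  }
}
```

Under dynamic allocation with shuffle tracking (spark.dynamicAllocation.shuffleTracking.enabled), executors that no longer host live shuffle blocks can then be released without waiting for driver-side GC to trigger the cleanup.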
Issue Links
- is related to
  - SPARK-38417 Remove `Experimental` from `RDD.cleanShuffleDependencies` API (Resolved)
- relates to
  - SPARK-5836 Highlight in Spark documentation that by default Spark does not delete its temporary files (Resolved)
- links to