Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.2.0
Description
Currently shuffle data is not cleaned up when an external shuffle service is used and the associated executor has been deallocated before the shuffle is cleaned up. Shuffle data is only cleaned up once the application ends.
There have been various issues filed for this:
https://issues.apache.org/jira/browse/SPARK-26020
https://issues.apache.org/jira/browse/SPARK-17233
https://issues.apache.org/jira/browse/SPARK-4236
But shuffle files will still stick around until an application completes. Dynamic allocation is commonly used for long running jobs (such as structured streaming), so any long running jobs with a large shuffle involved will eventually fill up local disk space. The shuffle service already supports cleaning up shuffle service persisted RDDs, so it should be able to support cleaning up shuffle blocks as well once the shuffle is removed by the ContextCleaner.
The current alternative is to use shuffle tracking instead of an external shuffle service, but this is less optimal from a resource perspective as all executors must be kept alive until the shuffle has been fully consumed and cleaned up (and with the default GC interval being 30 minutes this can waste a lot of time with executors held onto but not doing anything).
Attachments
Issue Links
- is related to
-
SPARK-47448 Enable spark.shuffle.service.removeShuffle by default
- Resolved
- relates to
-
SPARK-38005 Support cleaning up merged shuffle files and state from external shuffle service
- Resolved
- links to