Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
Description
When spark.sql.execution.interruptOnCancel=true and spark.io.compression.codec=zstd, a memory leak occurs when tasks are cancelled at specific times. Cancelling a task interrupts the shuffle write, which then calls org.apache.spark.storage.DiskBlockObjectWriter#closeResources. That method closes only the ManualCloseOutputStream; the ZstdInputStreamNoFinalizer that wraps it is never closed. Moreover, ZstdInputStreamNoFinalizer does not implement a finalizer, so it is not reclaimed by GC automatically.
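
The close-ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use Spark or zstd-jni code: NativeBackedStream and the openNativeBuffers counter are hypothetical stand-ins for a compression stream that holds native memory and has no finalizer, and a ByteArrayOutputStream plays the role of the inner stream. The sketch only shows why closing the inner stream alone leaves the wrapper's resources allocated; it is not the fix that was applied to DiskBlockObjectWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

public class WrappedStreamCloseSketch {
    // Hypothetical counter standing in for native buffers held by a compression stream.
    static final AtomicInteger openNativeBuffers = new AtomicInteger();

    // Hypothetical wrapper without a finalizer: its "native" resource is
    // released only when close() is called explicitly on the wrapper itself.
    static class NativeBackedStream extends FilterOutputStream {
        private boolean released = false;

        NativeBackedStream(OutputStream inner) {
            super(inner);
            openNativeBuffers.incrementAndGet();
        }

        @Override
        public void close() throws IOException {
            if (!released) {
                released = true;
                openNativeBuffers.decrementAndGet();
            }
            super.close(); // flushes and closes the wrapped stream
        }
    }

    // Leaky pattern: only the inner stream is closed, so the wrapper's
    // resource stays allocated. Returns the number of buffers still open.
    public static int closeInnerOnly() {
        try {
            openNativeBuffers.set(0);
            OutputStream inner = new ByteArrayOutputStream();
            OutputStream outer = new NativeBackedStream(inner);
            outer.write(1);
            inner.close();
            return openNativeBuffers.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Safe pattern: closing the outermost wrapper releases its resource and
    // then closes the inner stream. Returns the number of buffers still open.
    public static int closeOuter() {
        try {
            openNativeBuffers.set(0);
            OutputStream inner = new ByteArrayOutputStream();
            OutputStream outer = new NativeBackedStream(inner);
            outer.write(1);
            outer.close();
            return openNativeBuffers.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("open after inner-only close: " + closeInnerOnly());
        System.out.println("open after outer close: " + closeOuter());
    }
}
```

In this sketch, closeInnerOnly() leaves one buffer open while closeOuter() leaves none, which mirrors the reported behavior: a cleanup path that closes only the innermost stream must also close every wrapper that owns resources of its own.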