Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Invalid
Affects Version/s: 2.4.3
Fix Version/s: None
Component/s: None
Environment:
redhat 7
jdk 1.8
scala 2.11.12
spark standalone cluster 2.4.3
kafka 0.10.2.1
Description
The folder /tmp/temporary-xxxxxxxx on the driver takes up all the space in /tmp after the Spark Structured Streaming job has been running for a long time.
The space is mainly under the offsets and commits folders, but the two ways of measuring it disagree. When we check with
du -sh offsets
du -sh commits
it reports more than 600M, but when we use
ll -h offsets
ll -h commits
it shows only about 400K.
I think this is because the files have been deleted but are still held open by the running job, so the space is not released until the job stops.
How can I solve it?
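One retention knob that may be worth checking is spark.sql.streaming.minBatchesToRetain (default 100), which bounds how many batches' metadata files Spark keeps under the checkpoint's offsets and commits folders. A minimal sketch of setting it when building the session; the app name is a placeholder, and note that this would not fix the .crc-file leak later tracked in SPARK-28025:

import org.apache.spark.sql.SparkSession

// Sketch only: cap how many batches' metadata files are retained
// under the checkpoint's offsets/ and commits/ directories.
// 100 is the default; the right value depends on recovery needs.
val spark = SparkSession.builder()
  .appName("retention-sketch") // placeholder name
  .config("spark.sql.streaming.minBatchesToRetain", "100")
  .getOrCreate()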
We use
df.writeStream.trigger(ProcessingTime("1 seconds"))
not
df.writeStream.trigger(Continuous("1 seconds"))
Is there something wrong here?
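For context, a minimal self-contained sketch of that micro-batch setup; the Kafka bootstrap servers, topic, sink, and checkpoint path below are placeholders rather than values from this report:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Requires the spark-sql-kafka-0-10 package on the classpath.
object MicroBatchSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("micro-batch-sketch") // placeholder
      .getOrCreate()

    // Kafka source; servers and topic are placeholders.
    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "some-topic")
      .load()

    val query = df.writeStream
      .format("console") // placeholder sink
      // Micro-batch trigger, as used in the report. Continuous
      // processing would be Trigger.Continuous("1 second") instead.
      .trigger(Trigger.ProcessingTime("1 second"))
      // Explicit checkpoint dir: without this option Spark creates a
      // /tmp/temporary-<uuid> folder on the driver, the one described above.
      .option("checkpointLocation", "/data/checkpoints/my-query") // placeholder
      .start()

    query.awaitTermination()
  }
}

Setting checkpointLocation explicitly keeps the offsets and commits folders out of /tmp, which at least moves the growth onto a volume with more room.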
Issue Links
- is caused by SPARK-28025: HDFSBackedStateStoreProvider should not leak .crc files (Resolved)