Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Sometimes MEMORY_AND_DISK mode is slower than DISK_ONLY mode because blocks are dropped from the memory store, an IO operation, while a lock is held. As the TODO says, the solution is to synchronize only the selection of the to-be-dropped blocks and to do the dropping in parallel. I have a quick fix in my PR: https://github.com/apache/spark/pull/791#issuecomment-43567924
It is currently fragile, but I am working on making it more robust.
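The idea can be illustrated with a small, language-neutral sketch. This is not Spark's MemoryStore and not the code in the PR: `ToyMemoryStore`, `drop_to_disk`, and the oldest-first eviction order are all hypothetical names and assumptions chosen for illustration. The lock covers only the bookkeeping that picks victims and reserves space; the slow writes run outside the lock on a thread pool.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class ToyMemoryStore:
    """Hypothetical stand-in for a memory store (not Spark's implementation)."""

    def __init__(self, capacity, drop_to_disk, workers=4):
        self._capacity = capacity
        self._used = 0
        self._blocks = OrderedDict()        # block_id -> size, oldest first
        self._lock = threading.Lock()
        self._drop_to_disk = drop_to_disk   # assumed slow IO callback
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def put(self, block_id, size):
        """Cache a block, evicting older blocks if needed. Returns False if it cannot fit."""
        if size > self._capacity:
            return False
        # Critical section 1: select victims and reserve space. No IO happens here.
        with self._lock:
            victims = self._select_victims(size)
            self._used += size
        # Slow part: write victims to disk in parallel, without holding the lock.
        # list() waits for completion and re-raises any exception from a worker.
        list(self._pool.map(lambda v: self._drop_to_disk(*v), victims))
        # Critical section 2: publish the new block.
        with self._lock:
            self._blocks[block_id] = size
        return True

    def _select_victims(self, needed):
        """Pick oldest blocks until `needed` fits. Caller must hold the lock."""
        free = self._capacity - self._used
        victims, freed = [], 0
        for block_id, size in self._blocks.items():
            if free + freed >= needed:
                break
            victims.append((block_id, size))
            freed += size
        for block_id, _ in victims:
            del self._blocks[block_id]
        self._used -= freed
        return victims

    def block_ids(self):
        with self._lock:
            return list(self._blocks)
```

Because other threads can call `put` while a drop is still in flight, a real implementation also has to decide how reads of a block that is mid-drop behave, and when its memory actually counts as free. The sketch ignores both questions, which is the kind of fragility the description mentions.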
Issue Links
- duplicates SPARK-3000 Drop old blocks to disk in parallel when memory is not large enough for caching new blocks (Resolved)