Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 2.3.1
Description
This is a follow-up to SPARK-24296.
When replicating a disk-cached block, even if we fetch-to-disk, we still memory-map the file, just to copy it to another location.
Ideally we'd just move the temporary file to the right location. But even without that, we could read the file as an input stream instead of memory-mapping the whole thing. Memory-mapping is particularly a problem when running under YARN: the OS may believe there is plenty of memory available, while YARN decides to kill the process for exceeding its memory limits.
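The two alternatives above can be sketched roughly as follows. This is a minimal illustration in plain Java NIO, not Spark's actual block manager code: the class name `BlockFileCopy`, the helpers `copyAsStream` and `moveIntoPlace`, and the 64 KiB buffer size are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; none of these names come from Spark.
class BlockFileCopy {

    // Copy the fetched file through a small fixed-size buffer, so the amount
    // of memory used does not depend on the size of the file (unlike mapping
    // the whole file into the process address space).
    public static long copyAsStream(Path src, Path dst) throws IOException {
        byte[] buffer = new byte[64 * 1024]; // assumed buffer size
        long total = 0;
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Move the temporary file to its final location without reading its
    // contents into this process when source and target share a file system.
    public static void moveIntoPlace(Path tmp, Path dst) throws IOException {
        Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("block-demo");
        Path fetched = dir.resolve("fetched.tmp");
        Files.write(fetched, new byte[1_000_000]); // stand-in for a fetched block

        long copied = copyAsStream(fetched, dir.resolve("copied-block"));
        System.out.println("bytes copied via stream: " + copied);

        moveIntoPlace(fetched, dir.resolve("moved-block"));
        System.out.println("temporary file still exists: " + Files.exists(fetched));
    }
}
```

Of the two, the move is the cheaper option when the temporary file and the target directory are on the same file system, since no block data passes through the JVM. The streaming copy is the fallback when a move is not possible.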