Currently, the short-circuit local read pathway allows HDFS clients to access files directly without going through the DataNode. However, all of these reads involve a copy at the operating system level, since they rely on the read() / pread() / etc family of kernel interfaces.
We would like to enable HDFS to read local files via mmap. This would enable truly zero-copy reads.
In the initial implementation, zero-copy reads will only be performed when checksums were disabled. Later, we can use the DataNode's cache awareness to only perform zero-copy reads when we know that checksum has already been verified.