Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Per Todd Lipcon's comment in HDFS-2834:
"Whenever a native decompression codec is being used, ... we generally have the following copies:
1) Socket -> DirectByteBuffer (in SocketChannel implementation)
2) DirectByteBuffer -> byte[] (in SocketInputStream)
3) byte[] -> native buffer (set up for decompression)
4*) decompression to a different native buffer (not really a copy - decompression necessarily rewrites)
5) native buffer -> byte[]
With the proposed improvement we can hopefully eliminate #2 and #3 for all applications, and #2, #3, and #5 for libhdfs."
The interfaces in the attached patch attempt to address the following (a rough sketch follows the list):
A - Compression and decompression based on ByteBuffers (HDFS-2834)
B - Zero-copy compression and decompression (HDFS-3051)
C - A way for the caller to determine the maximum space required to hold the compressed output
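For illustration only, a minimal sketch of what interfaces covering A-C might look like; the names and signatures below are assumptions, not the actual contents of the attached patch:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

// Point A: source and destination are ByteBuffers rather than byte[],
// so direct buffers can be handed straight to native code with no
// staging copies (point B).
interface ByteBufferDecompressor {
    // Decompresses from src into dst, advancing both buffers' positions.
    // Returns the number of bytes written to dst.
    int decompress(ByteBuffer src, ByteBuffer dst) throws IOException;
}

interface ByteBufferCompressor {
    int compress(ByteBuffer src, ByteBuffer dst) throws IOException;

    // Point C: an upper bound on the compressed size of uncompressedLength
    // input bytes, so the caller can size dst before calling compress().
    int maxCompressedLength(int uncompressedLength);
}
{code}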
Attachments
Issue Links
- is duplicated by:
  - HADOOP-9689 Implement HDFS Zero-copy reading (Resolved)
- is related to:
  - HADOOP-8258 Add interfaces for compression codecs to use direct byte buffers (Resolved)
  - HDFS-3051 A zero-copy ScatterGatherRead api from FSDataInputStream (Open)
  - HDFS-2834 ByteBuffer-based read API for DFSInputStream (Closed)
- relates to:
  - HADOOP-10047 Add a directbuffer Decompressor API to hadoop (Closed)