Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Per Todd Lipcon's comment in HDFS-2834:
"Whenever a native decompression codec is being used, ... we generally have the following copies:
1) Socket -> DirectByteBuffer (in SocketChannel implementation)
2) DirectByteBuffer -> byte[] (in SocketInputStream)
3) byte[] -> native buffer (set up for decompression)
4*) decompression to a different native buffer (not really a copy - decompression necessarily rewrites)
5) native buffer -> byte[]
With the proposed improvement we can hopefully eliminate #2 and #3 for all applications, and #2, #3, and #5 for libhdfs."
The interfaces in the attached patch attempt to address the following (a rough sketch follows the list):
A - Compression and decompression based on ByteBuffers (HDFS-2834)
B - Zero-copy compression and decompression (HDFS-3051)
C - A way for the caller to determine the maximum space required to hold the compressed output
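For illustration only, a minimal sketch of what interfaces covering A-C might look like; the names and signatures below are assumptions, not the actual contents of the attached patch:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

// Point A: source and destination are ByteBuffers rather than byte[],
// so direct buffers can be handed straight to native code with no
// staging copies (point B).
interface ByteBufferDecompressor {
    // Decompresses from src into dst, advancing both buffers' positions.
    // Returns the number of bytes written to dst.
    int decompress(ByteBuffer src, ByteBuffer dst) throws IOException;
}

interface ByteBufferCompressor {
    int compress(ByteBuffer src, ByteBuffer dst) throws IOException;

    // Point C: an upper bound on the compressed size of uncompressedLength
    // input bytes, so the caller can size dst before calling compress().
    int maxCompressedLength(int uncompressedLength);
}
{code}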
Attachments
Issue Links
- is duplicated by:
  - HADOOP-9689 Implement HDFS Zero-copy reading (Resolved)
- is related to:
  - HADOOP-8258 Add interfaces for compression codecs to use direct byte buffers (Resolved)
  - HDFS-3051 A zero-copy ScatterGatherRead api from FSDataInputStream (Open)
  - HDFS-2834 ByteBuffer-based read API for DFSInputStream (Closed)
- relates to:
  - HADOOP-10047 Add a directbuffer Decompressor API to hadoop (Closed)