Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
The heap-memory path of ByteBufferUtil.fallbackRead (see the code on the master branch) massively overallocates memory when the underlying input stream returns data in chunks smaller than the requested length. This happens on a regular basis when using the S3 input stream as input.
The behavior is O(N^2)-ish. In a recent debug session, we were trying to read 6MB but were getting only 16K per read. The code would:
- allocate 16M, use the first 16K
- allocate 16M - 16K, use the first 16K of that
- allocate 16M - 32K, use the first 16K of that
- (etc)
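The quadratic blow-up described above can be illustrated with a small simulation, using the 16M allocation and 16K chunk sizes from the session above. This is a sketch of the allocation pattern only, not the actual ORC code:

```java
// Hypothetical simulation of the overallocation pattern: each iteration
// allocates a buffer big enough for everything still missing, but the
// stream only fills the first `chunk` bytes of it.
public final class OverallocationDemo {
    public static void main(String[] args) {
        final long total = 16L * 1024 * 1024; // initial allocation size (16M)
        final long chunk = 16L * 1024;        // bytes actually returned per read (16K)
        long remaining = total;
        long allocated = 0;
        while (remaining > 0) {
            allocated += remaining;           // models `new byte[remaining]`
            remaining -= Math.min(chunk, remaining);
        }
        // Sum of an arithmetic series: roughly total^2 / (2 * chunk) bytes,
        // i.e. gigabytes of allocation to move a few megabytes.
        System.out.println("useful bytes:    " + total);
        System.out.println("bytes allocated: " + allocated);
    }
}
```

With these numbers the simulation allocates on the order of 8 GB of transient buffers to deliver 16 MB of data, which is the O(N^2) behavior reported above.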
The patch is simple. Here's the text version of the patch:
@@ -88,10 +88,17 @@ public final class ByteBufferUtil {
       buffer.flip();
     } else {
       buffer.clear();
-      int nRead = stream.read(buffer.array(),
-          buffer.arrayOffset(), maxLength);
-      if (nRead >= 0) {
-        buffer.limit(nRead);
+      int totalRead = 0;
+      while (totalRead < maxLength) {
+        final int nRead = stream.read(buffer.array(),
+            buffer.arrayOffset() + totalRead, maxLength - totalRead);
+        if (nRead <= 0) {
+          break;
+        }
+        totalRead += nRead;
+      }
+      if (totalRead >= 0) {
+        buffer.limit(totalRead);
         success = true;
       }
     }
Essentially, this does the same thing the code in the direct-memory path is already doing: loop on read() until the buffer is full or the stream is exhausted, rather than assuming a single read() returns everything.
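The loop the patch introduces can be shown as a standalone sketch. The `readFully` helper and the `ShortReadStream` wrapper below are illustrative names, not the actual ORC code; the wrapper caps each read() at 7 bytes to mimic a network stream that returns short reads:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Simulates a stream (e.g. an S3 input stream) that never returns
// more than a few bytes per read() call.
final class ShortReadStream extends FilterInputStream {
    ShortReadStream(InputStream in) { super(in); }
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return super.read(b, off, Math.min(len, 7)); // cap each read at 7 bytes
    }
}

public final class ReadFullyDemo {
    // Keep calling read() until the buffer is full or the stream ends,
    // mirroring the loop in the patch above.
    static int readFully(InputStream stream, byte[] buf, int off, int maxLength)
            throws IOException {
        int totalRead = 0;
        while (totalRead < maxLength) {
            final int nRead = stream.read(buf, off + totalRead, maxLength - totalRead);
            if (nRead <= 0) {
                break; // end of stream
            }
            totalRead += nRead;
        }
        return totalRead;
    }

    public static void main(String[] args) throws IOException {
        byte[] src = new byte[100];
        byte[] dst = new byte[100];
        // Even though each read() yields at most 7 bytes, the loop fills
        // the whole destination buffer with a single allocation.
        int n = readFully(new ShortReadStream(new ByteArrayInputStream(src)), dst, 0, 100);
        System.out.println(n + " bytes read"); // prints "100 bytes read"
    }
}
```

A single read() against the wrapped stream would have returned 7 and forced the caller to reallocate; the loop absorbs the short reads without any extra allocation.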