Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
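One of our jobs reading Avro files hit an OutOfMemoryError because of the buffer copy in the compress and decompress methods. The copy comes from ByteArrayOutputStream.toByteArray(), which allocates a second array as large as the data already held in the stream, so peak heap usage is roughly doubled for each block.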
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
    at org.apache.avro.file.DeflateCodec.decompress(DeflateCodec.java:84)
I would suggest using a class that extends ByteArrayOutputStream and exposes its internal buffer, like https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java#L51-L53
and then wrapping that buffer directly instead of copying it:
ByteBuffer result = ByteBuffer.wrap(buf.getData(), 0, buf.getLength());
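
To make the suggestion concrete, here is a minimal sketch of what I have in mind. The class names ExposedByteArrayOutputStream and ZeroCopyDeflate are hypothetical and not part of Avro or Hadoop, the streams use the default zlib settings rather than whatever DeflateCodec configures, and the input is assumed to be a heap (array-backed) ByteBuffer. It only illustrates wrapping the stream's backing array instead of calling toByteArray().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterOutputStream;

// Hypothetical helper (not an Avro or Hadoop class): exposes the protected
// backing array of ByteArrayOutputStream so callers can wrap it without a copy.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    ExposedByteArrayOutputStream(int initialSize) {
        super(initialSize);
    }

    // Wraps the live internal array; the valid bytes are [0, count).
    // The stream must not be written to again while the buffer is in use.
    ByteBuffer asByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }
}

// Hypothetical codec sketch using default zlib settings (an assumption).
class ZeroCopyDeflate {
    // Assumes `data` is array-backed (data.hasArray() is true).
    static ByteBuffer compress(ByteBuffer data) throws IOException {
        ExposedByteArrayOutputStream out =
            new ExposedByteArrayOutputStream(Math.max(32, data.remaining()));
        try (DeflaterOutputStream deflate = new DeflaterOutputStream(out)) {
            deflate.write(data.array(),
                          data.arrayOffset() + data.position(),
                          data.remaining());
        }
        return out.asByteBuffer(); // no Arrays.copyOf here
    }

    // Assumes `data` is array-backed (data.hasArray() is true).
    static ByteBuffer decompress(ByteBuffer data) throws IOException {
        ExposedByteArrayOutputStream out =
            new ExposedByteArrayOutputStream(Math.max(32, data.remaining()));
        try (InflaterOutputStream inflate = new InflaterOutputStream(out)) {
            inflate.write(data.array(),
                          data.arrayOffset() + data.position(),
                          data.remaining());
        }
        return out.asByteBuffer(); // no Arrays.copyOf here
    }
}
```

The stream still grows its internal array as data is written, so this does not remove every allocation. It only avoids the final full-size copy that shows up in the stack trace above.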