Description
This issue occurs only in Hadoop 0.23.x and above.
In Hadoop 0.20.x:

    public static void returnDecompressor(Decompressor decompressor) {
      if (decompressor == null) {
        return;
      }
      decompressor.reset();
      payback(decompressorPool, decompressor);
    }
In Hadoop 0.23.x:

    public static void returnDecompressor(Decompressor decompressor) {
      if (decompressor == null) {
        return;
      }
      // if the decompressor can't be reused, don't pool it.
      if (decompressor.getClass().isAnnotationPresent(DoNotPool.class)) {
        return;
      }
      decompressor.reset();
      payback(decompressorPool, decompressor);
    }
Here a check for the DoNotPool annotation has been added. The class below carries this annotation, and it is the one used by default when no native library is available.
@DoNotPool public class BuiltInGzipDecompressor
Because of this, the returned compressor/decompressor is never put back into the pool, so a new one is created on every request. This leads to a native memory leak.
2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
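
The behaviour described above can be reproduced without Hadoop. The following is a minimal sketch, not the actual CodecPool code: PoolSketch, PoolableDecompressor, NonPoolableDecompressor, borrow, giveBack and countCreations are hypothetical names, and the pool is a single queue rather than Hadoop's per-class map. It only shows that an instance whose class is annotated with DoNotPool is never returned to the pool, so every request creates a brand-new one. It does not demonstrate the memory leak itself.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayDeque;
import java.util.Deque;

public class PoolSketch {
    // Stand-in for Hadoop's DoNotPool; RUNTIME retention is required
    // for isAnnotationPresent() to see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface DoNotPool {}

    // Hypothetical, reduced version of the Decompressor interface.
    interface Decompressor {
        void reset();
    }

    static class PoolableDecompressor implements Decompressor {
        public void reset() {}
    }

    @DoNotPool
    static class NonPoolableDecompressor implements Decompressor {
        public void reset() {}
    }

    static final Deque<Decompressor> pool = new ArrayDeque<>();
    static int created = 0;

    // Take an instance from the pool, or create one if the pool is empty.
    static Decompressor borrow(boolean poolable) {
        Decompressor d = pool.poll();
        if (d == null) {
            created++; // corresponds to a "Got brand-new decompressor" log line
            d = poolable ? new PoolableDecompressor() : new NonPoolableDecompressor();
        }
        return d;
    }

    // Same shape as the 0.23.x returnDecompressor shown earlier.
    static void giveBack(Decompressor d) {
        if (d == null) {
            return;
        }
        if (d.getClass().isAnnotationPresent(DoNotPool.class)) {
            return; // annotated instances are dropped, never pooled
        }
        d.reset();
        pool.push(d);
    }

    // Borrow and return an instance 'rounds' times; report how many were created.
    static int countCreations(boolean poolable, int rounds) {
        pool.clear();
        created = 0;
        for (int i = 0; i < rounds; i++) {
            giveBack(borrow(poolable));
        }
        return created;
    }

    public static void main(String[] args) {
        System.out.println("poolable: " + countCreations(true, 3));      // prints "poolable: 1"
        System.out.println("non-poolable: " + countCreations(false, 3)); // prints "non-poolable: 3"
    }
}
```

With three borrow/return rounds, the poolable class is created once and reused, while the annotated class is created three times, which matches the three "Got brand-new decompressor" lines in the log.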