  1. Hadoop Common
  2. HADOOP-12007

GzipCodec native CodecPool leaks memory

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.0
    • None
    • None

    Description

      GzipCodec (org/apache/hadoop/io/compress/GzipCodec.java) calls CompressionCodec.Util.createOutputStreamWithCodecPool to obtain its compressor from CodecPool. However, the compressor objects are never returned to the pool, which causes a memory leak.

      HADOOP-10591 relies on CompressionOutputStream.close() to return the Compressor object to the pool. But CompressionCodec.Util.createOutputStreamWithCodecPool actually returns a CompressorStream, which overrides close().

      As a result, CodecPool.returnCompressor is never called. In my log file I can see lots of "Got brand-new compressor [.gz]" messages but no "Got recycled compressor".
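
      The mechanism described above can be modelled without Hadoop on the classpath. The following is a minimal sketch, not Hadoop's actual code: FakePool, FakeCompressor, BaseStream, LeakyStream and FixedStream are hypothetical stand-ins for CodecPool, Compressor, CompressionOutputStream and CompressorStream. It assumes the subclass's close() does not delegate to the base class, which is what the symptoms in this report suggest. The FixedStream variant is one possible remedy for illustration only, not a patch for this issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayDeque;
import java.util.Deque;

class PoolLeakSketch {

    /** Hypothetical stand-in for a pooled (native) compressor. */
    static class FakeCompressor {}

    /** Hypothetical stand-in for CodecPool; counts new vs. reused objects. */
    static class FakePool {
        private final Deque<FakeCompressor> idle = new ArrayDeque<>();
        int created = 0;   // analogous to "Got brand-new compressor"
        int recycled = 0;  // analogous to "Got recycled compressor"

        FakeCompressor borrow() {
            FakeCompressor c = idle.poll();
            if (c == null) {
                created++;
                return new FakeCompressor();
            }
            recycled++;
            return c;
        }

        void giveBack(FakeCompressor c) {
            idle.push(c);
        }
    }

    /** Base stream: close() hands the tracked compressor back to the pool. */
    static class BaseStream extends OutputStream {
        protected final OutputStream out;
        private final FakePool pool;
        private FakeCompressor tracked;

        BaseStream(OutputStream out, FakePool pool, FakeCompressor tracked) {
            this.out = out;
            this.pool = pool;
            this.tracked = tracked;
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
        }

        @Override
        public void close() throws IOException {
            out.close();
            if (tracked != null) {
                pool.giveBack(tracked);
                tracked = null;
            }
        }
    }

    /** Overrides close() without calling super.close(): the compressor leaks. */
    static class LeakyStream extends BaseStream {
        LeakyStream(OutputStream out, FakePool pool, FakeCompressor c) {
            super(out, pool, c);
        }

        @Override
        public void close() throws IOException {
            out.close(); // pool.giveBack(...) in the base class is never reached
        }
    }

    /** Same override, but it delegates to the base class so the return happens. */
    static class FixedStream extends BaseStream {
        FixedStream(OutputStream out, FakePool pool, FakeCompressor c) {
            super(out, pool, c);
        }

        @Override
        public void close() throws IOException {
            super.close();
        }
    }

    /** Opens and closes n streams in sequence; returns {created, recycled}. */
    static int[] run(boolean leaky, int n) throws IOException {
        FakePool pool = new FakePool();
        for (int i = 0; i < n; i++) {
            FakeCompressor c = pool.borrow();
            OutputStream sink = new ByteArrayOutputStream();
            try (OutputStream s = leaky
                    ? new LeakyStream(sink, pool, c)
                    : new FixedStream(sink, pool, c)) {
                s.write(42);
            }
        }
        return new int[] {pool.created, pool.recycled};
    }

    public static void main(String[] args) throws IOException {
        int[] leaky = run(true, 3);
        int[] fixed = run(false, 3);
        System.out.println("leaky: created=" + leaky[0] + " recycled=" + leaky[1]);
        System.out.println("fixed: created=" + fixed[0] + " recycled=" + fixed[1]);
    }
}
```

      In this model the leaky variant creates a new compressor for every stream and never recycles one, which matches the log pattern reported here. The delegating variant creates one compressor and reuses it for every later stream.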

            People

              Assignee: Unassigned
              Reporter: Yejun Yang (yejun)
              Votes: 2
              Watchers: 8

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3h