Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-2258

IFile reader closes stream and compressor in wrong order

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.23.0
    • Component/s: task
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In IFile.Reader.close(), we return the decompressor to the pool and then call close() on the input stream. This is backwards and causes a rare race in the case of LzopCodec, since LzopInputStream makes a few calls on the decompressor object inside close(). If another thread pulls the decompressor out of the pool and starts to use it in the meantime, the first thread's close() will cause the second thread to potentially miss pieces of data.

        Issue Links

          Activity

          Hide
          Todd Lipcon added a comment -

          The following is a unit test I wrote on the hadoop-lzo side that does the same behavior as IFile.Reader.close():

          https://github.com/toddlipcon/hadoop-lzo/commit/a5af3b93f52f55828dfc05e7503d38383eec9dc5

          It fails reliably since some threads only manage to read part of the data in the file.

          Show
          Todd Lipcon added a comment - The following is a unit test I wrote on the hadoop-lzo side that does the same behavior as IFile.Reader.close(): https://github.com/toddlipcon/hadoop-lzo/commit/a5af3b93f52f55828dfc05e7503d38383eec9dc5 It fails reliably since some threads only manage to read part of the data in the file.
          Hide
          Hong Tang added a comment -

          Such pattern would in general affect all CompressionCodec's and is similar to a bug I filed earlier: HADOOP-4195.

          On the other hand, as explained by Chris D in HADOOP-4162, LzopCodec cannot be safely reused in Hadoop, and thus the problem you described actually should not happen. In fact, repeatedly getting LzopCodec from CodecPool is likely get you into OOM.

          Show
          Hong Tang added a comment - Such pattern would in general affect all CompressionCodec's and is similar to a bug I filed earlier: HADOOP-4195 . On the other hand, as explained by Chris D in HADOOP-4162 , LzopCodec cannot be safely reused in Hadoop, and thus the problem you described actually should not happen. In fact, repeatedly getting LzopCodec from CodecPool is likely get you into OOM.
          Hide
          Todd Lipcon added a comment -

          Hong: I agree LzoCodec is preferable to LzopCodec for use in intermediate compression, but I think the bug you referenced is no longer the case. LzopCodecs can now be pooled properly with Chris's patch you referenced plus changes on the lzo side.

          Just to clarify, you agree this code in IFile is wrong and should be fixed, right?

          Show
          Todd Lipcon added a comment - Hong: I agree LzoCodec is preferable to LzopCodec for use in intermediate compression, but I think the bug you referenced is no longer the case. LzopCodecs can now be pooled properly with Chris's patch you referenced plus changes on the lzo side. Just to clarify, you agree this code in IFile is wrong and should be fixed, right?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12468714/mapreduce-2258.txt
          against trunk revision 1074251.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12468714/mapreduce-2258.txt against trunk revision 1074251. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/37//console This message is automatically generated.
          Hide
          Tom White added a comment -

          +1

          Show
          Tom White added a comment - +1
          Hide
          Tom White added a comment -

          I just committed this. Thanks, Todd!

          Show
          Tom White added a comment - I just committed this. Thanks, Todd!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #682 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/682/)
          MAPREDUCE-2258. IFile reader closes stream and compressor in wrong order. Contributed by Todd Lipcon.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1124383
          Files :

          • /hadoop/mapreduce/trunk/CHANGES.txt
          • /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/IFile.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #682 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/682/ ) MAPREDUCE-2258 . IFile reader closes stream and compressor in wrong order. Contributed by Todd Lipcon. tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1124383 Files : /hadoop/mapreduce/trunk/CHANGES.txt /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/IFile.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #684 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/684/)
          MAPREDUCE-2258. IFile reader closes stream and compressor in wrong order. Contributed by Todd Lipcon.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1124383
          Files :

          • /hadoop/mapreduce/trunk/CHANGES.txt
          • /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/IFile.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #684 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/684/ ) MAPREDUCE-2258 . IFile reader closes stream and compressor in wrong order. Contributed by Todd Lipcon. tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1124383 Files : /hadoop/mapreduce/trunk/CHANGES.txt /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/IFile.java

            People

            • Assignee:
              Todd Lipcon
              Reporter:
              Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development