Hadoop Common
  1. Hadoop Common
  2. HADOOP-5476

calling new SequenceFile.Reader(...) leaves an InputStream open, if the given sequence file is broken

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.1, 0.20.0
    • Fix Version/s: 0.21.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Issue Links

        Activity

        Hide
        Hudson added a comment -

        Integrated in Hadoop-trunk #830 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/830/)
        . Close the underlying InputStream in SequenceFile::Reader when
        the constructor throws an exception. Contributed by Michael Tamm

        Show
        Hudson added a comment - Integrated in Hadoop-trunk #830 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/830/ ) . Close the underlying InputStream in SequenceFile::Reader when the constructor throws an exception. Contributed by Michael Tamm
        Hide
        Chris Douglas added a comment -

        I committed this. Thanks, Michael

        Show
        Chris Douglas added a comment - I committed this. Thanks, Michael
        Hide
        Chris Douglas added a comment -
             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        Running the unit tests...

        Show
        Chris Douglas added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Running the unit tests...
        Hide
        Michael Tamm added a comment -

        I attached the third version of the patch, which uses IOUtils.cleanup now.

        Show
        Michael Tamm added a comment - I attached the third version of the patch, which uses IOUtils.cleanup now.
        Hide
        Chris Douglas added a comment -

        If seek or init throws a RuntimeException, it's wrong. Unfortunately, init does throw IllegalArgumentException if the codec lookup fails. sigh

        You're right, try-finally is required. I'd still use IOUtils::cleanup to close the stream and drop the comments, though.

        Show
        Chris Douglas added a comment - If seek or init throws a RuntimeException, it's wrong. Unfortunately, init does throw IllegalArgumentException if the codec lookup fails. sigh You're right, try-finally is required. I'd still use IOUtils::cleanup to close the stream and drop the comments, though.
        Hide
        Michael Tamm added a comment -

        Hmm, that would not cover all error cases:
        if a RuntimeException is thrown by seek(start) or init(tempReader),
        in would not be closed.

        For me it is a best practice to use try-finally to close opened resources.

        Show
        Michael Tamm added a comment - Hmm, that would not cover all error cases: if a RuntimeException is thrown by seek(start) or init(tempReader), in would not be closed. For me it is a best practice to use try-finally to close opened resources.
        Hide
        Chris Douglas added a comment -

        Thanks for updating the patch; I meant the first try/finally, which is actually distinguishing the case where an IOException is thrown from the case where it is not. That is better expressed as:

        this.conf = conf;
        try { 
          seek(start);
          this.end = in.getPos() + length;
          init(tempReader);
        } catch (IOException e) { 
          try {
            in.close();
          } catch (IOException ce) {
            LOG.info("Exception from close in init: ", ce);
          }
          throw e;
        }
        

        or even more succinctly:

        try {
          seek(start);
          this.end = in.getPos() + length;
          init(tempReader);
        } catch (IOException e) {
          IOUtils.cleanup(LOG, in);
          throw e;
        }
        
        Show
        Chris Douglas added a comment - Thanks for updating the patch; I meant the first try/finally, which is actually distinguishing the case where an IOException is thrown from the case where it is not. That is better expressed as: this .conf = conf; try { seek(start); this .end = in.getPos() + length; init(tempReader); } catch (IOException e) { try { in.close(); } catch (IOException ce) { LOG.info( "Exception from close in init: " , ce); } throw e; } or even more succinctly: try { seek(start); this .end = in.getPos() + length; init(tempReader); } catch (IOException e) { IOUtils.cleanup(LOG, in); throw e; }
        Hide
        Michael Tamm added a comment -

        The updated patch addresses all mentioned problems and the tests no longer throws a NPE.

        Show
        Michael Tamm added a comment - The updated patch addresses all mentioned problems and the tests no longer throws a NPE.
        Hide
        Chris Douglas added a comment -

        Sorry for the long delay in reviewing this.

        • TestSequenceFile fails on trunk:
          Testcase: testCloseForErroneousSequenceFile took 0.102 sec
                  Caused an ERROR
          null
          java.lang.NullPointerException
                  at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.read(RawLocalFileSystem.java:91)
                  at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:142)
                  at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
                  at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
                  at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
                  at java.io.DataInputStream.readFully(DataInputStream.java:178)
                  at java.io.DataInputStream.readFully(DataInputStream.java:152)
                  at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:134)
                  at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
                  at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1453)
                  at org.apache.hadoop.io.TestSequenceFile$1.openFile(TestSequenceFile.java:494)
                  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1430)
                  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1423)
                  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1418)
                  at org.apache.hadoop.io.TestSequenceFile$1.<init>(TestSequenceFile.java:491)
                  at org.apache.hadoop.io.TestSequenceFile.testCloseForErroneousSequenceFile(TestSequenceFile.java:491)
          

          If the unit test obtains the LFS by calling FileSystem.getLocal(conf) then everything works as designed.

        • The conf assignment can be outside the try block
        • Instead of try/finally, catching IOException, attempting a close, logging- rather than ignoring- exceptions thrown from close, then re-throwing the IOException would be preferred.
        Show
        Chris Douglas added a comment - Sorry for the long delay in reviewing this. TestSequenceFile fails on trunk: Testcase: testCloseForErroneousSequenceFile took 0.102 sec Caused an ERROR null java.lang.NullPointerException at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.read(RawLocalFileSystem.java:91) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:142) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) at java.io.BufferedInputStream.read(BufferedInputStream.java:317) at java.io.DataInputStream.readFully(DataInputStream.java:178) at java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:134) at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283) at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1453) at org.apache.hadoop.io.TestSequenceFile$1.openFile(TestSequenceFile.java:494) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1430) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1423) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1418) at org.apache.hadoop.io.TestSequenceFile$1.<init>(TestSequenceFile.java:491) at org.apache.hadoop.io.TestSequenceFile.testCloseForErroneousSequenceFile(TestSequenceFile.java:491) If the unit test obtains the LFS by calling FileSystem.getLocal(conf) then everything works as designed. The conf assignment can be outside the try block Instead of try/finally, catching IOException, attempting a close, logging- rather than ignoring- exceptions thrown from close, then re-throwing the IOException would be preferred.
        Hide
        Michael Tamm added a comment -

        I have attached a patch, which

        • adds the new method testCloseForErroneousSequenceFile to TestSequenceFile.java to check this bug
        • fixes the SequenceFile.Reader constructor
        Show
        Michael Tamm added a comment - I have attached a patch, which adds the new method testCloseForErroneousSequenceFile to TestSequenceFile.java to check this bug fixes the SequenceFile.Reader constructor

          People

          • Assignee:
            Michael Tamm
            Reporter:
            Michael Tamm
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development