Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-5235

HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.90.5, 0.92.0
    • 0.90.6, 0.92.1
    • None
    • None
    • Reviewed

    Description

      Pls find the analysis. Correct me if am wrong

      2012-01-15 05:14:02,374 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got while writing log entry to log
      java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
      	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
      	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
      	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
      
      

      Here we have an exception in one of the writer threads. If any exception we try to hold it in an Atomic variable

        private void writerThreadError(Throwable t) {
          thrown.compareAndSet(null, t);
        }
      

      In the finally block of splitLog we try to close the streams.

            for (WriterThread t: writerThreads) {
              try {
                t.join();
              } catch (InterruptedException ie) {
                throw new IOException(ie);
              }
              checkForErrors();
            }
            LOG.info("Split writers finished");
            
            return closeStreams();
      

      Inside checkForErrors

        private void checkForErrors() throws IOException {
          Throwable thrown = this.thrown.get();
          if (thrown == null) return;
          if (thrown instanceof IOException) {
            throw (IOException)thrown;
          } else {
            throw new RuntimeException(thrown);
          }
        }
      So once we throw the exception the DFSStreamer threads are not getting closed.
      

      Attachments

        1. HBASE-5235_0.90_1.patch
          4 kB
          ramkrishna.s.vasudevan
        2. HBASE-5235_0.90_2.patch
          4 kB
          ramkrishna.s.vasudevan
        3. HBASE-5235_0.90.patch
          3 kB
          ramkrishna.s.vasudevan
        4. HBASE-5235_trunk.patch
          4 kB
          ramkrishna.s.vasudevan

        Activity

          I think if any errors we should only close the Streams and not call closeStreams().
          Because here we do lot of other steps that completes the log split process.
          Also even if one WriterThread has an exception can we completely abort the master as we did for any failures of splitLog()? Pls suggest.

          ram_krish ramkrishna.s.vasudevan added a comment - I think if any errors we should only close the Streams and not call closeStreams(). Because here we do lot of other steps that completes the log split process. Also even if one WriterThread has an exception can we completely abort the master as we did for any failures of splitLog()? Pls suggest.
          zhihyu@ebaysf.com Ted Yu added a comment -

          closeStreams() records all the IOException's and throw a MultipleIOException before exiting.
          I think the simplest solution is to wrap closeStreams() in a finally block in finishWritingAndClose()

          Thanks for reporting this issue, Ram.

          zhihyu@ebaysf.com Ted Yu added a comment - closeStreams() records all the IOException's and throw a MultipleIOException before exiting. I think the simplest solution is to wrap closeStreams() in a finally block in finishWritingAndClose() Thanks for reporting this issue, Ram.

          But Ted, in closeStreams() we also do the steps related to renaming the .temp files in recovered.edits. So we should only do the

               for (WriterAndPath wap : logWriters.values()) {
                  try {
                    wap.w.close();
                  } catch (IOException ioe) {
                    LOG.error("Couldn't close log at " + wap.p, ioe);
                    thrown.add(ioe);
                    continue;
                  }
          

          and make the master abort so that the subsequent split can parse the HLog. Correct me if am wrong.

          ram_krish ramkrishna.s.vasudevan added a comment - But Ted, in closeStreams() we also do the steps related to renaming the .temp files in recovered.edits. So we should only do the for (WriterAndPath wap : logWriters.values()) { try { wap.w.close(); } catch (IOException ioe) { LOG.error( "Couldn't close log at " + wap.p, ioe); thrown.add(ioe); continue ; } and make the master abort so that the subsequent split can parse the HLog. Correct me if am wrong.
          zhihyu@ebaysf.com Ted Yu added a comment -

          How about introducing a boolean, logWritersClosed (similar to hasClosed) ?
          We separate the above for loop into a new private method called closeLogWriters() where logWritersClosed is checked upon entry. After successful execution logWritersClosed would be set to true before exit.
          closeStreams() calls closeLogWriters().
          We also place closeLogWriters() in the finally block of finishWritingAndClose().

          Is the above close to what you were thinking, Ram ?

          zhihyu@ebaysf.com Ted Yu added a comment - How about introducing a boolean, logWritersClosed (similar to hasClosed) ? We separate the above for loop into a new private method called closeLogWriters() where logWritersClosed is checked upon entry. After successful execution logWritersClosed would be set to true before exit. closeStreams() calls closeLogWriters(). We also place closeLogWriters() in the finally block of finishWritingAndClose(). Is the above close to what you were thinking, Ram ?
          ram_krish ramkrishna.s.vasudevan added a comment - - edited

          Yes. This is what i was thinking too. Will upload a patch

          ram_krish ramkrishna.s.vasudevan added a comment - - edited Yes. This is what i was thinking too. Will upload a patch

          Patch for 0.90. If this patch is fine i will prepare a similar patch for 0.92

          ram_krish ramkrishna.s.vasudevan added a comment - Patch for 0.90. If this patch is fine i will prepare a similar patch for 0.92
          zhihyu@ebaysf.com Ted Yu added a comment -
          +      // close the log writer streams only if they are not closed
          +      // in closeStreams().
          +      if (!closeCompleted && !logWritersClosed) {
          

          Do we need to check closeCompleted here ? It is set after logWritersClosed is set to true.

          I think closeAndCleanupCompleted would be a better name for hasClosed.
          The following line should be in closeLogWriters():

          +      logWritersClosed = true;
          

          If I were you, I would put the loop from line 789 to 798 into closeLogWriters() and let closeStreams() call closeLogWriters()
          Then closeLogWriters() would start for loop to iterate over logWriters.values()

          zhihyu@ebaysf.com Ted Yu added a comment - + // close the log writer streams only if they are not closed + // in closeStreams(). + if (!closeCompleted && !logWritersClosed) { Do we need to check closeCompleted here ? It is set after logWritersClosed is set to true. I think closeAndCleanupCompleted would be a better name for hasClosed. The following line should be in closeLogWriters(): + logWritersClosed = true ; If I were you, I would put the loop from line 789 to 798 into closeLogWriters() and let closeStreams() call closeLogWriters() Then closeLogWriters() would start for loop to iterate over logWriters.values()
          zhihyu@ebaysf.com Ted Yu added a comment -

          The rationale for my above comment is that in patch v1, the following assignment may be skipped if there is runtime exception thrown in the try block of line 801:

          +      logWritersClosed = true;
          

          It would make the code cleaner if logWritersClosed only reflects the status of log writer closing.

          zhihyu@ebaysf.com Ted Yu added a comment - The rationale for my above comment is that in patch v1, the following assignment may be skipped if there is runtime exception thrown in the try block of line 801: + logWritersClosed = true ; It would make the code cleaner if logWritersClosed only reflects the status of log writer closing.

          Addressing Ted's comments. Here the logWriter.values() will be iterated twice.

          ram_krish ramkrishna.s.vasudevan added a comment - Addressing Ted's comments. Here the logWriter.values() will be iterated twice.
          zhihyu@ebaysf.com Ted Yu added a comment -

          +1 on patch v2.
          Minor comments which can be addressed at time of commit:

          +        LOG.info("Closed path " + wap.p + " (wrote " + wap.editsWritten
          +            + " edits in " + (wap.nanosSpent / 1000 / 1000) + "ms)");
          

          I think the above should be inside closeLogWriters().

          +      // close the log writer streams only if they are not closed
          +      // in closeStreams().
          

          Since closeLogWriters() is called inside closeStreams(), the above comment can be removed.

          Please run through test suite.

          zhihyu@ebaysf.com Ted Yu added a comment - +1 on patch v2. Minor comments which can be addressed at time of commit: + LOG.info( "Closed path " + wap.p + " (wrote " + wap.editsWritten + + " edits in " + (wap.nanosSpent / 1000 / 1000) + "ms)" ); I think the above should be inside closeLogWriters(). + // close the log writer streams only if they are not closed + // in closeStreams(). Since closeLogWriters() is called inside closeStreams(), the above comment can be removed. Please run through test suite.

          Patch for trunk.

          ram_krish ramkrishna.s.vasudevan added a comment - Patch for trunk.
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511401/HBASE-5235_trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/827//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/827//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/827//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12511401/HBASE-5235_trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -145 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/827//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/827//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/827//console This message is automatically generated.

          Updated patch addressing Ted's comments for 0.90. Already trunk patch incorporates Ted's comments.

          ram_krish ramkrishna.s.vasudevan added a comment - Updated patch addressing Ted's comments for 0.90. Already trunk patch incorporates Ted's comments.
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511427/HBASE-5235_0.90_2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/833//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12511427/HBASE-5235_0.90_2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/833//console This message is automatically generated.
          hudson Hudson added a comment -

          Integrated in HBase-TRUNK #2644 (See https://builds.apache.org/job/HBase-TRUNK/2644/)
          HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.(Ram)

          ramkrishna :
          Files :

          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment - Integrated in HBase-TRUNK #2644 (See https://builds.apache.org/job/HBase-TRUNK/2644/ ) HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.(Ram) ramkrishna : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

          Committed to 0.90 and trunk.
          Thanks for the review Ted.

          ram_krish ramkrishna.s.vasudevan added a comment - Committed to 0.90 and trunk. Thanks for the review Ted.
          zhihyu@ebaysf.com Ted Yu added a comment -

          Patch should be integrated to 0.92 branch as well.

          zhihyu@ebaysf.com Ted Yu added a comment - Patch should be integrated to 0.92 branch as well.
          hudson Hudson added a comment -

          Integrated in HBase-TRUNK-security #85 (See https://builds.apache.org/job/HBase-TRUNK-security/85/)
          HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.(Ram)

          ramkrishna :
          Files :

          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment - Integrated in HBase-TRUNK-security #85 (See https://builds.apache.org/job/HBase-TRUNK-security/85/ ) HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.(Ram) ramkrishna : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment -

          Integrated in HBase-0.92 #257 (See https://builds.apache.org/job/HBase-0.92/257/)
          HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions. (Ram)

          ramkrishna :
          Files :

          • /hbase/branches/0.92/CHANGES.txt
          • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment - Integrated in HBase-0.92 #257 (See https://builds.apache.org/job/HBase-0.92/257/ ) HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions. (Ram) ramkrishna : Files : /hbase/branches/0.92/CHANGES.txt /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment -

          Integrated in HBase-0.92-security #88 (See https://builds.apache.org/job/HBase-0.92-security/88/)
          HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions. (Ram)

          ramkrishna :
          Files :

          • /hbase/branches/0.92/CHANGES.txt
          • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          hudson Hudson added a comment - Integrated in HBase-0.92-security #88 (See https://builds.apache.org/job/HBase-0.92-security/88/ ) HBASE-5235 HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions. (Ram) ramkrishna : Files : /hbase/branches/0.92/CHANGES.txt /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

          Committed to 0.92, trunk and 0.90

          ram_krish ramkrishna.s.vasudevan added a comment - Committed to 0.92, trunk and 0.90
          larsfrancke Lars Francke added a comment -

          This issue was closed as part of a bulk closing operation on 2015-11-20. All issues that have been resolved and where all fixVersions have been released have been closed (following discussions on the mailing list).

          larsfrancke Lars Francke added a comment - This issue was closed as part of a bulk closing operation on 2015-11-20. All issues that have been resolved and where all fixVersions have been released have been closed (following discussions on the mailing list).

          People

            ram_krish ramkrishna.s.vasudevan
            ram_krish ramkrishna.s.vasudevan
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: