HBase / HBASE-9023

TestIOFencing.testFencingAroundCompactionAfterWALSync occasionally fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.96.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Anyone want to take a look at this one?

      https://builds.apache.org/job/HBase-TRUNK/4283/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/

      java.lang.AssertionError
      	at org.junit.Assert.fail(Assert.java:86)
      	at org.junit.Assert.assertTrue(Assert.java:41)
      	at org.junit.Assert.assertTrue(Assert.java:52)
      	at org.apache.hadoop.hbase.TestIOFencing.doTest(TestIOFencing.java:263)
      	at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompactionAfterWALSync(TestIOFencing.java:217)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.junit.runners.Suite.runChild(Suite.java:127)
      	at org.junit.runners.Suite.runChild(Suite.java:26)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      	at java.lang.Thread.run(Thread.java:662)
      
      1. 9023-v1.txt
        1 kB
        Ted Yu
      2. 9023.addendum1
        0.9 kB
        Ted Yu

          Activity

          stack added a comment -

          Resolving as fixed. This used to fail frequently, but it looks like Ted fixed it.

          stack added a comment -

          Hold your horses. It fails sporadically. Let's wait a few days.

          Ted Yu added a comment -

          TestIOFencing has been passing on Apache Jenkins and EC2 for both 0.95 and trunk.

          Hudson added a comment -

          FAILURE: Integrated in hbase-0.95-on-hadoop2 #252 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/252/)
          HBASE-9023 Addendum waits for two flushes (tedyu: rev 1515251)

          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java
          Hudson added a comment -

          SUCCESS: Integrated in hbase-0.95 #468 (See https://builds.apache.org/job/hbase-0.95/468/)
          HBASE-9023 Addendum waits for two flushes (tedyu: rev 1515251)

          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java
          Ted Yu added a comment -

          Integrated addendum to 0.95 as well.

          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #685 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/685/)
          HBASE-9023 Addendum makes the wait for flush look for 2 store files (tedyu: rev 1515189)

          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java
          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #4409 (See https://builds.apache.org/job/HBase-TRUNK/4409/)
          HBASE-9023 Addendum makes the wait for flush look for 2 store files (tedyu: rev 1515189)

          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java
          Ted Yu added a comment -

          Addendum integrated to trunk.

          Thanks for the review.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12598662/9023.addendum1
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6803//console

          This message is automatically generated.

          stack added a comment -

          +1 on trying the addendum.

          Ted Yu added a comment -

          The addendum changes the test's wait so that it waits for the second flush to complete.
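
The commit message for the addendum describes the wait as looking for 2 store files. A minimal sketch of that kind of bounded polling wait follows. It is not the code in TestIOFencing: `WaitForStoreFiles`, `waitFor`, and the `storeFiles` counter are hypothetical names used only for illustration, and the counter stands in for whatever the test actually inspects.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitForStoreFiles {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: condition never became true in time
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for the number of store files in the region.
        AtomicInteger storeFiles = new AtomicInteger(0);
        Thread flusher = new Thread(() -> {
            storeFiles.incrementAndGet(); // first flush adds a store file
            storeFiles.incrementAndGet(); // second flush adds another
        });
        flusher.start();
        boolean sawTwo = waitFor(5000, 10, () -> storeFiles.get() >= 2);
        flusher.join();
        System.out.println("saw two store files: " + sawTwo);
    }
}
```

Waiting on an observable result of the flush (a file count) rather than on a timestamp avoids depending on when the timestamp field happens to be written.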

          Ted Yu added a comment -

          It turns out that the flush detection logic still needs to be refined.
          In one test run, the first flush completed as follows:

          2013-08-17 15:10:18,370 INFO  [Thread-282] regionserver.HStore(760): Added hdfs://localhost.localdomain:53662/user/jenkins/hbase/data/default/tabletest/174fab922ef8fceeb0342279c97aadb7/family/4f528c163ab64226829fd13b7ae3f4b0, entries=2319, sequenceid=2323, filesize=77.9 K
          2013-08-17 15:10:18,370 INFO  [Thread-282] regionserver.HRegion(1639): Finished memstore flush of ~390.8 K/400144, currentsize=0/0 for region tabletest,,1376777411816.174fab922ef8fceeb0342279c97aadb7. in 651ms, sequenceid=2323, compaction requested=false
          

          However, the test failed with:

          java.lang.AssertionError: lastFlushTime: 1376777411967 current: 1376777418370
          	at org.junit.Assert.fail(Assert.java:88)
          	at org.junit.Assert.assertTrue(Assert.java:41)
          	at org.apache.hadoop.hbase.TestIOFencing.doTest(TestIOFencing.java:264)
          	at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompactionAfterWALSync(TestIOFencing.java:218)
          

          1376777418370, in milliseconds, corresponds to 2013-08-17 15:10:18 in the log's local time.
          So the reading of 1376777411967 as the previous flush time was incorrect: it did not come from a flush but from the assignment in HRegion#initializeRegionInternals():

              this.lastFlushTime = EnvironmentEdgeManager.currentTimeMillis();
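
The arithmetic behind this reading can be checked with a short sketch. The three values are taken from the log and the assertion message above; the class and method names are hypothetical. The interpretation of the number embedded in the region name (1376777411816) as the region's creation timestamp is an assumption. `Instant` prints UTC, so the output is offset from the log's local wall-clock time by the build host's time zone.

```java
import java.time.Instant;

public class FlushTimestamps {
    /** Formats an epoch-millisecond value as an ISO-8601 UTC instant. */
    public static String toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    /** Milliseconds elapsed between two epoch-millisecond readings. */
    public static long gapMillis(long earlier, long later) {
        return later - earlier;
    }

    public static void main(String[] args) {
        long regionId = 1376777411816L;          // number embedded in the region name (assumed creation time)
        long reportedLastFlush = 1376777411967L; // "lastFlushTime" in the assertion message
        long flushFinished = 1376777418370L;     // "current" in the assertion message

        System.out.println("reported lastFlushTime: " + toUtc(reportedLastFlush));
        System.out.println("flush finished:         " + toUtc(flushFinished));
        // The reported value is within a fraction of a second of the region id,
        // which fits a value set during region initialization ...
        System.out.println("region id -> reported:  " + gapMillis(regionId, reportedLastFlush) + " ms");
        // ... and several seconds before the first flush actually finished.
        System.out.println("reported -> finished:   " + gapMillis(reportedLastFlush, flushFinished) + " ms");
    }
}
```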
          
          Hudson added a comment -

          SUCCESS: Integrated in hbase-0.95 #457 (See https://builds.apache.org/job/hbase-0.95/457/)
          HBASE-9023 TestIOFencing.testFencingAroundCompactionAfterWALSync occasionally fails (tedyu: rev 1514550)

          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #679 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/679/)
          HBASE-9023 TestIOFencing.testFencingAroundCompactionAfterWALSync occasionally fails (tedyu: rev 1514538)

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          Hudson added a comment -

          SUCCESS: Integrated in hbase-0.95-on-hadoop2 #246 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/246/)
          HBASE-9023 TestIOFencing.testFencingAroundCompactionAfterWALSync occasionally fails (tedyu: rev 1514550)

          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          Ted Yu added a comment -

          Integrated to 0.95 as well.

          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #4399 (See https://builds.apache.org/job/HBase-TRUNK/4399/)
          HBASE-9023 TestIOFencing.testFencingAroundCompactionAfterWALSync occasionally fails (tedyu: rev 1514538)

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          Ted Yu added a comment -

          Integrated to trunk.

          Will watch Jenkins builds for related test failures.

          Ted Yu added a comment -

          From https://builds.apache.org/job/PreCommit-HBASE-Build/6767/console :

          HBASE-9023 patch is being downloaded at Thu Aug 15 22:04:37 UTC 2013 from
          http://issues.apache.org/jira/secure/attachment/12598280/9023-v1.txt
          ...
          FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
          	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
          	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
          

          Don't know what caused the abort.

          Ted Yu added a comment -

          Thanks for the review.

          Plan to integrate after Hadoop QA reports back.

          "Leave the issue open meantime?"

          Sure.

          stack added a comment -

          +1 on trying the patch to see if it fixes the issue. Leave the issue open meantime?

          Ted Yu added a comment -

          The check against compactingRegion.getLastFlushTime() interprets the return value as the completion time of the most recent flush.
          Without the patch, this assumption does not hold: a flush may still be in progress when the test comes out of the while loop mentioned in my earlier comment.
          Patch v1 makes getLastFlushTime() represent the completion time of the last flush.

          stack added a comment -

          How does your patch 'fix' it Ted Yu?

          Ted Yu added a comment -

          Looped TestIOFencing 30 times on Linux and all passed.
          There was no hanging test.

          Ted Yu added a comment -

          Patch v1 moves the assignment of lastFlushTime to immediately before notifyAll().
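
The ordering the patch describes (record the timestamp only once the flush work is done, immediately before waking waiters) can be sketched with a plain Java monitor. This is not HRegion's code: `FlushNotifier`, `awaitFlushAfter`, and `isFlushDone` are hypothetical names, and the flush work itself is elided.

```java
public class FlushNotifier {
    private long lastFlushTime = 0L;   // guarded by "this"
    private boolean flushDone = false; // guarded by "this"

    /** Simulated flush: the timestamp is written last, right before notifyAll(). */
    public void flush() {
        // (The real flush work would run here, before the timestamp is touched.)
        synchronized (this) {
            flushDone = true;
            // Math.max keeps the value strictly increasing even within one millisecond.
            lastFlushTime = Math.max(System.currentTimeMillis(), lastFlushTime + 1);
            notifyAll();
        }
    }

    /** Blocks until lastFlushTime advances past 'previous'; false on timeout or interrupt. */
    public synchronized boolean awaitFlushAfter(long previous, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        try {
            while (lastFlushTime <= previous) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                wait(remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public synchronized boolean isFlushDone() {
        return flushDone;
    }

    public static void main(String[] args) {
        FlushNotifier region = new FlushNotifier();
        new Thread(region::flush).start();
        boolean completed = region.awaitFlushAfter(0L, 5000L);
        System.out.println("flush observed as complete: " + (completed && region.isFlushDone()));
    }
}
```

Because the timestamp and the completion flag change inside the same synchronized block, a waiter that sees the new timestamp also sees a finished flush.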

          Please comment.

          Ted Yu added a comment -

          The latest test failure was due to timing.
          TestIOFencing waits for flush to complete by checking:

                while (compactingRegion.getLastFlushTime() <= lastFlushTime) {
                  LOG.info("Waiting for the region to flush " + compactingRegion.getRegionNameAsString());
          

          In the test output, I don't see the above log message.
          HRegion#lastFlushTime is assigned at the beginning of internalFlushcache(). This means that the test might come out of the loop prematurely, while the flush is still in progress.

          stack added a comment - Failed this morning https://builds.apache.org/job/hbase-0.95/453/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/ Anyone want to take a look at this?
          stack added a comment - Failed this evening https://builds.apache.org/job/hbase-0.95/435/testReport/junit/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompaction/
          stack added a comment - Failed this evening: https://builds.apache.org/job/HBase-TRUNK/4331/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/
          stack added a comment - Failed this morning: https://builds.apache.org/job/HBase-TRUNK/4328/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/
          stack added a comment - Failed again: https://builds.apache.org/job/hbase-0.95/386/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/
          stack added a comment -

          "We are out of fds."

          This is not right. It is just a misreading of the way ResourceChecker logs. If we were out of fds we would see "Too many open files" in the logs.

          stack added a comment -

          It is probably just the overcommitted box:

          Thread LEAK? -, OpenFileDescriptor=174 (was 162) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=40000 (was 40000), SystemLoadAverage=351 (was 383), ProcessCount=142 (was 144), AvailableMemoryMB=819 (was 892), ConnectionCount=0 (was 0)
          

          We are out of fds.


            People

            • Assignee: Ted Yu
            • Reporter: stack
            • Votes: 0
            • Watchers: 4
