HBASE-9024

TestLogRolling fails/goes zombie

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • None
    • None
    • Component/s: test
    • None

    Description

      TestLogRolling.testLogRollOnPipelineRestart failed on hadoop1 and went zombie; see https://builds.apache.org/job/hbase-0.95/352/consoleText
      From the double thread dump at the end:

      "pool-1-thread-1" prio=10 tid=0x73f9dc00 nid=0x3a34 in Object.wait() [0x7517d000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
      	at java.lang.Object.wait(Native Method)
      	- waiting on <0xcf624ad0> (a java.util.concurrent.atomic.AtomicLong)
      	at org.apache.hadoop.hbase.client.AsyncProcess.waitForNextTaskDone(AsyncProcess.java:634)
      	- locked <0xcf624ad0> (a java.util.concurrent.atomic.AtomicLong)
      	at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:659)
      	at org.apache.hadoop.hbase.client.AsyncProcess.waitUntilDone(AsyncProcess.java:670)
      	at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:813)
      	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1170)
      	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:753)
      	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.doPut(TestLogRolling.java:640)
      	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.writeData(TestLogRolling.java:248)
      	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnPipelineRestart(TestLogRolling.java:515)
      

      ... we are stuck here.
      The math in these wait methods looks like it could go wonky. But the test output shows that when this test ran we got this:

      2013-07-23 01:23:29,560 INFO [pool-1-thread-1] hbase.HBaseTestingUtility(922): Minicluster is down
      2013-07-23 01:23:29,574 INFO [pool-1-thread-1] hbase.ResourceChecker(171): after: regionserver.wal.TestLogRolling#testLogRollOnPipelineRestart Thread=39 (was 31) - Thread LEAK? -, OpenFileDescriptor=312 (was 272) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=40000 (was 40000), SystemLoadAverage=351 (was 368), ProcessCount=144 (was 142) - ProcessCount LEAK? -, AvailableMemoryMB=906 (was 1995), ConnectionCount=0 (was 0)
      

      This test has a history of failures; see HBASE-5995, where it was fixed and re-enabled once. The thinking was that it was a hadoop2 issue, but the failure cited here is on hadoop1.
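
      For readers unfamiliar with the pattern in the stack trace above (a thread in a timed Object.wait() on an AtomicLong monitor), here is a minimal sketch of a counter-based wait loop. It is not the HBase AsyncProcess code: the class name, the tasksSent/tasksDone counters, and the timeout parameter are all hypothetical and chosen for illustration. The point is only that if a completion is never recorded, "sent minus done" never drops and a loop without a deadline keeps cycling through TIMED_WAITING forever.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; names are hypothetical, not taken from HBase.
public class TaskCounterSketch {
    private final AtomicLong tasksSent = new AtomicLong();
    private final AtomicLong tasksDone = new AtomicLong();

    /** Record that a task was submitted. */
    public void taskSent() {
        tasksSent.incrementAndGet();
    }

    /** Record a completion and wake any thread waiting on the counter. */
    public void taskDone() {
        tasksDone.incrementAndGet();
        synchronized (tasksDone) {
            tasksDone.notifyAll();
        }
    }

    /**
     * Waits until at most max tasks are outstanding, or until timeoutMillis
     * has elapsed. Returns true if the target was reached, false on timeout.
     * Without the deadline check, a lost completion would make this loop
     * wait indefinitely in 100 ms slices.
     */
    public boolean waitForMaximumOutstanding(int max, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (tasksSent.get() - tasksDone.get() > max) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false; // give up instead of hanging
            }
            synchronized (tasksDone) {
                // Re-check under the lock so a completion that arrived
                // just before we took the lock is not slept through.
                if (tasksSent.get() - tasksDone.get() > max) {
                    tasksDone.wait(Math.min(100L, remainingMillis));
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        TaskCounterSketch counter = new TaskCounterSketch();
        counter.taskSent();
        counter.taskSent();
        counter.taskDone(); // only one of the two completions is recorded
        boolean drained = counter.waitForMaximumOutstanding(0, 500);
        System.out.println("drained=" + drained); // prints drained=false
    }
}
```

      With a deadline the caller gets a false return it can turn into a test failure; whether that is the right fix for this test is a separate question.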

            People

              Assignee: Unassigned
              Reporter: Michael Stack (stack)