HBase / HBASE-8646

Intermittent TestIOFencing#testFencingAroundCompaction failure due to region getting stuck in compaction

    Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      From http://54.241.6.143/job/HBase-TRUNK/org.apache.hbase$hbase-server/348/testReport/junit/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompaction/ (the underlying region is tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.):

      2013-05-29 19:25:20,363 DEBUG [pool-1-thread-1] catalog.CatalogTracker(208): Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@6280d069
      2013-05-29 19:25:20,366 INFO  [pool-1-thread-1] hbase.TestIOFencing(255): Waiting for compaction to be about to start
      2013-05-29 19:25:20,367 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(107): waiting for compaction to block
      2013-05-29 19:25:20,367 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(109): compaction block reached
      2013-05-29 19:25:20,367 INFO  [pool-1-thread-1] hbase.TestIOFencing(257): Starting a new server
      2013-05-29 19:25:20,424 DEBUG [pool-1-thread-1] client.HConnectionManager(2811): regionserver/ip-10-197-74-184.us-west-1.compute.internal/10.197.74.184:0 HConnection server-to-server retries=100
      ...
      2013-05-29 19:25:20,861 INFO  [pool-1-thread-1] hbase.TestIOFencing(260): Killing region server ZK lease
      ...
      2013-05-29 19:25:21,030 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] handler.CloseRegionHandler(125): Processing close of tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
      2013-05-29 19:25:21,031 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] regionserver.HRegion(928): Closing tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.: disabling compactions & flushes
      2013-05-29 19:25:21,031 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] regionserver.HRegion(1022): waiting for 1 compactions to complete for region tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
      ...
      2013-05-29 19:25:27,037 INFO  [pool-1-thread-1] hbase.TestIOFencing(265): Waiting for the new server to pick up the region tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
      

      The test started a new region server. However, the region got stuck while closing, in:

        public void waitForFlushesAndCompactions() {
          synchronized (writestate) {
            while (writestate.compacting > 0 || writestate.flushing) {
              LOG.debug("waiting for " + writestate.compacting + " compactions"
                  + (writestate.flushing ? " & cache flush" : "") + " to complete for region " + this);
              try {
                writestate.wait();
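
      The hang can be reproduced in isolation. The following is a minimal sketch, not HBase code: CompactionWaitSketch, its WriteState class, and the closeCompletes helper are hypothetical names that only mimic the counter-plus-monitor pattern in the excerpt above. A "compaction" thread is parked on a latch, as the test's blocker region parks the real compaction, and the closing side waits for the counter to reach zero. Unlike the excerpt, the wait here is bounded so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;

final class CompactionWaitSketch {
    /** Hypothetical stand-in for the region's write state; not the HBase class. */
    static final class WriteState {
        int compacting = 0;
    }

    private final WriteState writestate = new WriteState();

    void beginCompaction() {
        synchronized (writestate) {
            writestate.compacting++;
        }
    }

    void endCompaction() {
        synchronized (writestate) {
            writestate.compacting--;
            writestate.notifyAll(); // wake any thread waiting for compactions to drain
        }
    }

    /** Waits up to timeoutMs for the counter to reach zero; true if it did. */
    boolean waitForCompactions(long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (writestate) {
            while (writestate.compacting > 0) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // an unbounded wait() would block here indefinitely
                }
                try {
                    writestate.wait(remainingMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    /** Runs one scenario: a compaction parked on a latch while the close path waits. */
    static boolean closeCompletes(boolean releaseCompaction, long timeoutMs) {
        CompactionWaitSketch region = new CompactionWaitSketch();
        CountDownLatch block = new CountDownLatch(1);
        region.beginCompaction();
        Thread compactor = new Thread(() -> {
            try {
                block.await(); // compaction is held here until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                region.endCompaction();
            }
        });
        compactor.setDaemon(true);
        compactor.start();
        if (releaseCompaction) {
            block.countDown();
        }
        boolean drained = region.waitForCompactions(timeoutMs);
        block.countDown(); // always release so the helper thread can exit
        return drained;
    }

    public static void main(String[] args) {
        System.out.println("released: " + closeCompletes(true, 2000));
        System.out.println("blocked:  " + closeCompletes(false, 200));
    }
}
```

      With the latch left closed, nothing ever decrements the counter, so the close path cannot make progress. That matches the "waiting for 1 compactions to complete" line in the log.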
      

      This led to the timeout:

              assertTrue("Timed out waiting for new server to open region",
                System.currentTimeMillis() - startWaitTime < 60000);
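
      The assertion above sits inside a polling loop that the report does not show. As a rough illustration of that kind of deadline check, here is a sketch under assumptions: PollWithDeadline and waitFor are hypothetical names, and the real test may be structured differently.

```java
import java.util.function.BooleanSupplier;

final class PollWithDeadline {
    /**
     * Polls the condition every pollMs milliseconds until it holds or
     * timeoutMs has elapsed. Returns true if the condition became true in time.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long startWaitTime = System.currentTimeMillis();
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() - startWaitTime >= timeoutMs) {
                return false; // deadline passed, e.g. the region never opened on the new server
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

      A caller would pass a check such as "the new server reports the region as open" as the condition, with 60000 as the timeout to mirror the value in the assertion.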
      


            People

            • Assignee: Enis Soztutar (enis)
            • Reporter: Ted Yu (yuzhihong@gmail.com)
            • Votes: 0
            • Watchers: 4
