HBase / HBASE-10767

Load balancer may interfere with tests in TestHBaseFsck

    Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.1, 0.99.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When analyzing test failure shown in https://builds.apache.org/job/HBase-TRUNK/5018/testReport/org.apache.hadoop.hbase.util/TestHBaseFsck/testHbckThreadpooling/ , I saw the following events:

      2014-03-16 04:02:05,721 INFO  [juno.apache.org,41286,1394942220308-BalancerChore] master.HMaster(1490): balance hri=NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14., src=juno.apache.org,37923,1394942221817, dest=juno.apache.org,57897,1394942221876
      2014-03-16 04:02:05,721 DEBUG [juno.apache.org,41286,1394942220308-BalancerChore] master.AssignmentManager(2239): Starting unassign of NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14. (offlining), current state: {f66b7b32c7bfed7cc8637ee9b033ef14 state=OPEN, ts=1394942287774, server=juno.apache.org,37923,1394942221817}
      ...
      2014-03-16 04:02:05,742 DEBUG [juno.apache.org,41286,1394942220308-BalancerChore] master.AssignmentManager(1704): Offline NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14., it's not any more on juno.apache.org,37923,1394942221817
      org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: The region f66b7b32c7bfed7cc8637ee9b033ef14 is not online, and is not opening.
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2617)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3796)
      ...
      2014-03-16 04:02:05,754 DEBUG [juno.apache.org,41286,1394942220308-BalancerChore] master.AssignmentManager(2191): No previous transition plan found (or ignoring an existing plan) for NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14.; generated random plan=hri=NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14., src=, dest=juno.apache.org,57897,1394942221876; 3 (online=3, available=3) available servers, forceNewPlan=false
      ...
      2014-03-16 04:02:05,786 DEBUG [RS_OPEN_REGION-juno:57897-0] regionserver.HRegion(4402): Opening region: {ENCODED => f66b7b32c7bfed7cc8637ee9b033ef14, NAME => 'NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14.', STARTKEY => '', ENDKEY => 'A'}
      ...
      2014-03-16 04:02:05,787 DEBUG [RS_OPEN_REGION-juno:57897-0] regionserver.HRegion(563): Instantiated NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14.
      ...
      2014-03-16 04:02:06,862 DEBUG [pool-1-thread-1] util.HBaseFsck(1452): Loading region dirs from hdfs://localhost:48141/user/jenkins/hbase/data/default/NoHdfsTable
      

      The load balancer tried to move region NoHdfsTable,,1394942287079.f66b7b32c7bfed7cc8637ee9b033ef14., possibly because of a region plan generated while NoHdfsTable still existed.
      However, juno.apache.org,37923,1394942221817 no longer served this region (the close attempt failed with NotServingRegionException), so the balancer moved the region to juno.apache.org,57897,1394942221876.
      At 04:02:05,787, the region was instantiated on juno.apache.org,57897,1394942221876.
      About a second later, at 04:02:06,862, the region was picked up by loadHdfsRegionDirs, called from HBaseFsck.
      This created an unexpected empty HbckInfo via the call to getOrCreateInfo.
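
      The effect of the get-or-create lookup described above can be illustrated with a small, self-contained model. This is only a sketch and not the HBaseFsck source: the class GetOrCreateSketch, the RegionEntry type, and the methods loadFromMeta, loadFromHdfsDir and hdfsOnlyRegions are hypothetical names introduced for illustration. It assumes, as one plausible reading of the description, that the unexpected entry is one created from a region directory with no matching catalog record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified model of per-region bookkeeping; not HBaseFsck code. */
public class GetOrCreateSketch {

    /** Stand-in for an HbckInfo-like record: notes where a region was observed. */
    public static final class RegionEntry {
        final String encodedName;
        boolean seenInMeta;  // region is listed in the catalog
        boolean seenOnHdfs;  // a region directory was found on the file system

        RegionEntry(String encodedName) {
            this.encodedName = encodedName;
        }
    }

    private final Map<String, RegionEntry> regions = new TreeMap<>();

    /** Get-or-create lookup: asking about an unknown region silently adds an entry. */
    public RegionEntry getOrCreateInfo(String encodedName) {
        return regions.computeIfAbsent(encodedName, RegionEntry::new);
    }

    public void loadFromMeta(String encodedName) {
        getOrCreateInfo(encodedName).seenInMeta = true;
    }

    public void loadFromHdfsDir(String encodedName) {
        getOrCreateInfo(encodedName).seenOnHdfs = true;
    }

    public int size() {
        return regions.size();
    }

    /** Regions that exist only because a directory scan created them. */
    public List<String> hdfsOnlyRegions() {
        List<String> result = new ArrayList<>();
        for (RegionEntry e : regions.values()) {
            if (e.seenOnHdfs && !e.seenInMeta) {
                result.add(e.encodedName);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        GetOrCreateSketch fsck = new GetOrCreateSketch();
        // A region the current test expects: present in both places.
        fsck.loadFromMeta("expectedRegion");
        fsck.loadFromHdfsDir("expectedRegion");
        // A leftover region whose directory reappeared after a concurrent move.
        fsck.loadFromHdfsDir("f66b7b32c7bfed7cc8637ee9b033ef14");
        System.out.println("entries: " + fsck.size());
        System.out.println("hdfs-only: " + fsck.hdfsOnlyRegions());
    }
}
```

      In this model the directory scan never fails on an unknown region; it simply adds a sparsely populated entry, so a test that counts or inspects entries can see one it did not set up.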

        Attachments

        1. 10767-v1.txt
          0.6 kB
          Ted Yu

            People

            • Assignee:
              yuzhihong@gmail.com Ted Yu
              Reporter:
              yuzhihong@gmail.com Ted Yu