Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-9441

Intermittent TestRegionPlacement#testRegionPlacement failure

    XMLWordPrintableJSON

Details

    • Test
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • None
    • 0.98.0, 0.96.0
    • None
    • None
    • Reviewed

    Description

      Occasionally TestRegionPlacement#testRegionPlacement fails due to no region found on secondary / tertiary region server.
      In the following test output, 10359edcfb1a761f94ab6d5b559c2278 was the region from the killed region server:

      2013-09-04 23:28:25,127 INFO  [pool-1-thread-1] util.FSUtils(1878): Scan DFS for locality info takes 38 ms
      Region Assignment Verification for Table: testRegionAssignment
              Total regions : 10
              Total regions on favored nodes 10
                      Total regions on PRIMARY region servers: 10
                      Total regions on SECONDARY region servers: 0
                      Total regions on TERTIARY region servers: 0
      ...
      2013-09-04 23:28:25,130 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(312): master:37502-0x140eb4dc4ef0000 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/region-in-transition/10359edcfb1a761f94ab6d5b559c2278
      2013-09-04 23:28:25,130 DEBUG [RS_OPEN_REGION-kiyo:49841-1] zookeeper.ZKAssign(861): regionserver:49841-0x140eb4dc4ef0007 Transitioned node 10359edcfb1a761f94ab6d5b559c2278 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
      2013-09-04 23:28:25,131 DEBUG [RS_OPEN_REGION-kiyo:49841-1] handler.OpenRegionHandler(386): Transitioned 10359edcfb1a761f94ab6d5b559c2278 to OPENED in zk on kiyo.gq1.ygridcore.net,49841,1378337278470
      2013-09-04 23:28:25,131 DEBUG [RS_OPEN_REGION-kiyo:49841-1] handler.OpenRegionHandler(187): Opened hbase:namespace,,1378337284888.10359edcfb1a761f94ab6d5b559c2278. on kiyo.gq1.ygridcore.net,49841,1378337278470
      2013-09-04 23:28:25,132 DEBUG [AM.ZK.Worker-pool2-t6] master.AssignmentManager(801): Handling transition=RS_ZK_REGION_OPENED, server=kiyo.gq1.ygridcore.net,49841,1378337278470, region=10359edcfb1a761f94ab6d5b559c2278, current state from region state map ={10359edcfb1a761f94ab6d5b559c2278 state=OPENING, ts=1378337305025, server=kiyo.gq1.ygridcore.net,49841,1378337278470}
      2013-09-04 23:28:25,132 INFO  [AM.ZK.Worker-pool2-t6] master.RegionStates(263): Transitioned from {10359edcfb1a761f94ab6d5b559c2278 state=OPENING, ts=1378337305025, server=kiyo.gq1.ygridcore.net,49841,1378337278470} to {10359edcfb1a761f94ab6d5b559c2278 state=OPEN, ts=1378337305132, server=kiyo.gq1.ygridcore.net,49841,1378337278470}
      2013-09-04 23:28:25,132 DEBUG [AM.ZK.Worker-pool2-t6] handler.OpenedRegionHandler(150): Handling OPENED event for 10359edcfb1a761f94ab6d5b559c2278 from kiyo.gq1.ygridcore.net,49841,1378337278470; deleting unassigned node
      2013-09-04 23:28:25,132 DEBUG [AM.ZK.Worker-pool2-t6] zookeeper.ZKAssign(405): master:37502-0x140eb4dc4ef0000 Deleting existing unassigned node 10359edcfb1a761f94ab6d5b559c2278 in expected state RS_ZK_REGION_OPENED
      2013-09-04 23:28:25,188 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): after: master.TestRegionPlacement#testRegionPlacement Thread=873 (was 814)
      Potentially hanging thread: RS_OPEN_REGION-kiyo:53373-2
      ...
       - Thread LEAK? -, OpenFileDescriptor=1061 (was 1033) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=32768 (was 32768), SystemLoadAverage=549 (was 609), ProcessCount=489 (was 489), AvailableMemoryMB=9288 (was 9107) - AvailableMemoryMB LEAK? -, ConnectionCount=23 (was 22) - ConnectionCount LEAK? -
      2013-09-04 23:28:25,189 WARN  [pool-1-thread-1] hbase.ResourceChecker(134): Thread=873 is superior to 500
      2013-09-04 23:28:25,190 WARN  [pool-1-thread-1] hbase.ResourceChecker(134): OpenFileDescriptor=1061 is superior to 1024
      2013-09-04 23:28:25,223 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(312): master:37502-0x140eb4dc4ef0000 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/region-in-transition/10359edcfb1a761f94ab6d5b559c2278
      2013-09-04 23:28:25,224 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(312): master:37502-0x140eb4dc4ef0000 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/region-in-transition
      2013-09-04 23:28:25,224 DEBUG [AM.ZK.Worker-pool2-t6] zookeeper.ZKAssign(434): master:37502-0x140eb4dc4ef0000 Deleted unassigned node 10359edcfb1a761f94ab6d5b559c2278 in expected state RS_ZK_REGION_OPENED
      2013-09-04 23:28:25,225 DEBUG [AM.ZK.Worker-pool2-t6] master.AssignmentManager$4(1276): Znode hbase:namespace,,1378337284888.10359edcfb1a761f94ab6d5b559c2278. deleted, state: {10359edcfb1a761f94ab6d5b559c2278 state=OPEN, ts=1378337305132, server=kiyo.gq1.ygridcore.net,49841,1378337278470}
      2013-09-04 23:28:25,225 INFO  [AM.ZK.Worker-pool2-t6] master.RegionStates(299): Onlined 10359edcfb1a761f94ab6d5b559c2278 on kiyo.gq1.ygridcore.net,49841,1378337278470
      

      Looks like region placement report was obtained when the region was still in transition.

      Attachments

        1. org.apache.hadoop.hbase.master.TestRegionPlacement-output.txt
          402 kB
          Ted Yu
        2. 9441-v6.txt
          5 kB
          Ted Yu
        3. 9441-v5.txt
          6 kB
          Ted Yu
        4. 9441-v4.txt
          5 kB
          Ted Yu
        5. 9441-v3.txt
          4 kB
          Ted Yu
        6. 9441-v2.txt
          5 kB
          Ted Yu
        7. 9441-v1.txt
          3 kB
          Ted Yu

        Activity

          People

            yuzhihong@gmail.com Ted Yu
            yuzhihong@gmail.com Ted Yu
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: