HBase / HBASE-4416

OpenedRegionHandler running for a dead assignment will kill the master

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.90.4
    • Fix Version/s: 0.90.5
    • Component/s: None
    • Labels: None

    Description

      It goes like this:

      • Master balances region R from server A to server B
      • B reports OPENED of R
      • Master queues an OpenedRegionHandler for R on B
      • B dies (sad)
      • Master processes the log splits for B and reassigns R to C
      • C reports OPENED of R
      • Master queues an OpenedRegionHandler for R on C
      • OpenedRegionHandler for R on B is finally processed, but leaves the region in a weird state (log #1)
      • OpenedRegionHandler for R on C is processed, fails when it tries to delete the znode, kills the master (log #2)

      If the master didn't commit seppuku, it would have had a wrong view of the state of that region because it wouldn't link it to the region server that really opened it in the end.
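
      The sequence above can be replayed in a toy model. This is only a sketch: the class, fields, and methods are hypothetical stand-ins, not HBase code. It also assumes that the stale handler for B is what removed the znode, which the logs suggest but do not show directly.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Toy model only: none of these names are real HBase classes or methods.
class StaleOpenedHandlerDemo {
    final Set<String> onlineServers = new HashSet<>();
    final Map<String, String> regionLocation = new HashMap<>(); // master's view: region -> server
    final Set<String> openedZnodes = new HashSet<>();           // regions with an OPENED znode
    final Queue<String[]> handlers = new ArrayDeque<>();        // queued {region, server} pairs
    boolean aborted = false;

    // A region server reports OPENED: the znode is in OPENED state and a handler is queued.
    void reportOpened(String region, String server) {
        openedZnodes.add(region);
        handlers.add(new String[] {region, server});
    }

    void serverDies(String server) {
        onlineServers.remove(server);
    }

    // Processes the oldest queued handler with no staleness check (the assumed buggy behavior).
    void processNextHandler() {
        String[] h = handlers.remove();
        String region = h[0];
        String server = h[1];
        if (!openedZnodes.remove(region)) {
            aborted = true; // znode already gone: corresponds to the FATAL in log #2
            return;
        }
        if (onlineServers.contains(server)) {
            regionLocation.put(region, server);
        }
        // Otherwise the server is not online and the region gets no valid location (log #1).
    }

    // Replays the steps from the description in order.
    static StaleOpenedHandlerDemo replay() {
        StaleOpenedHandlerDemo m = new StaleOpenedHandlerDemo();
        m.onlineServers.add("B");
        m.onlineServers.add("C");
        m.reportOpened("R", "B"); // B reports OPENED of R
        m.serverDies("B");        // B dies
        m.reportOpened("R", "C"); // R is reassigned and C reports OPENED
        m.processNextHandler();   // stale handler for R on B
        m.processNextHandler();   // handler for R on C
        return m;
    }

    public static void main(String[] args) {
        StaleOpenedHandlerDemo m = replay();
        System.out.println("master aborted: " + m.aborted);
        System.out.println("master's location for R: " + m.regionLocation.get("R"));
    }
}
```

      In this model the second handler aborts, and R is never linked to C, which matches the wrong view of the region described above.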

      I'm not sure how I would go about fixing this though...
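
      One possible direction, sketched in a toy model, is to have a handler skip its work when the server it was queued for is no longer online. This is an assumption for illustration only: it is not taken from HBase, it is not necessarily how the project resolved the issue, and every name in it is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Toy model only: none of these names are real HBase classes or methods.
class GuardedOpenedHandlerDemo {
    final Set<String> onlineServers = new HashSet<>();
    final Map<String, String> regionLocation = new HashMap<>(); // master's view: region -> server
    final Set<String> openedZnodes = new HashSet<>();           // regions with an OPENED znode
    final Queue<String[]> handlers = new ArrayDeque<>();        // queued {region, server} pairs
    boolean aborted = false;

    void reportOpened(String region, String server) {
        openedZnodes.add(region);
        handlers.add(new String[] {region, server});
    }

    void processNextHandler() {
        String[] h = handlers.remove();
        String region = h[0];
        String server = h[1];
        // Assumed guard: a handler queued for a server that is no longer online
        // is treated as stale and returns without touching the znode.
        if (!onlineServers.contains(server)) {
            return;
        }
        if (!openedZnodes.remove(region)) {
            aborted = true;
            return;
        }
        regionLocation.put(region, server);
    }

    // Same sequence of events as in the description.
    static GuardedOpenedHandlerDemo replay() {
        GuardedOpenedHandlerDemo m = new GuardedOpenedHandlerDemo();
        m.onlineServers.add("B");
        m.onlineServers.add("C");
        m.reportOpened("R", "B");
        m.onlineServers.remove("B"); // B dies
        m.reportOpened("R", "C");
        m.processNextHandler();      // stale handler for B is skipped
        m.processNextHandler();      // handler for C finds the znode and completes
        return m;
    }

    public static void main(String[] args) {
        GuardedOpenedHandlerDemo m = replay();
        System.out.println("master aborted: " + m.aborted);
        System.out.println("master's location for R: " + m.regionLocation.get("R"));
    }
}
```

      In this model the master stays up and links R to C. The guard only covers the case where the original server is dead; a handler for a server that is still alive but no longer hosts the region would need a different check.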

      Log 1:

      2011-09-15 01:57:47,430 INFO org.apache.hadoop.hbase.master.AssignmentManager: The server is not in online servers, ServerName=sv4r23s44,60020,1316076811761, region=6e22b45f4288ea4d73f612ccf111aea6
      2011-09-15 01:57:47,430 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 6e22b45f4288ea4d73f612ccf111aea6 on sv4r23s44,60020,1316076811761

      Log 2:

      2011-09-15 01:58:10,171 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x231d7b21aba0480 Deleting existing unassigned node for 6e22b45f4288ea4d73f612ccf111aea6 that is in expected state RS_ZK_REGION_OPENED
      2011-09-15 01:58:10,204 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x231d7b21aba0480 Unable to get data of znode /hbase/unassigned/6e22b45f4288ea4d73f612ccf111aea6 because node does not exist (not necessarily an error)
      2011-09-15 01:58:10,204 FATAL org.apache.hadoop.hbase.master.HMaster: Error deleting OPENED node in ZK for transition ZK node (6e22b45f4288ea4d73f612ccf111aea6)

          People

            Assignee: Unassigned
            Reporter: Jean-Daniel Cryans (jdcryans)