HBASE-12467: Master joins cluster but never completes initialization

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0, 0.98.9
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      While diagnosing a rare failure in IntegrationTestLoadAndVerify, I discovered this scenario. The master was restarted by CM. Upon rejoining the cluster it successfully assumed responsibility as active master, but the finishInitialization method apparently never completed. The last log line from that thread is:

      2014-11-10 17:01:29,940 INFO  [master:ip-172-31-9-135:60000] master.HMaster: hbase:meta with replicaId 0 assigned=0, rit=false, location=ip-172-31-9-136.ec2.internal,60020,1415638551951
      

      I see region states populated from existing znodes. The AM inventoried the online regions and acknowledged that this was a master failover. There it sits, responding to RPCs with PleaseHoldException: Master is initializing.

      For the sake of resiliency, we should detect this scenario and at least release control as active master.
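
      One way to picture the detection proposed above is a watchdog that runs alongside initialization and fires a callback if the "initialized" flag is still unset after a deadline. The sketch below is illustrative only and is not the attached patch: the class name InitializationWatchdog, the timeout parameter, and the onTimeout callback (which would stand in for releasing the active-master role) are all assumptions.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical watchdog for a stuck master initialization.
 * All names here are illustrative assumptions, not HBase's API.
 */
public class InitializationWatchdog implements Runnable {
    private final AtomicBoolean initialized;  // set to true by the init thread when it finishes
    private final long timeoutMillis;         // how long initialization may take
    private final Runnable onTimeout;         // e.g. give up the active-master role
    private final AtomicBoolean timedOut = new AtomicBoolean(false);

    public InitializationWatchdog(AtomicBoolean initialized, long timeoutMillis, Runnable onTimeout) {
        this.initialized = initialized;
        this.timeoutMillis = timeoutMillis;
        this.onTimeout = onTimeout;
    }

    @Override
    public void run() {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            // Poll until initialization completes or the deadline passes.
            while (!initialized.get()) {
                if (System.nanoTime() - deadline >= 0) {
                    timedOut.set(true);
                    onTimeout.run();
                    return;
                }
                Thread.sleep(Math.min(50L, timeoutMillis));
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and stop watching.
            Thread.currentThread().interrupt();
        }
    }

    /** True if the deadline passed before initialization completed. */
    public boolean timedOut() {
        return timedOut.get();
    }
}
```

      In a real master this would run on a daemon thread started just before initialization begins, and the callback would have to decide between logging, stepping down, or aborting the process. Polling a flag keeps the sketch simple; it cannot tell a slow initialization from a hung one, so the timeout would need to be generous.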

        Attachments

        1. HBASE-12467.00.patch
          5 kB
          Nick Dimiduk
        2. HBASE-12467.00.patch
          4 kB
          Nick Dimiduk
        3. HBASE-12467.01.patch
          5 kB
          Nick Dimiduk
        4. HBASE-12467.01.patch
          6 kB
          Nick Dimiduk

            People

            • Assignee:
              ndimiduk Nick Dimiduk
              Reporter:
              ndimiduk Nick Dimiduk
            • Votes:
              0
              Watchers:
              8