HBASE-5063: RegionServers fail to report to backup HMaster after primary goes down.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.92.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      1. Set up a cluster with two HMasters.
      2. Observe that HM1 is up and that all RegionServers are in the RegionServer list on the web page.
      3. Kill the active HMaster (a plain kill is enough, not even kill -9).
      4. Wait for the ZooKeeper session to time out (default 3 minutes).
      5. Observe that HM2 is now active. Tables may show up, but the RegionServers never report in on the web page. Existing connections are fine. New connections cannot find RegionServers.

      Note:

      • If we start a new HM1 in the same place and kill HM2, the cluster functions normally again after recovery. This seems to indicate that the RegionServers are stuck trying to talk to the old HM1.
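
The suspected failure mode can be sketched as a toy model. This is not the HBase code and not the attached patch; it only illustrates how a cached master address would produce the symptoms above. All names here (`StaleMasterSketch`, `CachingRegionServer`, `RefreshingRegionServer`, `activeMasterZnode`, the `hm1:60000` / `hm2:60000` addresses) are hypothetical, and the `AtomicReference` merely stands in for whatever ZooKeeper records as the active master.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StaleMasterSketch {
    // Stand-in for the ZooKeeper entry naming the active master (assumption).
    static final AtomicReference<String> activeMasterZnode = new AtomicReference<>();

    // Stand-in for the report RPC: it succeeds only when sent to the active master.
    static boolean report(String masterAddress) {
        return masterAddress != null && masterAddress.equals(activeMasterZnode.get());
    }

    // Hypothetical RegionServer that looks the master up once and keeps that address.
    static class CachingRegionServer {
        private String cachedMaster;

        boolean tryReport() {
            if (cachedMaster == null) {
                cachedMaster = activeMasterZnode.get();
            }
            return report(cachedMaster); // never re-resolves after a failure
        }
    }

    // Hypothetical RegionServer that re-reads the master address after a failed report.
    static class RefreshingRegionServer {
        private String cachedMaster;

        boolean tryReport() {
            if (cachedMaster == null) {
                cachedMaster = activeMasterZnode.get();
            }
            if (report(cachedMaster)) {
                return true;
            }
            cachedMaster = activeMasterZnode.get(); // drop the stale address and retry
            return report(cachedMaster);
        }
    }

    public static void main(String[] args) {
        CachingRegionServer caching = new CachingRegionServer();
        RefreshingRegionServer refreshing = new RefreshingRegionServer();

        activeMasterZnode.set("hm1:60000"); // HM1 is active
        System.out.println(caching.tryReport() + " " + refreshing.tryReport()); // true true

        activeMasterZnode.set("hm2:60000"); // HM1 killed, HM2 takes over
        System.out.println(caching.tryReport() + " " + refreshing.tryReport()); // false true

        activeMasterZnode.set("hm1:60000"); // new HM1 in the same place, HM2 killed
        System.out.println(caching.tryReport() + " " + refreshing.tryReport()); // true true
    }
}
```

In this model the caching variant fails only while HM2 is active and recovers as soon as a master reappears at the old HM1 address, which matches the behaviour described in the note. Whether the real RegionServer behaves this way is exactly what the issue is asking.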

      Attachments

        1. HBASE-5063.patch
          1 kB
          Jonathan Hsieh
        2. hbase-5063.v2.0.92.patch
          3 kB
          Jonathan Hsieh
        3. hbase-5063.v2.trunk.patch
          3 kB
          Jonathan Hsieh

          People

            Assignee: Jonathan Hsieh (jmhsieh)
            Reporter: Jonathan Hsieh (jmhsieh)
            Votes: 0
            Watchers: 2
