  1. HBase
  2. HBASE-21744

timeout for server list refresh calls

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha-1, 2.2.0
    • Fix Version/s: None
    • Component/s: Zookeeper
    • Labels:
      None

      Description

      Not sure why yet, but when the cluster is in an overall bad state, we are seeing cases where, after an RS dies and its znode is deleted, the notification appears to be lost, so the master doesn't detect the failure. ZK itself appears to be healthy and doesn't report anything unusual.
      After some other change is made to the server list, the master rescans the list and picks up the stale change. It might make sense to add a config that would trigger a refresh if one hasn't happened for a while (e.g. 1 minute).
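
      A minimal sketch of the proposed behavior, not taken from the attached patches: a guard that forces a rescan when none has happened within a configurable interval. The class name, method names, and the injected clock are all hypothetical, and the refresh action stands in for whatever the master does to re-read the server list from ZK.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not an HBase class): forces a server list refresh
 * if no refresh has been recorded for longer than a configured interval.
 */
final class ServerListRefreshGuard {
  private final long maxAgeNanos;
  private final LongSupplier nanoClock;   // injected so the logic can be tested
  private final Runnable refreshAction;   // assumed to rescan the server list
  private final AtomicLong lastRefreshNanos;

  ServerListRefreshGuard(long maxAge, TimeUnit unit,
      LongSupplier nanoClock, Runnable refreshAction) {
    this.maxAgeNanos = unit.toNanos(maxAge);
    this.nanoClock = nanoClock;
    this.refreshAction = refreshAction;
    this.lastRefreshNanos = new AtomicLong(nanoClock.getAsLong());
  }

  /** Call after a regular, notification-driven rescan completes. */
  void markRefreshed() {
    lastRefreshNanos.set(nanoClock.getAsLong());
  }

  /**
   * Call periodically (e.g. from a scheduled chore).
   * Returns true if this call ran a forced refresh.
   */
  boolean refreshIfStale() {
    long now = nanoClock.getAsLong();
    long last = lastRefreshNanos.get();
    if (now - last < maxAgeNanos) {
      return false; // a refresh happened recently enough
    }
    // compareAndSet ensures only one concurrent caller forces the refresh.
    if (!lastRefreshNanos.compareAndSet(last, now)) {
      return false;
    }
    refreshAction.run();
    return true;
  }
}
```

      In this sketch the normal watcher path would call markRefreshed(), so the forced rescan only fires when notifications have stopped arriving for the whole interval. In production code the clock would typically be System::nanoTime.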

        Attachments

        1. HBASE-21744.patch
          5 kB
          Sergey Shelukhin
        2. HBASE-21744.01.patch
          10 kB
          Sergey Shelukhin
        3. HBASE-21744.02.patch
          10 kB
          Sergey Shelukhin
        4. HBASE-21744.03.patch
          11 kB
          Sergey Shelukhin
        5. HBASE-21744.04.patch
          11 kB
          Sergey Shelukhin

              People

              • Assignee:
                sershe Sergey Shelukhin
                Reporter:
                sershe Sergey Shelukhin
              • Votes:
                0
                Watchers:
                4

                Dates

                • Created:
                  Updated: