HBase / HBASE-12769

Replication fails to delete all corresponding zk nodes when peer is removed

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.99.2
    • Fix Version/s: 1.3.0, 2.0.0
    • Component/s: Replication
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When a peer is removed, the client deletes the peerId node under peersZNode; live region servers are then notified and delete the corresponding hlog queues under their own rsZNode in the replication znode tree. However, if there are failed servers whose hlog queues have not yet been taken over by live servers (this is likely when "replication.sleep.before.failover" is set to a large value and many region servers have restarted), those hlog queues won't be deleted after the peer is removed. I think remove_peer should guarantee that all corresponding zk nodes have been removed by the time it completes; otherwise, if we create a new peer with the same peerId as the removed one, unexpected data might be replicated.
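
The cleanup proposed above could be sketched roughly as follows. This is a minimal in-memory model, not the attached patch: the znode tree is represented as a map from region server name to its set of queue ids, and the class PeerQueueCleanup with its methods belongsToPeer and removePeerQueues are hypothetical names chosen for illustration. It also assumes that a queue is named "<peerId>" on its original server and "<peerId>-<failedServer>..." once taken over from a failed server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of the replication queue znodes.
public class PeerQueueCleanup {

    // Assumption: a queue id is either "<peerId>" or "<peerId>-<failedServer>...".
    // Matching on "<peerId>-" avoids confusing peer "1" with peer "10".
    static boolean belongsToPeer(String queueId, String peerId) {
        return queueId.equals(peerId) || queueId.startsWith(peerId + "-");
    }

    // Removes the peer's queues under every region server entry, including
    // entries left behind by failed servers. Returns the removed paths.
    static List<String> removePeerQueues(Map<String, Set<String>> queuesByServer,
                                         String peerId) {
        List<String> removed = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : queuesByServer.entrySet()) {
            Iterator<String> it = entry.getValue().iterator();
            while (it.hasNext()) {
                String queueId = it.next();
                if (belongsToPeer(queueId, peerId)) {
                    it.remove();
                    removed.add(entry.getKey() + "/" + queueId);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Illustrative data only: one live server and one failed server
        // whose queues have not been taken over yet.
        Map<String, Set<String>> queuesByServer = new TreeMap<>();
        queuesByServer.put("rs-alive", new TreeSet<>(Arrays.asList("1", "10")));
        queuesByServer.put("rs-failed", new TreeSet<>(Arrays.asList("1-rs-old", "2")));

        System.out.println("removed:   " + removePeerQueues(queuesByServer, "1"));
        System.out.println("remaining: " + queuesByServer);
    }
}
```

The point of the sketch is that the scan covers all server entries rather than relying on each live server to clean up only its own queues, so queues stranded under failed servers are not left behind for a later peer with the same peerId.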

      Attachments

        1. 12769-branch-1-v5.txt
          25 kB
          Ted Yu
        2. 12769-v6.txt
          25 kB
          Jianwei Cui
        3. 12769-v5.txt
          25 kB
          Jianwei Cui
        4. 12769-v4.txt
          25 kB
          Ted Yu
        5. 12769-v3.txt
          25 kB
          Ted Yu
        6. 12769-v2.txt
          25 kB
          Ted Yu
        7. HBASE-12769-trunk-v1.patch
          19 kB
          Jianwei Cui
        8. HBASE-12769-trunk-v0.patch
          25 kB
          Jianwei Cui

            People

              Assignee: Jianwei Cui (cuijianwei)
              Reporter: Jianwei Cui (cuijianwei)
              Votes: 0
              Watchers: 13