  1. HBase
  2. HBASE-21325

Force-terminate the regionserver when abort hangs somewhere


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1, 2.2.0, 2.1.1, 2.0.2
    • Fix Version/s: 3.0.0-alpha-1, 1.5.0, 2.2.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Adds two new configuration properties, hbase.regionserver.abort.timeout and hbase.regionserver.abort.timeout.task. If a region server abort does not finish within the timeout, an abort timeout task is scheduled to run. The default task is SystemExitWhenAbortTimeout, which forcibly terminates the region server when the abort times out. A custom abort timeout task can be configured through hbase.regionserver.abort.timeout.task.
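The mechanism in the release note can be pictured as a watchdog timer armed when the abort begins. The following is a minimal sketch under stated assumptions, not the code from the patch: the class AbortWatchdogSketch, the method abortWithTimeout, and its parameters are hypothetical names, and the timeout action is injected as a Runnable so the sketch can be exercised without killing the JVM.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these names come from HBase.
public class AbortWatchdogSketch {

  /**
   * Runs abortWork on the calling thread. If it has not returned within
   * timeoutMs, onTimeout runs on a daemon timer thread.
   *
   * @return true if the abort work finished before the timeout task fired
   */
  public static boolean abortWithTimeout(Runnable abortWork, long timeoutMs,
      Runnable onTimeout) {
    AtomicBoolean fired = new AtomicBoolean(false);
    // Daemon timer, so the watchdog itself never keeps the JVM alive.
    Timer monitor = new Timer("abort-monitor-sketch", true);
    monitor.schedule(new TimerTask() {
      @Override
      public void run() {
        fired.set(true);
        onTimeout.run();
      }
    }, timeoutMs);
    try {
      abortWork.run();
    } finally {
      // Abort returned (or threw): disarm the watchdog if it has not fired.
      monitor.cancel();
    }
    return !fired.get();
  }

  public static void main(String[] args) {
    // A forced exit could be passed as the timeout action, for example
    // () -> Runtime.getRuntime().halt(1); here we only print a message.
    boolean finishedInTime = abortWithTimeout(
        () -> System.out.println("closing regions..."),
        1000L,
        () -> System.out.println("abort timed out"));
    System.out.println("finished in time: " + finishedInTime);
  }
}
```

Injecting the action mirrors the idea of a configurable task: the default would terminate the process, while a deployment could substitute its own behavior.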

      Description

      When testing sync replication, I found that if I transition the remote cluster to DA while the local cluster is still in A, the region server hangs on shutdown. Because the fsOk flag only tests the local cluster (which is reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken (the remote WAL directory is gone), closing the regions never succeeds. This leads to an infinite wait inside waitOnAllRegionsToClose.

      So I think we should put an upper bound on the wait time in the waitOnAllRegionsToClose method.
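To illustrate what an upper bound on such a wait could look like, here is a minimal sketch of a deadline-based polling loop. It is not the HBase implementation: BoundedWaitSketch, waitUntil, and the allClosed supplier are hypothetical stand-ins for the real region-closing check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names come from HBase.
public class BoundedWaitSketch {

  /**
   * Polls allClosed until it returns true or maxWaitMs has elapsed.
   *
   * @return true if the condition was met, false on timeout or interrupt
   */
  public static boolean waitUntil(BooleanSupplier allClosed, long maxWaitMs,
      long pollMs) {
    // Use the monotonic clock so wall-clock adjustments cannot extend the wait.
    long deadline = System.nanoTime() + maxWaitMs * 1_000_000L;
    while (!allClosed.getAsBoolean()) {
      long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
      if (remainingMs <= 0) {
        return false; // upper bound reached; the caller decides what to do next
      }
      try {
        // Never sleep past the deadline.
        Thread.sleep(Math.min(pollMs, remainingMs));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // A condition that never becomes true stands in for the broken WAL case.
    boolean closed = waitUntil(() -> false, 200L, 20L);
    System.out.println("all regions closed: " + closed);
  }
}
```

With a bound in place, a caller that gets false back can escalate instead of waiting forever.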

        Attachments

        1. HBASE-21325.master.001.patch
          6 kB
          Guanghao Zhang
        2. HBASE-21325.master.001.patch
          6 kB
          Guanghao Zhang
        3. HBASE-21325.master.002.patch
          8 kB
          Guanghao Zhang
        4. HBASE-21325.master.003.patch
          8 kB
          Guanghao Zhang
        5. HBASE-21325.master.004.patch
          8 kB
          Guanghao Zhang
        6. HBASE-21325.master.005.patch
          9 kB
          Guanghao Zhang

            People

            • Assignee:
              zghao Guanghao Zhang
              Reporter:
              zhangduo Duo Zhang
