  1. HBase
  2. HBASE-20828 Finish-up AMv2 Design/List of Tenets/Specification of operation
  3. HBASE-21463

The checkOnlineRegionsReport can accidentally complete a TRSP


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.2.0
    • Component/s: amv2
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      On our testing cluster, we observed a race condition:
      1. A regionServerReport request is built.
      2. A TRSP (TransitRegionStateProcedure) is scheduled to reopen the region.
      3. The region is closed on the RS side.
      4. The OpenRegionProcedure is created.
      5. The regionServerReport built in step 1 is processed. It still lists the region as open on the RS, so we update the region state to OPEN.
      6. The OpenRegionProcedure notices that the region is already in the OPEN state, so it gives up and finishes.
      7. The TRSP finishes.
      8. The region is recorded as OPEN on the RS but is not actually open there, and it cannot recover unless we use HBCK2.
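
      The interleaving above can be replayed in a small, single-threaded model. This is only a sketch under stated assumptions: `StaleReportRace`, `RegionNode`, `replay`, and the `skipReportWhileProcedureRuns` flag are hypothetical names, not HBase classes or APIs, and the intermediate states are simplified. The guard shown (ignoring a report for a region that a procedure currently owns) is one possible mitigation for illustration and is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleReportRace {
    enum State { OPEN, CLOSING, CLOSED, OPENING }

    /** Hypothetical, simplified master-side view of one region. */
    static class RegionNode {
        State state = State.OPEN;
        boolean procedureAttached = false; // true while a TRSP owns the region
    }

    /**
     * Replays steps 1-8 in order. Returns true if the master's final view
     * agrees with what the region server actually hosts.
     */
    static boolean replay(boolean skipReportWhileProcedureRuns, List<String> log) {
        RegionNode master = new RegionNode();
        boolean openOnRs = true;

        // Step 1: the report is built while the region is still open on the RS.
        boolean reportSaysOpen = openOnRs;

        // Step 2: a TRSP is scheduled to reopen the region.
        master.procedureAttached = true;
        master.state = State.CLOSING;

        // Step 3: the region is closed on the RS side.
        openOnRs = false;
        master.state = State.CLOSED;

        // Step 4: the OpenRegionProcedure is created.
        master.state = State.OPENING;

        // Step 5: the stale report from step 1 is processed.
        if (reportSaysOpen && master.state != State.OPEN) {
            if (skipReportWhileProcedureRuns && master.procedureAttached) {
                log.add("report ignored: a procedure owns the region");
            } else {
                master.state = State.OPEN; // the accidental update
                log.add("report applied: state forced to OPEN");
            }
        }

        // Step 6: the OpenRegionProcedure gives up if the state is already OPEN.
        if (master.state == State.OPEN) {
            log.add("open skipped: region already OPEN on master");
        } else {
            openOnRs = true;
            master.state = State.OPEN;
            log.add("region opened on RS");
        }

        // Step 7: the TRSP finishes.
        master.procedureAttached = false;

        // Step 8: compare the master's record with reality on the RS.
        log.add("master=" + master.state + ", openOnRs=" + openOnRs);
        return (master.state == State.OPEN) == openOnRs;
    }

    public static void main(String[] args) {
        for (boolean guard : new boolean[] {false, true}) {
            List<String> log = new ArrayList<>();
            boolean consistent = replay(guard, log);
            System.out.println("guard=" + guard + " consistent=" + consistent + " " + log);
        }
    }
}
```

      Without the guard, the replay ends with the master recording OPEN while nothing is hosted on the RS, which is the stuck state described in step 8. With the hypothetical guard, the OpenRegionProcedure still performs the open and the two views agree.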

      Attachments

        1. HBASE-21463-v2.patch
          36 kB
          Duo Zhang
        2. HBASE-21463-v1.patch
          36 kB
          Duo Zhang
        3. HBASE-21463.patch
          22 kB
          Duo Zhang
        4. HBASE-21463-UT.patch
          11 kB
          Duo Zhang

          People

            Assignee: zhangduo Duo Zhang
            Reporter: zhangduo Duo Zhang
            Votes: 0
            Watchers: 7