HBase
  1. HBase
  2. HBASE-10924

[region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

    Details

      Description

      Observed behavior:
      In about 5% of cases, my rolling upgrade tests fail because of stuck regions during a region server unload. My theory is that this occurs when region assignment information changes between the time the region list is generated, and the time when the region is to be moved.

      An example of such a region information change is a split or merge.

      Example:
      Regionserver A has 100 regions (#0-#99). The balancer is turned off and the regionmover script is called to unload this regionserver. The regionmover script will generate the list of 100 regions to be moved and then proceed down that list, moving the regions off in series. However, there is a region, #84, that has split into two daughter regions while regions 0-83 were moved. The script will be stuck trying to move #84, timeout, and then the failure will bubble up (attempt 1 failed).

      Proposed solution:
      This specific failure mode should be caught and the region_mover script should now attempt to move off all the regions. Now, it will have 16+1 (due to split) regions to move. There is a good chance that it will be able to move all 17 off without issues. However, should it encounter this same issue (attempt 2 failed), it will retry again. This process will continue until the maximum number of unload retry attempts has been reached.

      This is not foolproof, but let's say for the sake of argument that 5% of unload attempts hit this issue, then with a retry count of 3, it will reduce the unload failure probability from 0.05 to 0.000125 (0.05^3).

      Next steps:
      I am looking for feedback on this approach. If it seems like a sensible approach, I will create a strawman patch and test it.

      1. HBASE-10924-0.94-v2.patch
        10 kB
        Aleksandr Shulman
      2. HBASE-10924-0.94-v3.patch
        10 kB
        Aleksandr Shulman

        Activity

        Hide
        Jean-Marc Spaggiari added a comment -

        Should we use this oportunity to move that into a java class instead of a rd script?

        Show
        Jean-Marc Spaggiari added a comment - Should we use this oportunity to move that into a java class instead of a rd script?
        Hide
        Aleksandr Shulman added a comment -

        That seems like a good place to put that logic since it'll be easier to maintain.
        As a bonus, we'll have implicit compatibility checks at compile time
        Only concern is that we don't break the shell api, but that shouldn't be difficult to maintain.

        Show
        Aleksandr Shulman added a comment - That seems like a good place to put that logic since it'll be easier to maintain. As a bonus, we'll have implicit compatibility checks at compile time Only concern is that we don't break the shell api, but that shouldn't be difficult to maintain.
        Hide
        Lars Hofhansl added a comment -

        Approach sounds good. We should also have an option to just ignore the failed region and move on (i.e. a best effort option).

        Show
        Lars Hofhansl added a comment - Approach sounds good. We should also have an option to just ignore the failed region and move on (i.e. a best effort option).
        Hide
        Aleksandr Shulman added a comment -

        Attaching v1 of the patch. For 94 only.

        Show
        Aleksandr Shulman added a comment - Attaching v1 of the patch. For 94 only.
        Hide
        Aleksandr Shulman added a comment -

        Attaching a better version of the patch here.
        It's relatively straightforward, but if there is interest in a formal review, I can put it up on RB.

        Testing: I ran this patch through an in-house rolling upgrade test framework. It performs MR jobs, splits, compactions, and DML while regions are moving.

        I also did some explicit testing by installing this on a cluster and moving regions back and forth while doing splits.

        The results were fine for all the testing.

        Show
        Aleksandr Shulman added a comment - Attaching a better version of the patch here. It's relatively straightforward, but if there is interest in a formal review, I can put it up on RB. Testing: I ran this patch through an in-house rolling upgrade test framework. It performs MR jobs, splits, compactions, and DML while regions are moving. I also did some explicit testing by installing this on a cluster and moving regions back and forth while doing splits. The results were fine for all the testing.
        Hide
        Aleksandr Shulman added a comment -

        Adding v3 which includes a sleep and some comments as to the rationale of the fix.

        Show
        Aleksandr Shulman added a comment - Adding v3 which includes a sleep and some comments as to the rationale of the fix.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12640899/HBASE-10924-0.94-v3.patch
        against trunk revision .
        ATTACHMENT ID: 12640899

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9341//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12640899/HBASE-10924-0.94-v3.patch against trunk revision . ATTACHMENT ID: 12640899 +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9341//console This message is automatically generated.
        Hide
        Lars Hofhansl added a comment - - edited

        Is this an 0.94 issue specifically?

        Show
        Lars Hofhansl added a comment - - edited Is this an 0.94 issue specifically?
        Hide
        Aleksandr Shulman added a comment -

        This issue affects all branches. I will upload patches for the other branches as well.

        Show
        Aleksandr Shulman added a comment - This issue affects all branches. I will upload patches for the other branches as well.
        Hide
        Lars Hofhansl added a comment -

        I still think it might be better to let the mover finish all regions it can first. It would be great if it could fail fast on the split region.

        Show
        Lars Hofhansl added a comment - I still think it might be better to let the mover finish all regions it can first. It would be great if it could fail fast on the split region.
        Hide
        Lars Hofhansl added a comment -

        Pushing one more time.

        Show
        Lars Hofhansl added a comment - Pushing one more time.
        Hide
        Aleksandr Shulman added a comment -

        Hmm - I think it's still the intention of the patch to have region_mover do a best-effort move of all the regions, as the script had done before. The main addition is that it will retry that process a configurable number of times, in case of strange transient conditions we've seen, like the master down when the move request is sent.

        Overall, I've seen the region_mover work pretty well and I see this patch as just being a minor stability improvement. If you believe there is a better way to do this region movement, such as failing fast on a split region, I'd be happy to test such a patch in our frameworks.

        If we're happy with the logic of this patch, then I can post a version for 0.96, 0.98, trunk, etc.

        Show
        Aleksandr Shulman added a comment - Hmm - I think it's still the intention of the patch to have region_mover do a best-effort move of all the regions, as the script had done before. The main addition is that it will retry that process a configurable number of times, in case of strange transient conditions we've seen, like the master down when the move request is sent. Overall, I've seen the region_mover work pretty well and I see this patch as just being a minor stability improvement. If you believe there is a better way to do this region movement, such as failing fast on a split region, I'd be happy to test such a patch in our frameworks. If we're happy with the logic of this patch, then I can post a version for 0.96, 0.98, trunk, etc.
        Hide
        Lars Hofhansl added a comment -

        Looked at the patch. The comments in the code do not reflect the description of the issue here. It mentions ZK and Master failures, and then sleeps for 60s.
        The issues we've seen with region_mover were mostly related to it taking too long. This would make this worse, but then again our scenario may be special.

        What I would like is that if a region cannot be moved, it is simply ignored. HBase will eventually take of that region, region_mover should rather continue on to the next region and move as many regions as possible... Again might be special to us.

        Show
        Lars Hofhansl added a comment - Looked at the patch. The comments in the code do not reflect the description of the issue here. It mentions ZK and Master failures, and then sleeps for 60s. The issues we've seen with region_mover were mostly related to it taking too long. This would make this worse, but then again our scenario may be special. What I would like is that if a region cannot be moved, it is simply ignored. HBase will eventually take of that region, region_mover should rather continue on to the next region and move as many regions as possible... Again might be special to us.
        Hide
        Lars Hofhansl added a comment -

        Moved along for too long, seems nobody is really interested in this. Removing from 0.94. We can always bring it back.

        Show
        Lars Hofhansl added a comment - Moved along for too long, seems nobody is really interested in this. Removing from 0.94. We can always bring it back.

          People

          • Assignee:
            Aleksandr Shulman
            Reporter:
            Aleksandr Shulman
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:

              Development