HBase
  1. HBase
  2. HBASE-5970

Improve the AssignmentManager#updateTimer and speed up handling opened event

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.95.0
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We found handing opened event very slow in the environment with lots of regions.

      The problem is the slow AssignmentManager#updateTimer.

      We do the test for bulk assigning 10w (i.e. 100k) regions, the whole process of bulk assigning took 1 hours.

      2012-05-06 20:31:49,201 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) round-robin across 5 server(s)
      2012-05-06 21:26:32,103 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

      I think we could do the improvement for the AssignmentManager#updateTimer: Make a thread do this work.
      After the improvement, it took only 4.5mins

      2012-05-07 11:03:36,581 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) across 5 server(s), retainAssignment=true
      2012-05-07 11:07:57,073 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

      1. HBASE-5970v4.patch
        4 kB
        chunhui shen
      2. HBASE-5970v4.patch
        4 kB
        chunhui shen
      3. 5970v3.patch
        4 kB
        Ted Yu
      4. HBASE-5970v3.patch
        4 kB
        chunhui shen
      5. HBASE-5970v2.patch
        3 kB
        chunhui shen
      6. HBASE-5970.patch
        4 kB
        chunhui shen

        Issue Links

          Activity

          chunhui shen created issue -
          chunhui shen made changes -
          Field Original Value New Value
          Attachment HBASE-5970.patch [ 12526101 ]
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          HBASE-5337-AM.updateTimers() delays the timeout monitor from assigning regions.
          I think i can duplicate it with this defect.
          Few comments
          serversInUpdatingTimer can be changed to updateTimerForServer?
          The order in which the updation should happen matters?

          Threads.setDaemonThreadRunning(timerUpdater.getThread(),
          +        master.getServerName() + ".timerUpdater");
          

          Can we just add this to one protected method. Actually we found that doing like that will help you in mocking the master in case of testing these scenarios?

          Patch looks good.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui HBASE-5337 -AM.updateTimers() delays the timeout monitor from assigning regions. I think i can duplicate it with this defect. Few comments serversInUpdatingTimer can be changed to updateTimerForServer? The order in which the updation should happen matters? Threads.setDaemonThreadRunning(timerUpdater.getThread(), + master.getServerName() + ".timerUpdater" ); Can we just add this to one protected method. Actually we found that doing like that will help you in mocking the master in case of testing these scenarios? Patch looks good.
          ramkrishna.s.vasudevan made changes -
          Link This issue is duplicated by HBASE-5337 [ HBASE-5337 ]
          Hide
          chunhui shen added a comment -

          @ram
          Thanks for the review.
          Let's see what others say, and I'll do the modify together.

          Show
          chunhui shen added a comment - @ram Thanks for the review. Let's see what others say, and I'll do the modify together.
          Hide
          Ted Yu added a comment -
          +          serverToUpdateTimer = serversInUpdatingTimer
          +              .higher(serverToUpdateTimer);
          +        }
          +        if (serverToUpdateTimer == null) {
          +          break;
          

          What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer and later is added back to serversInUpdatingTimer ?
          How would this server be picked up ?

          Show
          Ted Yu added a comment - + serverToUpdateTimer = serversInUpdatingTimer + .higher(serverToUpdateTimer); + } + if (serverToUpdateTimer == null ) { + break ; What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer and later is added back to serversInUpdatingTimer ? How would this server be picked up ?
          Hide
          chunhui shen added a comment -

          What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer and later is added back to serversInUpdatingTimer ?

          If later it is added back to serversInUpdatingTimer, we will update timer again after next chore()

          Show
          chunhui shen added a comment - What would happen if a 'lower' ServerName gets its timers updated, gets removed from serversInUpdatingTimer and later is added back to serversInUpdatingTimer ? If later it is added back to serversInUpdatingTimer, we will update timer again after next chore()
          Hide
          Ted Yu added a comment -

          That's right. But I wonder why we need to wait for the next chore() ?

          Show
          Ted Yu added a comment - That's right. But I wonder why we need to wait for the next chore() ?
          Ted Yu made changes -
          Hadoop Flags Reviewed [ 10343 ]
          Description We found handing opened event very slow in the environment with lots of regions.

          The problem is the slow AssignmentManager#updateTimer.

          We do the test for bulk assigning 10w regions, the whole process of bulk assigning took 1 hours.

          2012-05-06 20:31:49,201 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) round-robin across 5 server(s)
          2012-05-06 21:26:32,103 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

          I think we could do the improvement for the AssignmentManager#updateTimer: Make a thread do this work.
          After the improvement, it took only 4.5mins

          2012-05-07 11:03:36,581 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) across 5 server(s), retainAssignment=true
          2012-05-07 11:07:57,073 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
          We found handing opened event very slow in the environment with lots of regions.

          The problem is the slow AssignmentManager#updateTimer.

          We do the test for bulk assigning 10w (i.e. 100k) regions, the whole process of bulk assigning took 1 hours.

          2012-05-06 20:31:49,201 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) round-robin across 5 server(s)
          2012-05-06 21:26:32,103 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done

          I think we could do the improvement for the AssignmentManager#updateTimer: Make a thread do this work.
          After the improvement, it took only 4.5mins

          2012-05-07 11:03:36,581 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 100000 region(s) across 5 server(s), retainAssignment=true
          2012-05-07 11:07:57,073 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
          Hide
          chunhui shen added a comment -

          I think we don't need to update the timer so active.
          Waiting for the next chore() could reduce much consume if there are lots of opened event, because they may update the same server.

          Show
          chunhui shen added a comment - I think we don't need to update the timer so active. Waiting for the next chore() could reduce much consume if there are lots of opened event, because they may update the same server.
          Hide
          Ted Yu added a comment -

          The above assumption is reasonable.

          Show
          Ted Yu added a comment - The above assumption is reasonable.
          Hide
          Ming Ma added a comment -

          Looks good. Why do we need "true" in "while (true && !stopper.isStopped())"?

          Show
          Ming Ma added a comment - Looks good. Why do we need "true" in "while (true && !stopper.isStopped())"?
          Hide
          chunhui shen added a comment -

          Why do we need "true" in "while (true && !stopper.isStopped())"?

          Sorry, make such a mistake~~~~, correct it in patch v2.

          Thanks for the review

          Show
          chunhui shen added a comment - Why do we need "true" in "while (true && !stopper.isStopped())"? Sorry, make such a mistake~~~~, correct it in patch v2. Thanks for the review
          chunhui shen made changes -
          Attachment HBASE-5970v2.patch [ 12526281 ]
          Hide
          chunhui shen added a comment -

          In patchv3, change

          while (!stopper.isStopped())

          to

          while (!serversInUpdatingTimer.isEmpty() && !stopper.isStopped())
          Show
          chunhui shen added a comment - In patchv3, change while (!stopper.isStopped()) to while (!serversInUpdatingTimer.isEmpty() && !stopper.isStopped())
          chunhui shen made changes -
          Attachment HBASE-5970v3.patch [ 12526715 ]
          Nicolas Liochon made changes -
          Link This issue is related to HBASE-5843 [ HBASE-5843 ]
          Hide
          chunhui shen added a comment -

          @stack
          Could you take a see for this issue.
          In our tests, it could speed up handling opened event if there are lots of regions when cluster start up:
          For bulk assigning 10w (i.e. 100k) regions, reduce the time from 1 hours to 4.5 mins

          Show
          chunhui shen added a comment - @stack Could you take a see for this issue. In our tests, it could speed up handling opened event if there are lots of regions when cluster start up: For bulk assigning 10w (i.e. 100k) regions, reduce the time from 1 hours to 4.5 mins
          Ted Yu made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12526715/HBASE-5970v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526715/HBASE-5970v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1872//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Re-attaching patch v3.

          Show
          Ted Yu added a comment - Re-attaching patch v3.
          Ted Yu made changes -
          Attachment 5970v3.patch [ 12527518 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12527518/5970v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12527518/5970v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1877//console This message is automatically generated.
          Hide
          stack added a comment -

          I like the effect this patch has. You are addressing this comment in updateTimers?

              // This loop could be expensive.
          

          Should this be in the new timer class?

          +  private final ConcurrentSkipListSet<ServerName> serversInUpdatingTimer = 
          +    new ConcurrentSkipListSet<ServerName>();
          

          Is this javadoc right?

          +   * Add the server to serversInUpdatingTimer, and wait {@link TimerUpdater} to
          +   * update timers for all regions in transition going against this server.
          

          Why is doing this work in the background faster? Is it just that it being inline, it takes a noticeable amount of time?

          This is a nice improvement.

          Show
          stack added a comment - I like the effect this patch has. You are addressing this comment in updateTimers? // This loop could be expensive. Should this be in the new timer class? + private final ConcurrentSkipListSet<ServerName> serversInUpdatingTimer = + new ConcurrentSkipListSet<ServerName>(); Is this javadoc right? + * Add the server to serversInUpdatingTimer, and wait {@link TimerUpdater} to + * update timers for all regions in transition going against this server. Why is doing this work in the background faster? Is it just that it being inline, it takes a noticeable amount of time? This is a nice improvement.
          Hide
          chunhui shen added a comment -

          Why is doing this work in the background faster? Is it just that it being inline, it takes a noticeable amount of time?

          AssignmentManager#updateTimers would check each region in RIT, and there are lots of regions in the RIT when startup, so it will took much time(about 30ms if 100k regions in RIT). However, in the current logic, we will do the AssignmentManager#updateTimers for each opened event, causing the whole process of handling all the opened events tooks much much time.

          If we do the updateTimers in the background, we needn't wait it when handling opened event. Also, we wouldn't updateTimers for the same Regionserver in a short time.(In fact, when cluster startup , lots of opened events from the same regionserver at the same time, we needn't do this work many times at the moment)

          Show
          chunhui shen added a comment - Why is doing this work in the background faster? Is it just that it being inline, it takes a noticeable amount of time? AssignmentManager#updateTimers would check each region in RIT, and there are lots of regions in the RIT when startup, so it will took much time(about 30ms if 100k regions in RIT). However, in the current logic, we will do the AssignmentManager#updateTimers for each opened event, causing the whole process of handling all the opened events tooks much much time. If we do the updateTimers in the background, we needn't wait it when handling opened event. Also, we wouldn't updateTimers for the same Regionserver in a short time.(In fact, when cluster startup , lots of opened events from the same regionserver at the same time, we needn't do this work many times at the moment)
          Hide
          chunhui shen added a comment -

          @stack

          You are addressing this comment in updateTimers?

          // This loop could be expensive.

          I keep this loop, but greatly reduce the call times

          Is this javadoc right?

          I'm not sure, could you give a suggestion.

          Thanks for the review

          Show
          chunhui shen added a comment - @stack You are addressing this comment in updateTimers? // This loop could be expensive. I keep this loop, but greatly reduce the call times Is this javadoc right? I'm not sure, could you give a suggestion. Thanks for the review
          Hide
          Nicolas Liochon added a comment -

          Hi,

          Could you share the logs of the tests? I would be interested to have a look at them.
          The javadoc for updateTimers says it's not used for bulk assignment, is there a mix of regions 'bulk assigned' and other regions?
          I see as well in the description that the time was once with 'retainAssignment=true' and once without. Are the results comparable in both cases?

          Thank you!

          Show
          Nicolas Liochon added a comment - Hi, Could you share the logs of the tests? I would be interested to have a look at them. The javadoc for updateTimers says it's not used for bulk assignment, is there a mix of regions 'bulk assigned' and other regions? I see as well in the description that the time was once with 'retainAssignment=true' and once without. Are the results comparable in both cases? Thank you!
          Hide
          chunhui shen added a comment -

          @nkeywal
          In the current bulk assign, we also put plans to AssignmentManager.regionPlans to prevent multi assign.
          So, we would also use updateTimers in the bulk assign.

          Show
          chunhui shen added a comment - @nkeywal In the current bulk assign, we also put plans to AssignmentManager.regionPlans to prevent multi assign. So, we would also use updateTimers in the bulk assign.
          Hide
          stack added a comment -

          What do we need to do to get this patch in? Can you update it for trunk please Chunhui so we can rerun it by hadoopqa.

          On the javadoc suggestion? I'm not sure what you'd write but write something that makes sense (the javadoc as is is nonesensical).

          Show
          stack added a comment - What do we need to do to get this patch in? Can you update it for trunk please Chunhui so we can rerun it by hadoopqa. On the javadoc suggestion? I'm not sure what you'd write but write something that makes sense (the javadoc as is is nonesensical).
          Hide
          stack added a comment -

          Upping priority. This helps w/ MTTR.

          Show
          stack added a comment - Upping priority. This helps w/ MTTR.
          stack made changes -
          Priority Major [ 3 ] Critical [ 2 ]
          Hide
          chunhui shen added a comment -

          Update patch for current trunk, modify the javadoc for the method AssignmentManager#addToServersInUpdatingTimer

          Show
          chunhui shen added a comment - Update patch for current trunk, modify the javadoc for the method AssignmentManager#addToServersInUpdatingTimer
          chunhui shen made changes -
          Attachment HBASE-5970v4.patch [ 12530318 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12530318/HBASE-5970v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.TestAssignmentManager

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530318/HBASE-5970v4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestAssignmentManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2068//console This message is automatically generated.
          Hide
          stack added a comment -

          Chunhui Do you think the failure in AM because of this patch? Should we retry it against hadoopqa?

          I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say.

          Good stuff.

          Show
          stack added a comment - Chunhui Do you think the failure in AM because of this patch? Should we retry it against hadoopqa? I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say. Good stuff.
          Hide
          chunhui shen added a comment -

          Reattaching patch v4

          Show
          chunhui shen added a comment - Reattaching patch v4
          chunhui shen made changes -
          Attachment HBASE-5970v4.patch [ 12530329 ]
          Hide
          chunhui shen added a comment -

          I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say.

          Sorry, Stack, I'm not clear about this, is it related to this issue?

          Show
          chunhui shen added a comment - I was wondering what bound we have on the number of threads when splitting out on the regionserver? It should be low by default I'd say. Sorry, Stack, I'm not clear about this, is it related to this issue?
          Hide
          stack added a comment -

          Sorry, Stack, I'm not clear about this, is it related to this issue?

          It is. I found the answer reviewing the last patch up on rb (answer is 3).

          Thanks

          Show
          stack added a comment - Sorry, Stack, I'm not clear about this, is it related to this issue? It is. I found the answer reviewing the last patch up on rb (answer is 3). Thanks
          Hide
          chunhui shen added a comment -

          I run the failed test case(org.apache.hadoop.hbase.master.TestAssignmentManager) and passed on the local PC.

          Also I think I found the reason why it failed: caused by TestAssignmentManager#testRegionPlanIsUpdatedWhenRegionFailsToOpen

          At the last of testRegionPlanIsUpdatedWhenRegionFailsToOpen, am will handle region with the state RS_ZK_REGION_FAILED_OPEN,

          in the AssignmentManager#handleRegion()

          case RS_ZK_REGION_FAILED_OPEN:
          ...
          this.executorService.submit(new ClosedRegionHandler(master,
                      this, regionState.getRegion()));
          

          so we call a thread to execute ClosedRegionHandler in the background, and it create a node after
          testRegionPlanIsUpdatedWhenRegionFailsToOpen execute

          public void after() throws KeeperException {
              if (this.watcher != null) {
                // Clean up all znodes
                ZKAssign.deleteAllNodes(this.watcher);
                this.watcher.close();
              }
            }

          Hence, it will fail with a probability and nothing with this patch.

          Show
          chunhui shen added a comment - I run the failed test case(org.apache.hadoop.hbase.master.TestAssignmentManager) and passed on the local PC. Also I think I found the reason why it failed: caused by TestAssignmentManager#testRegionPlanIsUpdatedWhenRegionFailsToOpen At the last of testRegionPlanIsUpdatedWhenRegionFailsToOpen, am will handle region with the state RS_ZK_REGION_FAILED_OPEN, in the AssignmentManager#handleRegion() case RS_ZK_REGION_FAILED_OPEN: ... this .executorService.submit( new ClosedRegionHandler(master, this , regionState.getRegion())); so we call a thread to execute ClosedRegionHandler in the background, and it create a node after testRegionPlanIsUpdatedWhenRegionFailsToOpen execute public void after() throws KeeperException { if ( this .watcher != null ) { // Clean up all znodes ZKAssign.deleteAllNodes( this .watcher); this .watcher.close(); } } Hence, it will fail with a probability and nothing with this patch.
          Hide
          stack added a comment -

          Excellent Chunhui. Can you file a new issue w/ what you found on AM (any chance of a patch to fix it?)

          Also, you are correct above when you think my question about 'number of threads' belonged to another issue. Pardon my being confused.

          Let me commit this patch.

          Show
          stack added a comment - Excellent Chunhui. Can you file a new issue w/ what you found on AM (any chance of a patch to fix it?) Also, you are correct above when you think my question about 'number of threads' belonged to another issue. Pardon my being confused. Let me commit this patch.
          stack made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 0.96.0 [ 12320040 ]
          Resolution Fixed [ 1 ]
          Hide
          stack added a comment -

          Committed to trunk. Thanks for the patch Chunhui.

          Show
          stack added a comment - Committed to trunk. Thanks for the patch Chunhui.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12530329/HBASE-5970v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.TestSplitLogManager

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530329/HBASE-5970v4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2071//console This message is automatically generated.
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          As mentioned in that HBASE-5337 now when we start updating the timers at different times if suppose we introduce timeoutmonitor again to a lower value like 5 min then the update timers overall does not help becuase it delays the timeout monitor from acting after 5 mins. Just adding this so that later if we try to bring back the TM then we have to recheck this.
          Nice work Chunhui.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui As mentioned in that HBASE-5337 now when we start updating the timers at different times if suppose we introduce timeoutmonitor again to a lower value like 5 min then the update timers overall does not help becuase it delays the timeout monitor from acting after 5 mins. Just adding this so that later if we try to bring back the TM then we have to recheck this. Nice work Chunhui.
          Hide
          chunhui shen added a comment -

          @ram

          it delays the timeout monitor from acting after 5 mins

          Yes, but we will delays the timeout monitor at most 10s( period is 10s as default), isn't it acceptable?

          this.timerUpdater = new TimerUpdater(conf.getInt(
          +        "hbase.master.assignment.timerupdater.period", 10000), master);
          
          Show
          chunhui shen added a comment - @ram it delays the timeout monitor from acting after 5 mins Yes, but we will delays the timeout monitor at most 10s( period is 10s as default), isn't it acceptable? this .timerUpdater = new TimerUpdater(conf.getInt( + "hbase.master.assignment.timerupdater.period" , 10000), master);
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          What i meant is overall the updatetimers itself affects the TM. The problem is updatetimer just updates all the RITs time with the current time.
          Suppose the TM saw the RIT at time x. Now if the timeout monitor is 5 mins. Exactly after x+5 we expect the TM to reassign. But the updatetimer will update all the regions in RIT. So even this region who should be assinged after x+5 will also get updated to the current time.
          So now the TM will be able to take action only after x+5+Delta. Currently as it is 30 mins we wont be much affected as no one will be waiting for 30 mins.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui What i meant is overall the updatetimers itself affects the TM. The problem is updatetimer just updates all the RITs time with the current time. Suppose the TM saw the RIT at time x. Now if the timeout monitor is 5 mins. Exactly after x+5 we expect the TM to reassign. But the updatetimer will update all the regions in RIT. So even this region who should be assinged after x+5 will also get updated to the current time. So now the TM will be able to take action only after x+5+Delta. Currently as it is 30 mins we wont be much affected as no one will be waiting for 30 mins.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2963 (See https://builds.apache.org/job/HBase-TRUNK/2963/)
          HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2963 (See https://builds.apache.org/job/HBase-TRUNK/2963/ ) HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569) Result = FAILURE stack : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Hide
          chunhui shen added a comment -

          So now the TM will be able to take action only after x+5+Delta

          We use updatetimers to prevent unnecessary multi assign region, it just updates the RITs time as the related server.

          Comparing to previous logic, here Delta is at most 10s.

          Anyway, multi assign is much more serious than delayed assign by TimeoutMonitor,

          And, we should try our best to ensure that there is no case to need use TimeoutMonitor assign region.

          Correct me if wrong, thanks

          Show
          chunhui shen added a comment - So now the TM will be able to take action only after x+5+Delta We use updatetimers to prevent unnecessary multi assign region, it just updates the RITs time as the related server. Comparing to previous logic, here Delta is at most 10s. Anyway, multi assign is much more serious than delayed assign by TimeoutMonitor, And, we should try our best to ensure that there is no case to need use TimeoutMonitor assign region. Correct me if wrong, thanks
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui

          And, we should try our best to ensure that there is no case to need use TimeoutMonitor assign region.

          Yes. This is what we are also trying to achieve. Many of our issues is also targeted on that.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui And, we should try our best to ensure that there is no case to need use TimeoutMonitor assign region. Yes. This is what we are also trying to achieve. Many of our issues is also targeted on that.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #34 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/34/)
          HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #34 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/34/ ) HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event (Revision 1344569) Result = FAILURE stack : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          I have a question here. We tried this patch on 0.94 with 20000 regions and 4 RS. The scenario we tried was to disable and enable a table that had 20000 regions.
          We did not see much improvement. Do you see any specific scenario where we can get improvement? Thanks.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui I have a question here. We tried this patch on 0.94 with 20000 regions and 4 RS. The scenario we tried was to disable and enable a table that had 20000 regions. We did not see much improvement. Do you see any specific scenario where we can get improvement? Thanks.
          Hide
          chunhui shen added a comment -

          @ram
          How much time did you took to enable 20000 regions, if regions is not very much, master handle fast enough, could you test with 100,000 regions?
          In our testing, RS open region fast, but master handle opened event slowly, and find 100,000 regions much much slower than 50,000 regions

          Another reason, recently we found RS open regions very slowly after 0.92 if one table has much regions

          You could see the following code to get the detail

          public RegionOpeningState openRegion(HRegionInfo region, int versionOfOfflineNode){
          ...
          HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
          ...
          
          public HTableDescriptor get(final String tablename){
          ...
          long modtime = getTableInfoModtime(this.fs, this.rootdir, tablename);
          ...
          }
          }
          

          getTableInfoModtime->getTableInfoPath->getTableInfoPath->FSUtils.listStatus()

          if one table has much regions, FSUtils.listStatus() will take much time, and opening region in parallel on the rs will be closed to serially.

          So maybe we should do the improvement for above code.

          Show
          chunhui shen added a comment - @ram How much time did you took to enable 20000 regions, if regions is not very much, master handle fast enough, could you test with 100,000 regions? In our testing, RS open region fast, but master handle opened event slowly, and find 100,000 regions much much slower than 50,000 regions Another reason, recently we found RS open regions very slowly after 0.92 if one table has much regions You could see the following code to get the detail public RegionOpeningState openRegion(HRegionInfo region, int versionOfOfflineNode){ ... HTableDescriptor htd = this .tableDescriptors.get(region.getTableName()); ... public HTableDescriptor get( final String tablename){ ... long modtime = getTableInfoModtime( this .fs, this .rootdir, tablename); ... } } getTableInfoModtime->getTableInfoPath->getTableInfoPath->FSUtils.listStatus() if one table has much regions, FSUtils.listStatus() will take much time, and opening region in parallel on the rs will be closed to serially. So maybe we should do the improvement for above code.
          Hide
          Ted Yu added a comment -

          @Chunhui:
          You can open a new issue for improving the above code.

          Show
          Ted Yu added a comment - @Chunhui: You can open a new issue for improving the above code.
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          I think in trunk some change has been done in the code that you are suggesting by N?
          https://issues.apache.org/jira/browse/HBASE-5998.

          Show
          ramkrishna.s.vasudevan added a comment - @Chunhui I think in trunk some change has been done in the code that you are suggesting by N? https://issues.apache.org/jira/browse/HBASE-5998 .
          Sergey Shelukhin made changes -
          Link This issue relates to HBASE-7038 [ HBASE-7038 ]
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94 #560 (See https://builds.apache.org/job/HBase-0.94/560/)
          HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787)

          Result = SUCCESS
          tedyu :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Show
          Hudson added a comment - Integrated in HBase-0.94 #560 (See https://builds.apache.org/job/HBase-0.94/560/ ) HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787) Result = SUCCESS tedyu : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Hide
          yang ming added a comment -

          chunhui shen
          I have tried this patch on 0.94.2 with 100,000 regions(one empty table) and 4RS.I restarted the cluster,but found handing opened event still very slow.
          I did not see any improvement,can you show us your test environment?
          Thanks.

          Show
          yang ming added a comment - chunhui shen I have tried this patch on 0.94.2 with 100,000 regions(one empty table) and 4RS.I restarted the cluster,but found handing opened event still very slow. I did not see any improvement,can you show us your test environment? Thanks.
          Hide
          yang ming added a comment -

          chunhui shen
          Or does this path depend on other modification? Thanks.

          Show
          yang ming added a comment - chunhui shen Or does this path depend on other modification? Thanks.
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94-security-on-Hadoop-23 #9 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/9/)
          HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Show
          Hudson added a comment - Integrated in HBase-0.94-security-on-Hadoop-23 #9 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/9/ ) HBASE-7038 Port HBASE-5970 Improve the AssignmentManager#updateTimer and speed up handling opened event to 0.94 (Sergey Shelukhin) (Revision 1403787) Result = FAILURE tedyu : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          Hide
          chunhui shen added a comment -

          yangming
          Please see the HBASE-7018, you know why your cluster handle opened event slow.

          In my testing, the cluster doesn't have the problem by HBASE-7018, so its bottleneck is on the master.

          Also, please increase the open region threads on the regionserver.

          If you also have problem, you could email me directly...

          Show
          chunhui shen added a comment - yangming Please see the HBASE-7018 , you know why your cluster handle opened event slow. In my testing, the cluster doesn't have the problem by HBASE-7018 , so its bottleneck is on the master. Also, please increase the open region threads on the regionserver. If you also have problem, you could email me directly...
          Hide
          yang ming added a comment -

          chunhui shen
          ok,thanks very much!
          I will try!

          Show
          yang ming added a comment - chunhui shen ok,thanks very much! I will try!
          stack made changes -
          Fix Version/s 0.95.0 [ 12324094 ]
          Fix Version/s 0.96.0 [ 12320040 ]
          stack made changes -
          Fix Version/s 0.98.0 [ 12323143 ]
          stack made changes -
          Fix Version/s 0.98.0 [ 12323143 ]
          Hide
          stack added a comment -

          Marking closed.

          Show
          stack added a comment - Marking closed.
          stack made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              chunhui shen
              Reporter:
              chunhui shen
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development