HBase / HBASE-10049

Small improvements in region_mover.rb

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.94.15, 0.96.2
    • Component/s: None
    • Labels:
      None

      Description

      We use region_mover.rb for graceful upgrades of our HBase cluster.

      Here are some small improvements:

      a. Remove the table.close() call, because the HTable can be reused.
      b. Add more information to the log messages for region moves.
      c. Add a 20s sleep to the load command to make sure the region server (RS) has finished initializing its RPC server. There is a time gap between the RS startup report and RPC server initialization.

      1. HBASE-10049-0.94-v1.diff
        3 kB
        Liu Shaohui
      2. HBASE-10049-0.94-v2.diff
        3 kB
        Liu Shaohui
      3. HBASE-10049-trunk-v1.diff
        3 kB
        Liu Shaohui

        Activity

        Liu Shaohui added a comment -

        Patch for 0.94

        Ted Yu added a comment -

        + # Do not close the htable. It is cached in $TABLES and

        Should the tables in $TABLES be closed at the end of the region movement?

        stack added a comment -

        +1 on patch. No harm in a close but this is a short-lived script so wouldn't kill myself ensuring it happens.

        Liu Shaohui added a comment -

        Ted Yu

        Updates:
        Close all tables at the end of region_mover.rb
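
The reuse-then-close-once pattern discussed above can be sketched in plain Ruby. This is a minimal illustration, not the patch itself: `FakeTable`, `TABLE_CACHE`, `get_table`, and `close_tables` are hypothetical names standing in for the script's HTable objects and its `$TABLES` cache.

```ruby
# Stand-in for a table handle; the real script caches HTable objects (assumption).
class FakeTable
  attr_reader :name, :closed

  def initialize(name)
    @name = name
    @closed = false
  end

  def close
    @closed = true
  end
end

# Hypothetical cache keyed by table name, playing the role of $TABLES.
TABLE_CACHE = {}

# Return the cached handle, creating it on first use.
# Callers must not close it, because later calls reuse the same object.
def get_table(name)
  TABLE_CACHE[name] ||= FakeTable.new(name)
end

# Close every cached handle once, at the end of the run, then empty the cache.
def close_tables
  TABLE_CACHE.each_value(&:close)
  TABLE_CACHE.clear
end

first = get_table("usertable")
second = get_table("usertable")   # same object as `first`, not a new handle
close_tables                      # single cleanup point at script exit
```

Closing inside the per-region code path would invalidate a handle that later moves still expect to use, which is why the cleanup sits at the end.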

        Liu Shaohui added a comment -

        Patch for 0.94

        Liu Shaohui added a comment -

        Patch for trunk

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12616215/HBASE-10049-trunk-v1.diff
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8021//console

        This message is automatically generated.

        ramkrishna.s.vasudevan added a comment -

        Patch looks good to me. +1.

        Ted Yu added a comment -

        Integrated to trunk.

        Thanks for the patch, Shaohui.

        Thanks for the review, Ram.

        stack added a comment -

        Please add to 0.96 and 0.94 branches also. This is a critical ops script.

        Hudson added a comment -

        FAILURE: Integrated in HBase-0.94-security #348 (See https://builds.apache.org/job/HBase-0.94-security/348/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546417)

        • /hbase/branches/0.94/bin/region_mover.rb
        Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK #4702 (See https://builds.apache.org/job/HBase-TRUNK/4702/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546393)

        • /hbase/trunk/bin/region_mover.rb
        Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94 #1214 (See https://builds.apache.org/job/HBase-0.94/1214/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546417)

        • /hbase/branches/0.94/bin/region_mover.rb
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96 #207 (See https://builds.apache.org/job/hbase-0.96/207/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546412)

        • /hbase/branches/0.96/bin/region_mover.rb
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96-hadoop2 #135 (See https://builds.apache.org/job/hbase-0.96-hadoop2/135/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546412)

        • /hbase/branches/0.96/bin/region_mover.rb
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #855 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/855/)
        HBASE-10049 Small improvments in region_mover.rb (tedyu: rev 1546393)

        • /hbase/trunk/bin/region_mover.rb
        Andrew Purtell added a comment -

        Seems to already be committed, resolving issue.

        Jean-Marc Spaggiari added a comment -

        Sorry to come late to this.
        What is the reason behind the 20-second delay? Just before this delay there is a check that waits for the server to be online, so the delay should not be required?

        Also, adding a log statement in the move method makes the application VERY verbose, doubling the output. Is that really useful?

        I'm working on HBASE-8803, where I might have to remove some of those improvements to keep the output readable, but before doing that I want to understand why they were added so I can take that into consideration.

        Thanks.

        Jean-Marc Spaggiari added a comment -

        Ok. Just saw the comment for "c"

        A 20-second delay per server on a 300-node cluster adds up to almost 2 hours... We should find another way to achieve the same goal.

        I will open a follow-up JIRA.
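
The "almost 2 hours" figure can be checked with a short calculation. The sketch below assumes servers are loaded one at a time, so the fixed sleep is paid once per server; `total_sleep_seconds` is a hypothetical helper, not part of region_mover.rb.

```ruby
# Total time spent sleeping when a fixed delay is paid once per server,
# with servers processed sequentially (assumption).
def total_sleep_seconds(servers, sleep_per_server)
  servers * sleep_per_server
end

total = total_sleep_seconds(300, 20)   # 300 servers * 20 s each
minutes = total / 60
puts format("%d s = %d min = %.2f h", total, minutes, total / 3600.0)
# prints "6000 s = 100 min = 1.67 h"
```

The cost grows linearly with cluster size, which is why a fixed sleep that is harmless on a small cluster becomes noticeable on a large one.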

        Liu Shaohui added a comment -

        Jean-Marc Spaggiari

        What is the reason behind the 20 seconds delay?

        There is a time gap between the RS's startup report to the HMaster and the start of its service threads. We saw some exceptions when moving regions because the RS had not finished starting its service threads. So we added a 20-second delay to make sure the RS has enough time to finish initialization.
        But 20 seconds may not be reasonable, especially for large clusters.

        P.S. For large clusters, we plan to develop a region_mover that can unload/load multiple region servers at the same time.
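
One possible alternative to a fixed sleep is to poll a readiness check with a timeout, so that a server that is ready early does not cost the full delay. The following is only a sketch of that idea, not what the patch does: `wait_until_ready` is a hypothetical helper, and which probe would reliably detect a fully initialized RS is left open here.

```ruby
# Poll `ready` (any callable returning true/false) until it succeeds
# or `timeout` seconds have elapsed. Returns true on success, false on timeout.
def wait_until_ready(ready, timeout: 20.0, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if ready.call
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Illustration only: this probe pretends the server becomes ready on the third check.
attempts = 0
probe = -> { (attempts += 1) >= 3 }
puts wait_until_ready(probe, timeout: 2.0, interval: 0.01)
```

The timeout keeps the worst case no longer than the fixed delay, while the common case returns as soon as the probe passes.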

        Adding a log in the move method make the application VERY verbose, doubling the output. Is that really useful?

        Yes. I think it's very useful for measuring the maximum unavailable time of each region when using region_mover.rb.
        Many other factors and configs affect this time, e.g. hbase.hstore.open.and.close.threads.max.
        Based on this time, we can make further optimizations to reduce the unavailable time during a graceful upgrade.
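
The kind of per-move timing described here can be sketched as follows. This is an illustration under assumptions: `timed` and `log_move` are hypothetical helpers, the block stands in for the real move call, and the elapsed time of that call is used as a rough proxy for the region's unavailable time.

```ruby
# Run a block and return [result, elapsed_seconds] using a monotonic clock.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

# Log one line per move, including how long the move took, and return the duration.
# The block passed in stands in for the actual move operation (assumption).
def log_move(region, from, to)
  _, elapsed = timed { yield }
  puts format("Moved region %s from %s to %s in %.3fs", region, from, to, elapsed)
  elapsed
end

# Simulated moves; `sleep` replaces the real work.
durations = %w[region-a region-b].map do |region|
  log_move(region, "rs1", "rs2") { sleep 0.01 }
end
puts format("Maximum move time: %.3fs", durations.max)
```

Putting the duration on the same line as the "Moved" message keeps the measurement without adding a second log line per region.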

        I don't know if the explanation is clear. Further discussion is welcome. Thanks.

        Jean-Marc Spaggiari added a comment -

        Ps: For large clusters, we plan to dev a region_mover which can unload/load multi regionservers at the same time

        Take a look at HBASE-8803

        For the 20 seconds, I have opened HBASE-10202. I have a few options in mind and will provide a patch as soon as HBASE-8803 is committed.

        For the logging of "Moved ...." and "Moving ..." etc., I will see what we can do to reduce the logging while keeping all the information. Here again, I have a few ideas in mind. I will use HBASE-10202 to submit them too...

        Thanks for your feedback! Much appreciated.


          People

          • Assignee: Liu Shaohui
          • Reporter: Liu Shaohui
          • Votes: 0
          • Watchers: 8
