Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      It may sometimes be advantageous to prevent the number of regions from growing very large. This may happen if the values are large in size even though the number of keyvalues are not large. If the number of regions becomes too large, then it is difficult to accommodate the memstore for each region in memory. In such cases, we either have to flush out memstore to disk or decrease size of each memstore.

        Activity

        Hide
        HBase Review Board added a comment -

        Message from: "Pranav Khaitan" <pranavkhaitan@facebook.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/334/
        -----------------------------------------------------------

        Review request for hbase, Jonathan Gray, Karthik Ranganathan, and Kannan Muthukkaruppan.

        Summary
        -------

        This patch is for capping the number of regions on a region server depending on different use-cases. We add a parameter in the config file which can be used by the user to set this value. The default value is INT_MAX.

        This addresses bug HBASE-2844.
        http://issues.apache.org/jira/browse/HBASE-2844

        Diffs


        trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 964968
        trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 964968
        trunk/src/main/resources/hbase-default.xml 964968
        trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java PRE-CREATION

        Diff: http://review.hbase.org/r/334/diff

        Testing
        -------

        Existing tests passing. Written more specialized tests at HRegionServer level

        Thanks,

        Pranav

        Show
        HBase Review Board added a comment - Message from: "Pranav Khaitan" <pranavkhaitan@facebook.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/334/ ----------------------------------------------------------- Review request for hbase, Jonathan Gray, Karthik Ranganathan, and Kannan Muthukkaruppan. Summary ------- This patch is for capping the number of regions on a region server depending on different use-cases. We add a parameter in the config file which can be used by the user to set this value. The default value is INT_MAX. This addresses bug HBASE-2844 . http://issues.apache.org/jira/browse/HBASE-2844 Diffs trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 964968 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 964968 trunk/src/main/resources/hbase-default.xml 964968 trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java PRE-CREATION Diff: http://review.hbase.org/r/334/diff Testing ------- Existing tests passing. Written more specialized tests at HRegionServer level Thanks, Pranav
        Hide
        HBase Review Board added a comment -

        Message from: stack@duboce.net

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/334/#review678
        -----------------------------------------------------------

        trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java
        <http://review.cloudera.org/r/334/#comment2466>

        Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test?

        I'm fine committing the patch without this test – the code as is looks fine to me.

        • stack
        Show
        HBase Review Board added a comment - Message from: stack@duboce.net ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/334/#review678 ----------------------------------------------------------- trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java < http://review.cloudera.org/r/334/#comment2466 > Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test? I'm fine committing the patch without this test – the code as is looks fine to me. stack
        Hide
        HBase Review Board added a comment -

        Message from: "Pranav Khaitan" <pranavkhaitan@facebook.com>

        On 2010-08-04 21:43:56, stack wrote:

        > trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java, line 49

        > <http://review.cloudera.org/r/334/diff/2/?file=2893#file2893line49>

        >

        > Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test?

        >

        > I'm fine committing the patch without this test – the code as is looks fine to me.

        Stack,

        In the test, splits are being made by inserting lot of data and checking
        frequently for splits. However, it is a time consuming test. I completely
        agree with you that the tests for this are not really necessarily and they
        require a lot of testing time. I was under the impression that any change
        requires a corresponding test and therefore added it.

        • Pranav

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/334/#review678
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Pranav Khaitan" <pranavkhaitan@facebook.com> On 2010-08-04 21:43:56, stack wrote: > trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java, line 49 > < http://review.cloudera.org/r/334/diff/2/?file=2893#file2893line49 > > > Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test? > > I'm fine committing the patch without this test – the code as is looks fine to me. Stack, In the test, splits are being made by inserting lot of data and checking frequently for splits. However, it is a time consuming test. I completely agree with you that the tests for this are not really necessarily and they require a lot of testing time. I was under the impression that any change requires a corresponding test and therefore added it. Pranav ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/334/#review678 -----------------------------------------------------------
        Hide
        stack added a comment -

        Committed. Thanks for the patch Pranav.

        Show
        stack added a comment - Committed. Thanks for the patch Pranav.
        Hide
        stack added a comment -

        Patch applied (minus the indefinite test).

        Show
        stack added a comment - Patch applied (minus the indefinite test).
        Hide
        HBase Review Board added a comment -

        Message from: stack@duboce.net

        On 2010-08-04 21:43:56, stack wrote:

        > trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java, line 49

        > <http://review.cloudera.org/r/334/diff/2/?file=2893#file2893line49>

        >

        > Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test?

        >

        > I'm fine committing the patch without this test – the code as is looks fine to me.

        Pranav Khaitan wrote:

        Stack,

        In the test, splits are being made by inserting lot of data and checking

        frequently for splits. However, it is a time consuming test. I completely

        agree with you that the tests for this are not really necessarily and they

        require a lot of testing time. I was under the impression that any change

        requires a corresponding test and therefore added it.

        Tests are required unless the test does more harm than good (smile). Let me commit w/o this particular test. Thanks Pranav.

        • stack

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/334/#review678
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: stack@duboce.net On 2010-08-04 21:43:56, stack wrote: > trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServer.java, line 49 > < http://review.cloudera.org/r/334/diff/2/?file=2893#file2893line49 > > > Should we even include this test Pranav? I don't see where you are proving that we do not exceed the limit or that the limit works. Also, where are you making the splits in this test? > > I'm fine committing the patch without this test – the code as is looks fine to me. Pranav Khaitan wrote: Stack, In the test, splits are being made by inserting lot of data and checking frequently for splits. However, it is a time consuming test. I completely agree with you that the tests for this are not really necessarily and they require a lot of testing time. I was under the impression that any change requires a corresponding test and therefore added it. Tests are required unless the test does more harm than good (smile). Let me commit w/o this particular test. Thanks Pranav. stack ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/334/#review678 -----------------------------------------------------------
        Hide
        Andrew Purtell added a comment -

        Configuration parameters such as this should be able to be applied per table via a table attribute, like how we do it with split file size threshold.

        Show
        Andrew Purtell added a comment - Configuration parameters such as this should be able to be applied per table via a table attribute, like how we do it with split file size threshold.
        Hide
        stack added a comment -

        Do you think this parameter makes sense on a per-table basis Andrew? The whole idea is to limit regions across the cluster letting them grow rather than split once we hit the limit.

        Show
        stack added a comment - Do you think this parameter makes sense on a per-table basis Andrew? The whole idea is to limit regions across the cluster letting them grow rather than split once we hit the limit.
        Hide
        stack added a comment -

        Reclosing. Opened HBASE-2956 to do what Andrew suggests, a cap on regions per table.

        Show
        stack added a comment - Reclosing. Opened HBASE-2956 to do what Andrew suggests, a cap on regions per table.

          People

          • Assignee:
            Pranav Khaitan
            Reporter:
            Pranav Khaitan
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development