HBase / HBASE-5140

TableInputFormat subclass to allow N number of splits per region during MR jobs

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 0.90.4
    • Fix Version/s: None
    • Component/s: mapreduce
    • Labels:
    • Release Note:
      Used the 0.90 branch for the patch but code looks compatible in trunk as well (with one deprecated method)
    • Tags:
      mapreduce splits tableinputformat

      Description

      In regard to HBASE-5138, I am working on a patch for the TableInputFormat class that overrides getSplits in order to generate N splits per region and/or N splits per job. The idea is to convert the startKey and endKey of each region from byte[] to BigDecimal, take the difference, divide by N, convert back to byte[], and generate splits at the resulting values. Assuming your keys are fully distributed, this should produce splits with nearly the same number of rows each. Any suggestions on this issue are welcome.
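      The arithmetic described above can be sketched as a standalone helper. The ticket proposes BigDecimal; for integral byte[] keys the same idea is expressible with BigInteger. The class and method names below are illustrative, not from the patch:

      ```java
      import java.math.BigInteger;
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;

      public class RegionRangeSplitter {

          // Split the key range [startKey, endKey) into n numerically equal
          // sub-ranges. Keys are treated as unsigned big-endian integers,
          // right-padded with zero bytes to a common width so the arithmetic
          // preserves lexicographic order.
          public static List<byte[]> splitRange(byte[] startKey, byte[] endKey, int n) {
              int width = Math.max(startKey.length, endKey.length);
              BigInteger start = new BigInteger(1, pad(startKey, width));
              BigInteger end = new BigInteger(1, pad(endKey, width));
              BigInteger step = end.subtract(start).divide(BigInteger.valueOf(n));

              List<byte[]> boundaries = new ArrayList<>();
              for (int i = 0; i <= n; i++) {
                  BigInteger b = (i == n) ? end
                          : start.add(step.multiply(BigInteger.valueOf(i)));
                  boundaries.add(toBytes(b, width));
              }
              return boundaries;
          }

          // Right-pad with zero bytes; Arrays.copyOf appends trailing zeros.
          private static byte[] pad(byte[] key, int width) {
              return Arrays.copyOf(key, width);
          }

          // Render a non-negative BigInteger as a fixed-width big-endian byte[].
          private static byte[] toBytes(BigInteger v, int width) {
              byte[] raw = v.toByteArray(); // may carry an extra sign byte
              byte[] out = new byte[width];
              int copy = Math.min(raw.length, width);
              System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
              return out;
          }
      }
      ```

      As the description notes, the equal-rows property only holds if keys are uniformly distributed across the key space; skewed keys yield skewed splits.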


          Activity

          Andrew Purtell added a comment -

          "Why is this deemed irrelevant?"

          No activity since March 2012.

          David Koch added a comment -

          "Stale issue. Reopen if still relevant."

          Why is this deemed irrelevant? Is there new functionality in recent HBase versions which supersedes this class? By the way, in method getMaxByteArrayValue the array value assignment should read:

          bytes[i] = (byte) 0xff;
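          David's fix can be seen in context in a minimal version of such a helper. The method name comes from the comment above; the surrounding class and body are an illustrative reconstruction, not the patch itself:

          ```java
          public class KeyBounds {
              // Illustrative reconstruction of the helper named in the comment
              // above: builds a byte[] of the given length with every byte set
              // to the maximum unsigned value, for use as an artificial upper
              // bound when a region's end key is empty.
              public static byte[] getMaxByteArrayValue(int length) {
                  byte[] bytes = new byte[length];
                  for (int i = 0; i < length; i++) {
                      bytes[i] = (byte) 0xff; // the corrected assignment
                  }
                  return bytes;
              }
          }
          ```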
          
          Andrew Purtell added a comment -

          Stale issue. Reopen if still relevant.

          Rajesh Balamohan added a comment -

          @Josh - Thanks for this patch.

          The for loop within getSplits() generates the splits with the help of generateRegionSplits(). However, the returned List<InputSplit> is never added back to "List<InputSplit> splits = new ArrayList<InputSplit>(keys.getFirst().length);".
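          The bug Rajesh describes would be fixed by accumulating each region's sub-splits into the outer list. A minimal stand-alone illustration (the method names mirror the patch, but the types here are simplified placeholders rather than Hadoop's InputSplit):

          ```java
          import java.util.ArrayList;
          import java.util.List;

          public class SplitAccumulation {
              // Stand-in for one region's generated sub-splits; in the patch
              // this would be generateRegionSplits(...) returning List<InputSplit>.
              static List<String> generateRegionSplits(String region, int n) {
                  List<String> subSplits = new ArrayList<>();
                  for (int i = 0; i < n; i++) {
                      subSplits.add(region + "-part" + i);
                  }
                  return subSplits;
              }

              static List<String> getSplits(List<String> regions, int splitsPerRegion) {
                  List<String> splits = new ArrayList<>(regions.size());
                  for (String region : regions) {
                      // The reported bug: calling generateRegionSplits(...) and
                      // discarding its result. The fix is to add it back:
                      splits.addAll(generateRegionSplits(region, splitsPerRegion));
                  }
                  return splits;
              }
          }
          ```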

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510012/Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch.1
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -151 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/713//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/713//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/713//console

          This message is automatically generated.

          Josh Wymer added a comment -

          Added check for null result in scan for first row and closed the scanner in a finally.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510006/Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -151 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/712//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/712//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/712//console

          This message is automatically generated.

          Josh Wymer added a comment -

          Another try at the patch with a unit test included and a method refactored. Used hbase trunk to build this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12509974/Added_functionality_to_split_n_times_per_region_on_mapreduce_jobs.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/711//console

          This message is automatically generated.

          Josh Wymer added a comment -

          This change introduces two new properties: hbase.mapreduce.splitsPerRegion and hbase.mapreduce.splitKeyBytePrecision.

          Setting hbase.mapreduce.splitsPerRegion to anything > 1 will result in each region being split into that number of splits. If nothing is passed or 1 is passed, the TableInputFormat will execute as usual (one split per region).

          The splitKeyBytePrecision determines the byte length (64 by default) to use when generating a max value byte array in the case that the region's end key is of zero length (e.g. the region that contains the last row). This is required to try and "guess" at split distributions for that region. If keys are fully distributed, this should result in fairly equal splits.

          The Bytes.split utility is used to split the range between the start and end keys n number of times.

          Jean-Daniel Cryans added a comment -

          The best practice is to presplit when creating the table.

          I think this jira is valid for cases where the regions are so big (GBs) that one would benefit from having multiple mappers per region.

          Ted Yu added a comment -

          MAPREDUCE-1220, referenced in HBASE-4063, has been resolved against hadoop 0.23.
          So we cannot use it at the moment.

          @Josh:
          I believe the single region scenario is the degenerate case.
          Using max value for long should be fine for that case.
          The best practice is to presplit when creating the table.

          Ming Ma added a comment -

          Is it the same as HBASE-4063?

          Josh Wymer added a comment -

          Correct, but for example on a table with one region, getStartEndKeys() returns two empty byte[]. The last region (or only region) of a table returns an empty byte[] as the end key, allowing the scan to run to the end of the table. Therefore, we don't know the upper-bound byte[] to use to determine the long (or int, etc.) value we want to use for split calculations. So we must either have an efficient way to get the last key in this case, or arbitrarily set the long to its max value (since nothing could be higher) and use that number to make the calculations. This obviously won't work for unbounded data types like BigDecimal and is a partial solution at best.

          Ted Yu added a comment - edited

          I suggest utilizing this method in HTable:

            public Pair<byte[][],byte[][]> getStartEndKeys() throws IOException {
          

          i.e. start and end keys returned by the above method are passed to the splitter interface.

          Josh Wymer added a comment -

          One glaring issue is the lack of start & end keys for one region tables. To get the start key we could do a quick scan of the first row and get the key. For the last region of a table, I'm not sure how we'll handle determining the end key other than setting it to the max size of whatever data type (e.g. long) we are using for the split calculations. Any suggestions other than this?

          Josh Wymer added a comment -

          We also talked about other methods such as using the first 8 bytes of the keys and converting to a long. This could indeed be solved by an interface.

          Ted Yu added a comment -

          We should consider the amount of computing involved in the map/reduce tasks.
          The assumption expressed in the description may not be satisfied in various scenarios.

          I think we can provide abstraction over key space partitioning by introducing an interface.
          The BigDecimal idea would be one implementation.
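          Ted's suggested abstraction might look like the following. The names are hypothetical (no such interface exists in the cited patch); the implementation uses the "first 8 bytes as a long" idea mentioned earlier in the thread, while the BigDecimal approach from the description would be another implementation of the same interface:

          ```java
          import java.util.ArrayList;
          import java.util.List;

          public class KeySpacePartitioning {

              // Hypothetical abstraction over key-space partitioning: given a
              // region's start/end keys, produce n split boundaries.
              interface KeySpaceSplitter {
                  List<byte[]> split(byte[] startKey, byte[] endKey, int n);
              }

              static class LongPrefixSplitter implements KeySpaceSplitter {
                  @Override
                  public List<byte[]> split(byte[] startKey, byte[] endKey, int n) {
                      long start = prefixAsLong(startKey);
                      long end = prefixAsLong(endKey);
                      long step = (end - start) / n;
                      List<byte[]> boundaries = new ArrayList<>();
                      for (int i = 0; i <= n; i++) {
                          boundaries.add(longToBytes(i == n ? end : start + step * i));
                      }
                      return boundaries;
                  }

                  // Interpret the first 8 bytes of a key as a big-endian long,
                  // zero-padding short keys on the right. (This sketch treats the
                  // long as signed; key prefixes >= 0x80 would need unsigned
                  // handling, e.g. Long.divideUnsigned.)
                  private static long prefixAsLong(byte[] key) {
                      long v = 0;
                      for (int i = 0; i < 8; i++) {
                          v = (v << 8) | (i < key.length ? key[i] & 0xffL : 0L);
                      }
                      return v;
                  }

                  private static byte[] longToBytes(long v) {
                      byte[] out = new byte[8];
                      for (int i = 7; i >= 0; i--) {
                          out[i] = (byte) (v & 0xff);
                          v >>>= 8;
                      }
                      return out;
                  }
              }
          }
          ```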


            People

            • Assignee: Unassigned
            • Reporter: Josh Wymer
            • Votes: 1
            • Watchers: 4


                Time Tracking

                • Original Estimate: 72h
                • Remaining Estimate: 72h
                • Time Spent: Not Specified
