HBASE-9931

Optional setBatch for CopyTable to copy large rows in batches

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.96.1, 0.94.15
    • Component/s: mapreduce
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      We've had CopyTable jobs fail because a small number of rows are too wide to fit into memory. If we could specify the batch size for CopyTable scans, that should break those large rows up across multiple iterations and save heap.
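
      To make the idea concrete, here is a minimal sketch of what a batch limit does to one wide row. It is a plain-Java simulation rather than HBase code: `RowBatchSketch` and `splitRow` are hypothetical names, and treating a non-positive batch size as "no limit" is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only: no HBase classes are used here.
final class RowBatchSketch {
    /** Splits the cells of one row into chunks of at most batchSize cells. */
    static <T> List<List<T>> splitRow(List<T> cells, int batchSize) {
        List<List<T>> chunks = new ArrayList<>();
        if (batchSize <= 0) {
            // Assumption of this sketch: non-positive means "no limit",
            // so the whole row comes back in a single chunk.
            chunks.add(new ArrayList<>(cells));
            return chunks;
        }
        for (int start = 0; start < cells.size(); start += batchSize) {
            int end = Math.min(start + batchSize, cells.size());
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A "wide" row with ten cells, read four cells at a time.
        List<String> wideRow = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            wideRow.add("col" + i);
        }
        for (List<String> chunk : splitRow(wideRow, 4)) {
            System.out.println(chunk.size() + " cells: " + chunk);
        }
    }
}
```

      With ten cells and a batch size of 4, the row arrives as three partial results of 4, 4, and 2 cells, so the memory needed per call is bounded by the batch size rather than by the row width.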

      1. HBASE-9931.01.patch (1 kB, Nick Dimiduk)
      2. HBASE-9931.00.patch (1 kB, Nick Dimiduk)

        Activity

        Nick Dimiduk added a comment -

        You know for sure the issue is in the scans? Are you sure this isn't a dupe of HBASE-7743?

        Dave Latham added a comment -

        Yeah - it's in the scans. CopyTable doesn't use reducers at all.

        stack added a comment -

        Is this just a copytable config then Dave Latham?

        Dave Latham added a comment -

        I think a new config option to call Scan.setBatch would do it. I'm always a bit puzzled about how scanner batching and caching interact with mixed length rows, but I imagine it would work out ok. (4 years later, and still using a version where HBASE-1996 didn't make it.)

        Nick Dimiduk added a comment -

        I see TableInputFormat already supports overriding many scanner properties via config. Would it be sufficient to add another configuration point, hbase.mapreduce.scan.batchsize perhaps?
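
        As a rough illustration of such a configuration point, the sketch below reads an optional batch-size property and applies it only when it is set. It is not the attached patch: a plain `Map` stands in for the Hadoop `Configuration`, and `ScanConfigSketch`, `ScanSettings`, and `fromConf` are hypothetical names. The default of -1 for "no per-call limit" is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map stands in for the Hadoop Configuration object, and
// ScanSettings is a hypothetical stand-in for an HBase Scan.
final class ScanConfigSketch {
    // Property name as proposed in the comment above.
    static final String SCAN_BATCHSIZE = "hbase.mapreduce.scan.batchsize";

    static final class ScanSettings {
        int batch = -1; // assumption: -1 means "no per-call limit"
    }

    /** Applies the batch-size override only when the property is present. */
    static ScanSettings fromConf(Map<String, String> conf) {
        ScanSettings scan = new ScanSettings();
        String value = conf.get(SCAN_BATCHSIZE);
        if (value != null) {
            scan.batch = Integer.parseInt(value.trim());
        }
        return scan;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("unset: batch=" + fromConf(conf).batch);
        conf.put(SCAN_BATCHSIZE, "100");
        System.out.println("set:   batch=" + fromConf(conf).batch);
    }
}
```

        Guarding on presence keeps the scan's default behaviour for jobs that never set the property, so the option stays opt-in.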

        Nick Dimiduk added a comment -

        Dave Latham will this patch work for you?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12615157/HBASE-9931.00.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7964//console

        This message is automatically generated.

        Dave Latham added a comment -

        The patch adds setBatch and removes setCaching. I think setCaching should stay in there too - probably a copy/replace bug?

        Nick Dimiduk added a comment -

        Yes, you're right. Take 2.

        Dave Latham added a comment -

        Looks good, Nick. +1

        Would love to see it hit 0.96 and 0.94 as well.

        Nick Dimiduk added a comment -

        Lars Hofhansl, stack, Andrew Purtell Any objections on this one?

        Andrew Purtell added a comment -

        +1

        Lars Hofhansl added a comment -

        +1

        Nick Dimiduk added a comment -

        Patch applied to trunk, 0.96, and 0.94 branches. Thanks for the report, Dave, and for the reviews, Andrew and Lars.

        stack added a comment -

        +1

        Thanks for adding to 0.96.

        stack added a comment -

        Released in 0.96.1. Issue closed.


          People

          • Assignee: Nick Dimiduk
          • Reporter: Dave Latham