Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-9807

block encoder unnecessarily copies the key for each reseek

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.94.13, 0.96.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In HFileReaderV2.AbstractScannerV2.reseekTo(...) we have this:

              ByteBuffer bb = getKey();
              compared = reader.getComparator().compare(key, offset,
                  length, bb.array(), bb.arrayOffset(), bb.limit());
      

      getKey() creates two ByteBuffers in ScannerV2 and makes a deep copy of the key in EncodedScannerV2.

      1. 9807-trunk-v1.txt
        5 kB
        Lars Hofhansl
      2. 9807-0.94-v3.txt
        4 kB
        Lars Hofhansl
      3. 9807-0.94-v2.txt
        4 kB
        Lars Hofhansl
      4. 9807-0.94.txt
        6 kB
        Lars Hofhansl

        Activity

        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94-security #326 (See https://builds.apache.org/job/HBase-0.94-security/326/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535781)

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in HBase-0.94-security #326 (See https://builds.apache.org/job/HBase-0.94-security/326/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535781) /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #811 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/811/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535779)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/trunk/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #811 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/811/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535779) /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/trunk/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96-hadoop2 #102 (See https://builds.apache.org/job/hbase-0.96-hadoop2/102/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535782)

        • /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/branches/0.96/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java
        • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in hbase-0.96-hadoop2 #102 (See https://builds.apache.org/job/hbase-0.96-hadoop2/102/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535782) /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/branches/0.96/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK #4647 (See https://builds.apache.org/job/HBase-TRUNK/4647/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535779)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/trunk/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in HBase-TRUNK #4647 (See https://builds.apache.org/job/HBase-TRUNK/4647/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535779) /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/trunk/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in hbase-0.96 #161 (See https://builds.apache.org/job/hbase-0.96/161/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535782)

        • /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/branches/0.96/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java
        • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in hbase-0.96 #161 (See https://builds.apache.org/job/hbase-0.96/161/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535782) /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/branches/0.96/hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94 #1184 (See https://builds.apache.org/job/HBase-0.94/1184/)
        HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535781)

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in HBase-0.94 #1184 (See https://builds.apache.org/job/HBase-0.94/1184/ ) HBASE-9807 block encoder unnecessarily copies the key for each reseek (larsh: rev 1535781) /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        Hide
        mcorgan Matt Corgan added a comment -

        Sorry i didn't have any comments there. For prefix-tree to benefit, we have to use the CellComparator, and pass Cell references up through these interfaces.

        Show
        mcorgan Matt Corgan added a comment - Sorry i didn't have any comments there. For prefix-tree to benefit, we have to use the CellComparator, and pass Cell references up through these interfaces.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Committed to all branches. Thanks for taking a look, Stack.

        Show
        lhofhansl Lars Hofhansl added a comment - Committed to all branches. Thanks for taking a look, Stack.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Nope. Does not fail with the patch. Also saw this over in HBASE-9272.

        Will commit tomorrow unless I head objections.

        Show
        lhofhansl Lars Hofhansl added a comment - Nope. Does not fail with the patch. Also saw this over in HBASE-9272 . Will commit tomorrow unless I head objections.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Yeah. Looks suspicious, will have a look.

        Show
        lhofhansl Lars Hofhansl added a comment - Yeah. Looks suspicious, will have a look.
        Hide
        stack stack added a comment -

        lgtm

        The test failure yours?

        Show
        stack stack added a comment - lgtm The test failure yours?
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12609340/9807-trunk-v1.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.coprocessor.TestRegionObserverScannerOpenHook

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12609340/9807-trunk-v1.txt against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 site . The patch appears to cause mvn site goal to fail. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRegionObserverScannerOpenHook Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7590//console This message is automatically generated.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Matt Corgan, wanna have a look? Can you think of a way of avoiding this copy? I guess we could partial compares, since we know the scanner won't progress throughout this call.

        Show
        lhofhansl Lars Hofhansl added a comment - Matt Corgan , wanna have a look? Can you think of a way of avoiding this copy? I guess we could partial compares, since we know the scanner won't progress throughout this call.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        And a patch for trunk. Note that prefix tree encoding cannot be optimized this way. So this patch won't help there.

        Show
        lhofhansl Lars Hofhansl added a comment - And a patch for trunk. Note that prefix tree encoding cannot be optimized this way. So this patch won't help there.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Somehow throws IOException sneaked in there. Not needed.

        Show
        lhofhansl Lars Hofhansl added a comment - Somehow throws IOException sneaked in there. Not needed.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Simplified patch. This is only needed in AbstractScannerV2 hierarchy.

        Show
        lhofhansl Lars Hofhansl added a comment - Simplified patch. This is only needed in AbstractScannerV2 hierarchy.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        This is just the default implementation, overridden in ScannerV2 and EncodedScannerV2. I don't think I even need this. This won't be called on ScannerV1 (in 0.94) and could just throw UnsupportedOperationException.

        Show
        lhofhansl Lars Hofhansl added a comment - This is just the default implementation, overridden in ScannerV2 and EncodedScannerV2. I don't think I even need this. This won't be called on ScannerV1 (in 0.94) and could just throw UnsupportedOperationException.
        Hide
        stack stack added a comment -

        You have this still

        + ByteBuffer bb = getKey();
        + return reader.getComparator().compare(key, offset,
        + length, bb.array(), bb.arrayOffset(), bb.limit());

        Does that mean we are still doing deep copy, what you would avoid?

        Good on you Lars.

        Show
        stack stack added a comment - You have this still + ByteBuffer bb = getKey(); + return reader.getComparator().compare(key, offset, + length, bb.array(), bb.arrayOffset(), bb.limit()); Does that mean we are still doing deep copy, what you would avoid? Good on you Lars.
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Should be a boon for Phoenix, which typically has long row keys and uses block encoding by default (James Taylor).

        Show
        lhofhansl Lars Hofhansl added a comment - Should be a boon for Phoenix, which typically has long row keys and uses block encoding by default ( James Taylor ).
        Hide
        lhofhansl Lars Hofhansl added a comment -

        Proposed patch. Pushes the compare into the scanner/seeker.

        • in ScannerV2 that removes creation of the two ByteBuffer objects per reseek. No real performance gain from that.
        • in EncodedScannerV2 this avoid copying the entire key, just to compare it. For 8 byte row keys and 10 bytes values this lead to a 10% performance gain in scanning.
        Show
        lhofhansl Lars Hofhansl added a comment - Proposed patch. Pushes the compare into the scanner/seeker. in ScannerV2 that removes creation of the two ByteBuffer objects per reseek. No real performance gain from that. in EncodedScannerV2 this avoid copying the entire key, just to compare it. For 8 byte row keys and 10 bytes values this lead to a 10% performance gain in scanning.

          People

          • Assignee:
            lhofhansl Lars Hofhansl
            Reporter:
            lhofhansl Lars Hofhansl
          • Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development