HBase
  1. HBase
  2. HBASE-5915

Improve performance of KeyValue comparison

    Details

    • Type: Improvement Improvement
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 0.94.1
    • Fix Version/s: None
    • Component/s: Performance
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Got bored tonight and poked around at making KeyValue comparison faster. I was able to get a ~23% speedup on a micro-benchmark I wrote by using Unsafe in a few more places. This should make some difference for real workloads, since this is usually one of the top CPU consumers.

      1. hbase-5915.txt
        13 kB
        Todd Lipcon
      2. 5915-v2.txt
        12 kB
        Ted Yu

        Issue Links

          Activity

          Hide
          Todd Lipcon added a comment -

          Here's the patch and micro-benchmark.

          Would be curious if people can poke holes in it (or show with a more realistic workload if it makes a difference)

          Show
          Todd Lipcon added a comment - Here's the patch and micro-benchmark. Would be curious if people can poke holes in it (or show with a more realistic workload if it makes a difference)
          Hide
          Ted Yu added a comment -

          The following variable doesn't seem to be used:

          +        int triggerBoundsCheck = buf[off + SIZEOF_LONG - 1];
          

          What's its purpose ?

          Show
          Ted Yu added a comment - The following variable doesn't seem to be used: + int triggerBoundsCheck = buf[off + SIZEOF_LONG - 1]; What's its purpose ?
          Hide
          stack added a comment -

          It looks great Todd.

          This is funny (when wrong-endian) --> + return Long.reverseBytes(result);

          Nice --> + final int minWords = minLength >>> 3; // / SIZEOF_LONG;.... that should be faster.

          Show
          stack added a comment - It looks great Todd. This is funny (when wrong-endian) --> + return Long.reverseBytes(result); Nice --> + final int minWords = minLength >>> 3; // / SIZEOF_LONG;.... that should be faster.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525264/hbase-5915.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestCheckTestClasses

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525264/hbase-5915.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestCheckTestClasses Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1726//console This message is automatically generated.
          Hide
          ramkrishna.s.vasudevan added a comment -
          +              return lessThanUnsigned(Long.reverseBytes(lw),
          +                  Long.reverseBytes(rw)) ? -1 : 1;
          

          Patch v2 is different from V1. Was it intended Ted?

          Show
          ramkrishna.s.vasudevan added a comment - + return lessThanUnsigned( Long .reverseBytes(lw), + Long .reverseBytes(rw)) ? -1 : 1; Patch v2 is different from V1. Was it intended Ted?
          Hide
          Ted Yu added a comment -

          TestKVComparisonPerformance without the patch:

          Run 99 took 99574us
          
          Avg: 98655us
          

          Here is the output with the patch:

          Run 99 took 79849us
          
          Avg: 79295us
          
          Show
          Ted Yu added a comment - TestKVComparisonPerformance without the patch: Run 99 took 99574us Avg: 98655us Here is the output with the patch: Run 99 took 79849us Avg: 79295us
          Hide
          Ted Yu added a comment -

          @Ram:
          The quoted snippet came from Todd.
          Can you elaborate what problem you saw ?

          Show
          Ted Yu added a comment - @Ram: The quoted snippet came from Todd. Can you elaborate what problem you saw ?
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Ted
          Sorry Ted. I think its fine. Once i applied the patch it looked fine. Sorry for the confusion.

          Show
          ramkrishna.s.vasudevan added a comment - @Ted Sorry Ted. I think its fine. Once i applied the patch it looked fine. Sorry for the confusion.
          Hide
          Todd Lipcon added a comment -

          Before we commit this, we should run some actual workloads. There are two funky things I noticed about the patch:
          1) the test case always has the behavior that the first KV to be compared is in cache while the second one is not, which is not quite realistic in a real workload. Maybe it's better to time insertion of a few million KVs into a KeyValueSkipListSet instead.
          2) when I tried this patch with a YCSB workload, it actually took more CPU seconds than without the patch, which is odd given the microbenchmark. So something isn't aligning with reality. I didn't have time to investigate that.

          Show
          Todd Lipcon added a comment - Before we commit this, we should run some actual workloads. There are two funky things I noticed about the patch: 1) the test case always has the behavior that the first KV to be compared is in cache while the second one is not, which is not quite realistic in a real workload. Maybe it's better to time insertion of a few million KVs into a KeyValueSkipListSet instead. 2) when I tried this patch with a YCSB workload, it actually took more CPU seconds than without the patch, which is odd given the microbenchmark. So something isn't aligning with reality. I didn't have time to investigate that.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525313/5915-v2.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
          org.apache.hadoop.hbase.regionserver.TestSplitLogWorker

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525313/5915-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster org.apache.hadoop.hbase.regionserver.TestSplitLogWorker Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1728//console This message is automatically generated.
          Hide
          Lars Hofhansl added a comment -

          Patch looks good. Changing minWords * SIZEOF_LONG to minWords << 3, makes it less readable, and probably has no discernible difference (the compiler should be doing something like this anyway - if SIZEOF_LONG is final).

          "LexicographicalComparerHolder" is not the right name anymore.

          I'll try with some test workload next week. Linux is littleEndian, right? So the bytes would always need to be reversed. Maybe that explains that it is sometimes slower?

          Show
          Lars Hofhansl added a comment - Patch looks good. Changing minWords * SIZEOF_LONG to minWords << 3, makes it less readable, and probably has no discernible difference (the compiler should be doing something like this anyway - if SIZEOF_LONG is final). "LexicographicalComparerHolder" is not the right name anymore. I'll try with some test workload next week. Linux is littleEndian, right? So the bytes would always need to be reversed. Maybe that explains that it is sometimes slower?
          Hide
          Lars Hofhansl added a comment -

          I meant X86 is little endian (not Linux of course)

          Show
          Lars Hofhansl added a comment - I meant X86 is little endian (not Linux of course)

            People

            • Assignee:
              Todd Lipcon
              Reporter:
              Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:

                Development