HBase
  1. HBase
  2. HBASE-5625

Avoid byte buffer allocations when reading a value from a Result object

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.1
    • Fix Version/s: 0.95.0
    • Component/s: Client
    • Labels:
    • Hadoop Flags:
      Reviewed

      Description

      When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

      These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

      The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch.

      Running tests with a profiler, the reduction of read time seems to be of up to 40%.

      1. 5625.txt
        15 kB
        Tudor Scurtu
      2. 5625v2.txt
        15 kB
        Tudor Scurtu
      3. 5625v3.txt
        23 kB
        Tudor Scurtu
      4. 5625v4.txt
        25 kB
        Tudor Scurtu
      5. 5625v5.txt
        25 kB
        Tudor Scurtu
      6. 5625v6.txt
        34 kB
        Tudor Scurtu
      7. 5625v7.txt
        37 kB
        Tudor Scurtu
      8. 5625v8.txt
        38 kB
        Tudor Scurtu

        Activity

        Hide
        Tudor Scurtu added a comment -

        The current patch introduces a method (Result.loadValue()) that uses pre-allocated buffers. Momentarily the new methods duplicate code from the original ones so as not to impact the current architecture.

        Show
        Tudor Scurtu added a comment - The current patch introduces a method (Result.loadValue()) that uses pre-allocated buffers. Momentarily the new methods duplicate code from the original ones so as not to impact the current architecture.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12519598/5625.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1274//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12519598/5625.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1274//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        @Tudor:
        Can you submit a patch for trunk?
        Thanks

        Show
        Ted Yu added a comment - @Tudor: Can you submit a patch for trunk? Thanks
        Hide
        Tudor Scurtu added a comment -

        Correct trunk diff.

        Show
        Tudor Scurtu added a comment - Correct trunk diff.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12519634/5625v2.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12519634/5625v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1275//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        I don't see where loadValue() is called.

        Minor comments:

        +    long longkeylength = KEY_INFRASTRUCTURE_SIZE + rlength + flength + qlength;
        

        we normally use variable names such as longKeyLength where the second and subsequent words have leading capital letter.

        +    if(flength != 0) {
        ...
        +    if(qlength != 0) {
        

        Please insert space between if and (.

        +    KeyValue searchTerm = getSearchTerm(kvs[0].getRow(), family, foffset, flength, qualifier, qoffset, qlength);
        

        Recent vote is to limit ling length within 100 characters.

        +    if (lLength > Integer.MAX_VALUE)
        +      throw new IllegalArgumentException("KeyValue length " + lLength + " > " + Integer.MAX_VALUE);
        

        Please enclose the throw statement in curly braces.

        +   * The KeyValue for the most recent for a given column. If the column does
        +   * not exist in the result set - if it wasn't selected in the query (Get/Scan)
        +   * or just does not exist in the row the return value is null.
        

        I think the above can be phrased as:

        +   * The most recent KeyValue for a given column. If the column does
        +   * not exist in the result set - if it wasn't selected in the query (Get/Scan)
        +   * or just does not exist in the row - the return value is null.
        
        Show
        Ted Yu added a comment - I don't see where loadValue() is called. Minor comments: + long longkeylength = KEY_INFRASTRUCTURE_SIZE + rlength + flength + qlength; we normally use variable names such as longKeyLength where the second and subsequent words have leading capital letter. + if (flength != 0) { ... + if (qlength != 0) { Please insert space between if and (. + KeyValue searchTerm = getSearchTerm(kvs[0].getRow(), family, foffset, flength, qualifier, qoffset, qlength); Recent vote is to limit ling length within 100 characters. + if (lLength > Integer .MAX_VALUE) + throw new IllegalArgumentException( "KeyValue length " + lLength + " > " + Integer .MAX_VALUE); Please enclose the throw statement in curly braces. + * The KeyValue for the most recent for a given column. If the column does + * not exist in the result set - if it wasn't selected in the query (Get/Scan) + * or just does not exist in the row the return value is null . I think the above can be phrased as: + * The most recent KeyValue for a given column. If the column does + * not exist in the result set - if it wasn't selected in the query (Get/Scan) + * or just does not exist in the row - the return value is null .
        Hide
        stack added a comment -

        How do I know the provided buffer big enough?

        +  public KeyValue(byte [] buffer,
        

        What can we do to ensure that the checks in createByteArray and writeByteArray, your new method, are shared rather than duplicated?

        Put all of the below on one line or add curlies:

        +    if (!Bytes.equals(family, foffset, flength, this.bytes, o, fl))
        +      return false;
        

        Ditto for other similar formattings

        Do we need both containsNonEmptyColumn and containsEmptyColumn?

        Should loadValue be in KeyValue?

        I like createFirstOnRow as method name instead of getSearchTerm. The former describes what is happening?

        We should use your new binarySearch instead of what we have now?

        Thanks for looking into this stuff.

        What made you start digging here? You have big Result objects with lots of kvs?

        Show
        stack added a comment - How do I know the provided buffer big enough? + public KeyValue( byte [] buffer, What can we do to ensure that the checks in createByteArray and writeByteArray, your new method, are shared rather than duplicated? Put all of the below on one line or add curlies: + if (!Bytes.equals(family, foffset, flength, this .bytes, o, fl)) + return false ; Ditto for other similar formattings Do we need both containsNonEmptyColumn and containsEmptyColumn? Should loadValue be in KeyValue? I like createFirstOnRow as method name instead of getSearchTerm. The former describes what is happening? We should use your new binarySearch instead of what we have now? Thanks for looking into this stuff. What made you start digging here? You have big Result objects with lots of kvs?
        Hide
        Tudor Scurtu added a comment -

        @Zhihong:
        I initially submitted a minimalistic version so I wouldn't have to make extensive modifications after reviews. Now the method is called from a unit test class.

        The variable names were copied from existing methods and I wanted them to have a uniform naming scheme. I renamed the variables, but not the method parameters.

        The same for the method comment.

        Implemented code style comments.

        @stack:
        Implemented code style comments.

        Added check for buffer size.

        Created 'KeyValue.checkParameters()' method. Should 'createEmptyByteArray()' call it as well?

        'containsNonEmptyColumn()' checks if the value exists & is not empty; 'containsEmptyColumn()' checks if the value exists & is empty. If you would have only one, for the other case you would have to actually read the value and check it.

        Moved most 'loadValue()' functionality to 'KeyValue'. This raises a problem: how do we elegantly treat the case when the buffer (provided from 'Result') isn't big enough?

        Refactored 'Result.getSearchTerm()' as another 'KeyValue.createFirstOnRow()'.

        The new 'binarySearch()' method avoids allocating a byte array.

        We run incremental jobs that update values; we also have to read different values form the same row in different places.

        Show
        Tudor Scurtu added a comment - @Zhihong: I initially submitted a minimalistic version so I wouldn't have to make extensive modifications after reviews. Now the method is called from a unit test class. The variable names were copied from existing methods and I wanted them to have a uniform naming scheme. I renamed the variables, but not the method parameters. The same for the method comment. Implemented code style comments. @stack: Implemented code style comments. Added check for buffer size. Created 'KeyValue.checkParameters()' method. Should 'createEmptyByteArray()' call it as well? 'containsNonEmptyColumn()' checks if the value exists & is not empty; 'containsEmptyColumn()' checks if the value exists & is empty. If you would have only one, for the other case you would have to actually read the value and check it. Moved most 'loadValue()' functionality to 'KeyValue'. This raises a problem: how do we elegantly treat the case when the buffer (provided from 'Result') isn't big enough? Refactored 'Result.getSearchTerm()' as another 'KeyValue.createFirstOnRow()'. The new 'binarySearch()' method avoids allocating a byte array. We run incremental jobs that update values; we also have to read different values form the same row in different places.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520125/5625v3.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.client.TestFromClientSide
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520125/5625v3.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1322//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        @Tudor:
        Would it make sense to put your changes in testBasic() and testMultiVersion() into their own loop so that you can show us the performance improvement ?

        The call to flip() should be documented in non-test code.

        Show
        Ted Yu added a comment - @Tudor: Would it make sense to put your changes in testBasic() and testMultiVersion() into their own loop so that you can show us the performance improvement ? The call to flip() should be documented in non-test code.
        Hide
        Tudor Scurtu added a comment -

        @Zhihong:
        I added a new method, 'TestResult.testPerformance()'. This must be of course removed if the patch is accepted. It prints the results, so it doesn't require an integrated profiler. It takes a few minutes to run...

        Mentioned that the buffer isn't cleared or flipped in the 'KeyValue.loadValue()' method comment as well. Is that what you meant?

        Show
        Tudor Scurtu added a comment - @Zhihong: I added a new method, 'TestResult.testPerformance()'. This must be of course removed if the patch is accepted. It prints the results, so it doesn't require an integrated profiler. It takes a few minutes to run... Mentioned that the buffer isn't cleared or flipped in the 'KeyValue.loadValue()' method comment as well. Is that what you meant?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520387/5625v4.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520387/5625v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1338//console This message is automatically generated.
        Hide
        Cosmin Lehene added a comment -

        Microbenchmarking is a bit trickier than this. Perhaps we should mention some guideline it in the HBase developer documentation.

        There's a lot of stuff going on in the HotSpot (e.g. dynamic compilation) that needs to be taken into account.

        Here are some resources:
        http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
        https://wikis.oracle.com/display/HotSpotInternals/MicroBenchmarks

        We might be able to use something like junit-benchmarks.
        http://labs.carrotsearch.com/junit-benchmarks.html

        Cosmin

        Show
        Cosmin Lehene added a comment - Microbenchmarking is a bit trickier than this. Perhaps we should mention some guideline it in the HBase developer documentation. There's a lot of stuff going on in the HotSpot (e.g. dynamic compilation) that needs to be taken into account. Here are some resources: http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation https://wikis.oracle.com/display/HotSpotInternals/MicroBenchmarks We might be able to use something like junit-benchmarks. http://labs.carrotsearch.com/junit-benchmarks.html Cosmin
        Hide
        Ted Yu added a comment -

        Here is output from TestResult#testPerformance:

        loadValue(): 122281
        
        getValue():  141426
        

        Improvement was about 13.5%

        Show
        Ted Yu added a comment - Here is output from TestResult#testPerformance: loadValue(): 122281 getValue(): 141426 Improvement was about 13.5%
        Hide
        Ted Yu added a comment -

        I have put some comments here: https://reviews.apache.org/r/4559/

        @Tudor:
        Feel free to create your own review request for the new patch.

        Show
        Ted Yu added a comment - I have put some comments here: https://reviews.apache.org/r/4559/ @Tudor: Feel free to create your own review request for the new patch.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.
        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs


        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f
        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef
        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing
        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Tudor Scurtu added a comment -

        @Zhihong:
        Thanks for the review request. I actually had to make my own in order to upload the diff: https://reviews.apache.org/r/4607/

        The performance actually depends on the system capabilities. It's hard to write a microbenchmark test for an issue that manifests itself on large I/O intensive jobs that put a lot of gc pressure. I implemented a few of Cosmin's suggestions.

        Show
        Tudor Scurtu added a comment - @Zhihong: Thanks for the review request. I actually had to make my own in order to upload the diff: https://reviews.apache.org/r/4607/ The performance actually depends on the system capabilities. It's hard to write a microbenchmark test for an issue that manifests itself on large I/O intensive jobs that put a lot of gc pressure. I implemented a few of Cosmin's suggestions.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520978/5625v5.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520978/5625v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapreduce.TestTableMapReduce Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//console This message is automatically generated.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6622
        -----------------------------------------------------------

        I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5.
        They passed.

        Some minor comments below.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14303>

        Please include vlength in the exception message

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14304>

        Should read 'BufferOverflowException if there'

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14305>

        Add a space between comma and fl.

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14306>

        Is this comment needed ?

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14307>

        This line can be removed.

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        <https://reviews.apache.org/r/4607/#comment14308>

        white space.

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        <https://reviews.apache.org/r/4607/#comment14309>

        Since benchmarking is hard to do, this test case can be dropped.

        • Ted

        On 2012-04-02 14:22:48, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-02 14:22:48)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6622 ----------------------------------------------------------- I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5. They passed. Some minor comments below. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14303 > Please include vlength in the exception message src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14304 > Should read 'BufferOverflowException if there' src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14305 > Add a space between comma and fl. src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14306 > Is this comment needed ? src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14307 > This line can be removed. src/test/java/org/apache/hadoop/hbase/client/TestResult.java < https://reviews.apache.org/r/4607/#comment14308 > white space. src/test/java/org/apache/hadoop/hbase/client/TestResult.java < https://reviews.apache.org/r/4607/#comment14309 > Since benchmarking is hard to do, this test case can be dropped. Ted On 2012-04-02 14:22:48, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-02 14:22:48) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        Looks good. Some comments below.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14311>

        How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14312>

        I like this refactoring.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14313>

        How do I know the buffer is big enough?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14315>

        Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14318>

        Is this an override?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14317>

        Why pull out all these values here? Maybe we'll fail the first test (q1 is not used by the first test and these extractions cost.. they create byte arrays...)

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14319>

        How do I know the buffer is big enough? Should there be an override that takes an offset?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14320>

        Please follow the formatting that is present in the rest of the file. Notice placement of exceptions and params. Do not add your own style. Thanks.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment14321>

        This is a critical base class. I'm nervous when it gets refactored. You have tests for each of your changes? And all the old KV tests pass?

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14322>

        This should be lazily instantiated

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14323>

        Please follow convention that the rest of the file has.

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14324>

        ditto

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14325>

        This comment should be on the @return javadoc rather than here.

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14326>

        Rename hasContent or isNotEmpty or isNotEmptyColumn

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14327>

        Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !?

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment14328>

        Rename hasColumn or isColumn.

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        <https://reviews.apache.org/r/4607/#comment14310>

        Or leave it and change the name of the test so it doesn't have the test prefix. Add a main and have it call this. I think the method useful. Might as well keep it.

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        <https://reviews.apache.org/r/4607/#comment14329>

        I think there needs to be tests for the new KV functionality because KV is a fundamental type and we can't mess it up.

        • Michael

        On 2012-04-02 14:22:48, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-02 14:22:48)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- Looks good. Some comments below. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14311 > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer? src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14312 > I like this refactoring. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14313 > How do I know the buffer is big enough? src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14315 > Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer? src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14318 > Is this an override? src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14317 > Why pull out all these values here? Maybe we'll fail the first test (q1 is not used by the first test and these extractions cost.. they create byte arrays...) src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14319 > How do I know the buffer is big enough? Should there be an override that takes an offset? src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14320 > Please follow the formatting that is present in the rest of the file. Notice placement of exceptions and params. Do not add your own style. Thanks. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment14321 > This is a critical base class. I'm nervous when it gets refactored. You have tests for each of your changes? And all the old KV tests pass? src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14322 > This should be lazily instantiated src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14323 > Please follow convention that the rest of the file has. src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14324 > ditto src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14325 > This comment should be on the @return javadoc rather than here. src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14326 > Rename hasContent or isNotEmpty or isNotEmptyColumn src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14327 > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !? src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment14328 > Rename hasColumn or isColumn. src/test/java/org/apache/hadoop/hbase/client/TestResult.java < https://reviews.apache.org/r/4607/#comment14310 > Or leave it and change the name of the test so it doesn't have the test prefix. Add a main and have it call this. I think the method useful. Might as well keep it. src/test/java/org/apache/hadoop/hbase/client/TestResult.java < https://reviews.apache.org/r/4607/#comment14329 > I think there needs to be tests for the new KV functionality because KV is a fundamental type and we can't mess it up. Michael On 2012-04-02 14:22:48, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-02 14:22:48) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Tudor Scurtu added a comment -

        Implemented coding style comments.
        Added separate unit tests for all modified or added methods.

        Show
        Tudor Scurtu added a comment - Implemented coding style comments. Added separate unit tests for all modified or added methods.
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 16:50:48, Ted Yu wrote:

        > I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5.

        > They passed.

        >

        > Some minor comments below.

        Thanks! Fixed all except Result#259 and TestResult#117.

        On 2012-04-02 16:50:48, Ted Yu wrote:

        > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 117

        > <https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line117>

        >

        > Since benchmarking is hard to do, this test case can be dropped.

        Refactored as 'doReadBenchmark()'; called from 'main()'.

        On 2012-04-02 16:50:48, Ted Yu wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 259

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line259>

        >

        > Is this comment needed ?

        This was copied from original. Should I remove both occurences?

        • Tudor

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6622
        -----------------------------------------------------------

        On 2012-04-02 14:22:48, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-02 14:22:48)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 16:50:48, Ted Yu wrote: > I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5. > They passed. > > Some minor comments below. Thanks! Fixed all except Result#259 and TestResult#117. On 2012-04-02 16:50:48, Ted Yu wrote: > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 117 > < https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line117 > > > Since benchmarking is hard to do, this test case can be dropped. Refactored as 'doReadBenchmark()'; called from 'main()'. On 2012-04-02 16:50:48, Ted Yu wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 259 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line259 > > > Is this comment needed ? This was copied from original. Should I remove both occurences? Tudor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6622 ----------------------------------------------------------- On 2012-04-02 14:22:48, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-02 14:22:48) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > Looks good. Some comments below.

        Thanks! Please read the changes below.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531>

        >

        > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer?

        Added overload. Added exception comment for when there is insufficient space remaining in the buffer.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1443

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1443>

        >

        > Is this an override?

        Yes. Modified the original method to call the new one with default parameters.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1448

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1448>

        >

        > Why pull out all these values here? Maybe we'll fail the first test (q1 is not used by the first test and these extractions cost.. they create byte arrays...)

        Moved.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2001

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2001>

        >

        > How do I know the buffer is big enough? Should there be an override that takes an offset?

        Added buffer offset parameter. We know exactly how many bytes we have to write and how much free space we have in the buffer. An 'IllegalArgumentException' is thrown when there isn't enough free space. Is that what you were asking?

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2002

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2002>

        >

        > Please follow the formatting that is present in the rest of the file. Notice placement of exceptions and params. Do not add your own style. Thanks.

        Hope I fixed this.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 251

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line251>

        >

        > Please follow convention that the rest of the file has.

        Hope I fixed this.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 255

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line255>

        >

        > ditto

        Hope I fixed this.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 478

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line478>

        >

        > Rename hasColumn or isColumn.

        This overloads an original method. A refactoring here would be a major non-backwards compatible change.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 398

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line398>

        >

        > Rename hasContent or isNotEmpty or isNotEmptyColumn

        This method was named to mirror 'containsColumn()'. See below.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431>

        >

        > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !?

        They are not complementary.
        containsColumn = value exists
        containsEmptyColumn = value exists & is empty byte array
        containsNonEmptyColumn = value exists & is not empty byte array
        The value could be missing, in which case all methods would return false.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 117

        > <https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line117>

        >

        > Or leave it and change the name of the test so it doesn't have the test prefix. Add a main and have it call this. I think the method useful. Might as well keep it.

        Refactored as 'doReadBenchmark()'; called from 'main()';

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2056

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2056>

        >

        > This is a critical base class. I'm nervous when it gets refactored. You have tests for each of your changes? And all the old KV tests pass?

        Added checks in 'TestKeyValue.testFirstLastOnRow()' for the new 'KeyValue.createFirstOnRow()' methods.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616>

        >

        > How do I know the buffer is big enough?

        Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant?

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297>

        >

        > This comment should be on the @return javadoc rather than here.

        This was copied from original. Should I remove both occurences?

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 178

        > <https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line178>

        >

        > I think there needs to be tests for the new KV functionality because KV is a fundamental type and we can't mess it up.

        Moved 'loadValue()' checks to their own test methods. Also took the liberty to move 'getColumn()' checks in the same manner in order to keep consistency.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1138

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1138>

        >

        > Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer?

        You need to be able to pass your own buffer when you have to compose multiple values.
        I added a new method that uses 'ByteBuffer.wrap()'. I feel that this method should contain the word 'value' in its name so as not to create the impression that it is writing the entire underlying 'KeyValue' structure to the buffer, so I called it 'getValueAsByteBuffer()'. Please comment if it is inadequate.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 87

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line87>

        >

        > This should be lazily instantiated

        Fixed.

        • Tudor

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        On 2012-04-02 14:22:48, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-02 14:22:48)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 17:34:38, Michael Stack wrote: > Looks good. Some comments below. Thanks! Please read the changes below. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531 > > > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer? Added overload. Added exception comment for when there is insufficient space remaining in the buffer. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1443 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1443 > > > Is this an override? Yes. Modified the original method to call the new one with default parameters. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1448 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1448 > > > Why pull out all these values here? Maybe we'll fail the first test (q1 is not used by the first test and these extractions cost.. they create byte arrays...) Moved. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2001 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2001 > > > How do I know the buffer is big enough? Should there be an override that takes an offset? Added buffer offset parameter. We know exactly how many bytes we have to write and how much free space we have in the buffer. An 'IllegalArgumentException' is thrown when there isn't enough free space. Is that what you were asking? On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2002 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2002 > > > Please follow the formatting that is present in the rest of the file. Notice placement of exceptions and params. Do not add your own style. Thanks. Hope I fixed this. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 251 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line251 > > > Please follow convention that the rest of the file has. Hope I fixed this. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 255 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line255 > > > ditto Hope I fixed this. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 478 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line478 > > > Rename hasColumn or isColumn. This overloads an original method. A refactoring here would be a major non-backwards compatible change. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 398 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line398 > > > Rename hasContent or isNotEmpty or isNotEmptyColumn This method was named to mirror 'containsColumn()'. See below. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431 > > > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !? They are not complementary. containsColumn = value exists containsEmptyColumn = value exists & is empty byte array containsNonEmptyColumn = value exists & is not empty byte array The value could be missing, in which case all methods would return false. On 2012-04-02 17:34:38, Michael Stack wrote: > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 117 > < https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line117 > > > Or leave it and change the name of the test so it doesn't have the test prefix. Add a main and have it call this. I think the method useful. Might as well keep it. Refactored as 'doReadBenchmark()'; called from 'main()'; On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 2056 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line2056 > > > This is a critical base class. I'm nervous when it gets refactored. You have tests for each of your changes? And all the old KV tests pass? Added checks in 'TestKeyValue.testFirstLastOnRow()' for the new 'KeyValue.createFirstOnRow()' methods. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616 > > > How do I know the buffer is big enough? Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant? On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297 > > > This comment should be on the @return javadoc rather than here. This was copied from original. Should I remove both occurences? On 2012-04-02 17:34:38, Michael Stack wrote: > src/test/java/org/apache/hadoop/hbase/client/TestResult.java, line 178 > < https://reviews.apache.org/r/4607/diff/1/?file=97956#file97956line178 > > > I think there needs to be tests for the new KV functionality because KV is a fundamental type and we can't mess it up. Moved 'loadValue()' checks to their own test methods. Also took the liberty to move 'getColumn()' checks in the same manner in order to keep consistency. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1138 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1138 > > > Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer? You need to be able to pass your own buffer when you have to compose multiple values. I added a new method that uses 'ByteBuffer.wrap()'. I feel that this method should contain the word 'value' in its name so as not to create the impression that it is writing the entire underlying 'KeyValue' structure to the buffer, so I called it 'getValueAsByteBuffer()'. Please comment if it is inadequate. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 87 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line87 > > > This should be lazily instantiated Fixed. Tudor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- On 2012-04-02 14:22:48, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-02 14:22:48) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/
        -----------------------------------------------------------

        (Updated 2012-04-04 17:08:03.581526)

        Review request for hbase.

        Changes
        -------

        Implemented coding style comments.
        Added separate unit tests for all modified or added methods.

        Summary
        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.
        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f
        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902
        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing
        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-04 17:08:03.581526) Review request for hbase. Changes ------- Implemented coding style comments. Added separate unit tests for all modified or added methods. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs (updated) src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902 src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521342/5625v6.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521342/5625v6.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1386//console This message is automatically generated.
        Hide
        Tudor Scurtu added a comment -

        @Zhihing, Stack:
        What do you think about the new changes?

        Tudor

        Show
        Tudor Scurtu added a comment - @Zhihing, Stack: What do you think about the new changes? Tudor
        Hide
        Ted Yu added a comment -

        @Stack:
        Can you give Tudor some advice ?

        Show
        Ted Yu added a comment - @Stack: Can you give Tudor some advice ?
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531>

        >

        > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer?

        Tudor Scurtu wrote:

        Added overload. Added exception comment for when there is insufficient space remaining in the buffer.

        So, the way this works, we just allocate N and hope that stuff fits inside N? If it doesn't we throw an exception? There is no correlation between data that comes across and the N allocation?

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616>

        >

        > How do I know the buffer is big enough?

        Tudor Scurtu wrote:

        Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant?

        I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1138

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1138>

        >

        > Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer?

        Tudor Scurtu wrote:

        You need to be able to pass your own buffer when you have to compose multiple values.

        I added a new method that uses 'ByteBuffer.wrap()'. I feel that this method should contain the word 'value' in its name so as not to create the impression that it is writing the entire underlying 'KeyValue' structure to the buffer, so I called it 'getValueAsByteBuffer()'. Please comment if it is inadequate.

        Sounds good Tudor.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297>

        >

        > This comment should be on the @return javadoc rather than here.

        Tudor Scurtu wrote:

        This was copied from original. Should I remove both occurences?

        Sorry. I did not notice it was problem on original. If you can fix it, that'd be sweet.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431>

        >

        > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !?

        Tudor Scurtu wrote:

        They are not complementary.

        containsColumn = value exists

        containsEmptyColumn = value exists & is empty byte array

        containsNonEmptyColumn = value exists & is not empty byte array

        The value could be missing, in which case all methods would return false.

        OK. If not clear from comments, please add your notes above. Will help those that come after.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 478

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line478>

        >

        > Rename hasColumn or isColumn.

        Tudor Scurtu wrote:

        This overloads an original method. A refactoring here would be a major non-backwards compatible change.

        Ok. Thanks for pointing this out.

        • Michael

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        On 2012-04-04 17:08:03, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-04 17:08:03)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531 > > > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer? Tudor Scurtu wrote: Added overload. Added exception comment for when there is insufficient space remaining in the buffer. So, the way this works, we just allocate N and hope that stuff fits inside N? If it doesn't we throw an exception? There is no correlation between data that comes across and the N allocation? On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616 > > > How do I know the buffer is big enough? Tudor Scurtu wrote: Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant? I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1138 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line1138 > > > Why not create a ByteBuffer? or called ByteBuffer wrap? Why not call it toByteBuffer? Tudor Scurtu wrote: You need to be able to pass your own buffer when you have to compose multiple values. I added a new method that uses 'ByteBuffer.wrap()'. I feel that this method should contain the word 'value' in its name so as not to create the impression that it is writing the entire underlying 'KeyValue' structure to the buffer, so I called it 'getValueAsByteBuffer()'. Please comment if it is inadequate. Sounds good Tudor. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297 > > > This comment should be on the @return javadoc rather than here. Tudor Scurtu wrote: This was copied from original. Should I remove both occurences? Sorry. I did not notice it was problem on original. If you can fix it, that'd be sweet. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431 > > > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !? Tudor Scurtu wrote: They are not complementary. containsColumn = value exists containsEmptyColumn = value exists & is empty byte array containsNonEmptyColumn = value exists & is not empty byte array The value could be missing, in which case all methods would return false. OK. If not clear from comments, please add your notes above. Will help those that come after. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 478 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line478 > > > Rename hasColumn or isColumn. Tudor Scurtu wrote: This overloads an original method. A refactoring here would be a major non-backwards compatible change. Ok. Thanks for pointing this out. Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- On 2012-04-04 17:08:03, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-04 17:08:03) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902 src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Tudor Scurtu added a comment -

        Added check with reallocation in 'Result.binarySearch()'.

        Show
        Tudor Scurtu added a comment - Added check with reallocation in 'Result.binarySearch()'.
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531>

        >

        > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer?

        Tudor Scurtu wrote:

        Added overload. Added exception comment for when there is insufficient space remaining in the buffer.

        Michael Stack wrote:

        So, the way this works, we just allocate N and hope that stuff fits inside N? If it doesn't we throw an exception? There is no correlation between data that comes across and the N allocation?

        Please see below.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616>

        >

        > How do I know the buffer is big enough?

        Tudor Scurtu wrote:

        Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant?

        Michael Stack wrote:

        I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match.

        Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods?

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297>

        >

        > This comment should be on the @return javadoc rather than here.

        Tudor Scurtu wrote:

        This was copied from original. Should I remove both occurences?

        Michael Stack wrote:

        Sorry. I did not notice it was problem on original. If you can fix it, that'd be sweet.

        Fixed.

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431

        > <https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431>

        >

        > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !?

        Tudor Scurtu wrote:

        They are not complementary.

        containsColumn = value exists

        containsEmptyColumn = value exists & is empty byte array

        containsNonEmptyColumn = value exists & is not empty byte array

        The value could be missing, in which case all methods would return false.

        Michael Stack wrote:

        OK. If not clear from comments, please add your notes above. Will help those that come after.

        Added.

        • Tudor

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        On 2012-04-04 17:08:03, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-04 17:08:03)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line531 > > > How do we know this buffer is big enough? Maybe should add an override that takes an offset into the buffer? Tudor Scurtu wrote: Added overload. Added exception comment for when there is insufficient space remaining in the buffer. Michael Stack wrote: So, the way this works, we just allocate N and hope that stuff fits inside N? If it doesn't we throw an exception? There is no correlation between data that comes across and the N allocation? Please see below. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616 > > > How do I know the buffer is big enough? Tudor Scurtu wrote: Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant? Michael Stack wrote: I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match. Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods? On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line297 > > > This comment should be on the @return javadoc rather than here. Tudor Scurtu wrote: This was copied from original. Should I remove both occurences? Michael Stack wrote: Sorry. I did not notice it was problem on original. If you can fix it, that'd be sweet. Fixed. On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431 > < https://reviews.apache.org/r/4607/diff/1/?file=97955#file97955line431 > > > Why we have both isNonEmptyColumn and isEmptyColumn? Why not just one and then check return with a !? Tudor Scurtu wrote: They are not complementary. containsColumn = value exists containsEmptyColumn = value exists & is empty byte array containsNonEmptyColumn = value exists & is not empty byte array The value could be missing, in which case all methods would return false. Michael Stack wrote: OK. If not clear from comments, please add your notes above. Will help those that come after. Added. Tudor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- On 2012-04-04 17:08:03, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-04 17:08:03) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902 src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/
        -----------------------------------------------------------

        (Updated 2012-04-25 16:01:29.035293)

        Review request for hbase.

        Changes
        -------

        Added check with reallocation in 'Result.binarySearch()'.

        Summary
        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.
        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02
        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df
        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing
        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-25 16:01:29.035293) Review request for hbase. Changes ------- Added check with reallocation in 'Result.binarySearch()'. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs (updated) src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12524295/5625v7.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.util.TestHBaseFsck

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12524295/5625v7.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1646//console This message is automatically generated.
        Hide
        Jimmy Xiang added a comment -

        @Ted, the 13% performance gain is from 100000000 iterations.

        The big uncertainty is the unknown size to pre-allocate. Is it possible not to copy of value?
        For example, return an immutable wrap to the original value?

        Show
        Jimmy Xiang added a comment - @Ted, the 13% performance gain is from 100000000 iterations. The big uncertainty is the unknown size to pre-allocate. Is it possible not to copy of value? For example, return an immutable wrap to the original value?
        Hide
        Tudor Scurtu added a comment -

        @Jimmy

        I tried this approach, but the performance is comparable to that of the original API. Nevertheless, I could include this functionality if requested.

        Show
        Tudor Scurtu added a comment - @Jimmy I tried this approach, but the performance is comparable to that of the original API. Nevertheless, I could include this functionality if requested.
        Hide
        Jimmy Xiang added a comment -

        @Tudor, thanks for trying it out. There is no need if the performance is comparable.

        Show
        Jimmy Xiang added a comment - @Tudor, thanks for trying it out. There is no need if the performance is comparable.
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616>

        >

        > How do I know the buffer is big enough?

        Tudor Scurtu wrote:

        Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant?

        Michael Stack wrote:

        I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match.

        Tudor Scurtu wrote:

        Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods?

        I think I am beginning to understand what you are at (pardon me, I am a little slow). You want to speed up finding KVs in big Results and part of the way in which you do this is reuse of a buffer you keep private in Result. The buffer will not match a specific KV.... usually it'll be too big and if it is too small, you'll allocate a buffer big enough, a new one.

        What locations would you put getKeyValueDataStructureSize into place? For example?

        • Michael

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        On 2012-04-25 16:01:29, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-25 16:01:29)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616 > > > How do I know the buffer is big enough? Tudor Scurtu wrote: Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant? Michael Stack wrote: I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match. Tudor Scurtu wrote: Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods? I think I am beginning to understand what you are at (pardon me, I am a little slow). You want to speed up finding KVs in big Results and part of the way in which you do this is reuse of a buffer you keep private in Result. The buffer will not match a specific KV.... usually it'll be too big and if it is too small, you'll allocate a buffer big enough, a new one. What locations would you put getKeyValueDataStructureSize into place? For example? Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- On 2012-04-25 16:01:29, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-25 16:01:29) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review7365
        -----------------------------------------------------------

        Ship it!

        +1 on patch. I have a few comments below. Small potatoes.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment16250>

        So you have plans to use these elsewhere in the codebase? If so, its ok that they are public.

        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/4607/#comment16251>

        This could be 'as its backing data buffer'? If so, I can address that on commit

        src/main/java/org/apache/hadoop/hbase/client/Result.java
        <https://reviews.apache.org/r/4607/#comment16252>

        Here you are trying to make a smart guess on a buffer size that will serve for multiple invocations on binarySearch? Your hope is that you'll not have to reallocate the buffer the next time you come through here because the buffer should have enough space in it to hold the next random KV that comes through here?

        • Michael

        On 2012-04-25 16:01:29, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-25 16:01:29)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review7365 ----------------------------------------------------------- Ship it! +1 on patch. I have a few comments below. Small potatoes. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment16250 > So you have plans to use these elsewhere in the codebase? If so, its ok that they are public. src/main/java/org/apache/hadoop/hbase/KeyValue.java < https://reviews.apache.org/r/4607/#comment16251 > This could be 'as its backing data buffer'? If so, I can address that on commit src/main/java/org/apache/hadoop/hbase/client/Result.java < https://reviews.apache.org/r/4607/#comment16252 > Here you are trying to make a smart guess on a buffer size that will serve for multiple invocations on binarySearch? Your hope is that you'll not have to reallocate the buffer the next time you come through here because the buffer should have enough space in it to hold the next random KV that comes through here? Michael On 2012-04-25 16:01:29, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-25 16:01:29) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Lars Hofhansl added a comment -

        This is a pretty big change to what appears only a small overall performance gain and only on the client. I am a bit skeptical. +-0.

        Show
        Lars Hofhansl added a comment - This is a pretty big change to what appears only a small overall performance gain and only on the client. I am a bit skeptical. +-0.
        Hide
        Ted Yu added a comment -

        w.r.t. https://reviews.apache.org/r/4607/#comment16252 :
        For size of 129 (2^n + 1), the actual space allocated this way would be 256. I think it is bigger than what is needed.

        Tudor made the suggestion of padding to the closest multiple of 128 bytes (configurable).

        Show
        Ted Yu added a comment - w.r.t. https://reviews.apache.org/r/4607/#comment16252 : For size of 129 (2^n + 1), the actual space allocated this way would be 256. I think it is bigger than what is needed. Tudor made the suggestion of padding to the closest multiple of 128 bytes (configurable).
        Hide
        Tudor Scurtu added a comment -

        @Lars: The patch leaves the original implementation relatively intact, as for the performance gain YMMV.

        Show
        Tudor Scurtu added a comment - @Lars: The patch leaves the original implementation relatively intact, as for the performance gain YMMV.
        Hide
        Tudor Scurtu added a comment -

        Modified 'Result' private buffer reallocation to pad to a size equal to the smallest multiple of a configurable constant.

        Show
        Tudor Scurtu added a comment - Modified 'Result' private buffer reallocation to pad to a size equal to the smallest multiple of a configurable constant.
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-28 23:39:41, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 181

        > <https://reviews.apache.org/r/4607/diff/3/?file=104237#file104237line181>

        >

        > So you have plans to use these elsewhere in the codebase? If so, its ok that they are public.

        Replaced manual calculations of infrastructure sizes with calls to the new methods in 'KeyValue'.

        On 2012-04-28 23:39:41, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 542

        > <https://reviews.apache.org/r/4607/diff/3/?file=104237#file104237line542>

        >

        > This could be 'as its backing data buffer'? If so, I can address that on commit

        Fixed.

        On 2012-04-28 23:39:41, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 256

        > <https://reviews.apache.org/r/4607/diff/3/?file=104238#file104238line256>

        >

        > Here you are trying to make a smart guess on a buffer size that will serve for multiple invocations on binarySearch? Your hope is that you'll not have to reallocate the buffer the next time you come through here because the buffer should have enough space in it to hold the next random KV that comes through here?

        Exactly. The power of 2 implementation could have grown too rapidly, so I modified it to pad to a size equal to the smallest multiple of a configurable constant.

        • Tudor

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review7365
        -----------------------------------------------------------

        On 2012-04-25 16:01:29, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-25 16:01:29)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-28 23:39:41, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 181 > < https://reviews.apache.org/r/4607/diff/3/?file=104237#file104237line181 > > > So you have plans to use these elsewhere in the codebase? If so, its ok that they are public. Replaced manual calculations of infrastructure sizes with calls to the new methods in 'KeyValue'. On 2012-04-28 23:39:41, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 542 > < https://reviews.apache.org/r/4607/diff/3/?file=104237#file104237line542 > > > This could be 'as its backing data buffer'? If so, I can address that on commit Fixed. On 2012-04-28 23:39:41, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 256 > < https://reviews.apache.org/r/4607/diff/3/?file=104238#file104238line256 > > > Here you are trying to make a smart guess on a buffer size that will serve for multiple invocations on binarySearch? Your hope is that you'll not have to reallocate the buffer the next time you come through here because the buffer should have enough space in it to hold the next random KV that comes through here? Exactly. The power of 2 implementation could have grown too rapidly, so I modified it to pad to a size equal to the smallest multiple of a configurable constant. Tudor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review7365 ----------------------------------------------------------- On 2012-04-25 16:01:29, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-25 16:01:29) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-04-02 17:34:38, Michael Stack wrote:

        > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616

        > <https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616>

        >

        > How do I know the buffer is big enough?

        Tudor Scurtu wrote:

        Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant?

        Michael Stack wrote:

        I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match.

        Tudor Scurtu wrote:

        Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods?

        Michael Stack wrote:

        I think I am beginning to understand what you are at (pardon me, I am a little slow). You want to speed up finding KVs in big Results and part of the way in which you do this is reuse of a buffer you keep private in Result. The buffer will not match a specific KV.... usually it'll be too big and if it is too small, you'll allocate a buffer big enough, a new one.

        What locations would you put getKeyValueDataStructureSize into place? For example?

        Replaced manual calculations of infrastructure sizes with calls to the new methods in 'KeyValue'.

        • Tudor

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/#review6623
        -----------------------------------------------------------

        On 2012-04-25 16:01:29, Tudor Scurtu wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4607/

        -----------------------------------------------------------

        (Updated 2012-04-25 16:01:29)

        Review request for hbase.

        Summary

        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.

        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs

        -----

        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02

        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef

        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df

        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing

        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-04-02 17:34:38, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616 > < https://reviews.apache.org/r/4607/diff/1/?file=97954#file97954line616 > > > How do I know the buffer is big enough? Tudor Scurtu wrote: Added exception comment for when there is insufficient space remaining in the buffer. Is that what you meant? Michael Stack wrote: I am not understanding how the allocation works. It seems arbitrary unrelated to the actual result size that comes over from the server. Is that so? If so, it seems unfriendly throwing an exception when allocated size and what is returned from the server do not match. Tudor Scurtu wrote: Added check with reallocation in 'Result.binarySearch()'. For this I had to add two methods in 'KeyValue' that calculate the number of bytes that are taken up in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing all manual calculations of these values in the project with calls to the new methods? Michael Stack wrote: I think I am beginning to understand what you are at (pardon me, I am a little slow). You want to speed up finding KVs in big Results and part of the way in which you do this is reuse of a buffer you keep private in Result. The buffer will not match a specific KV.... usually it'll be too big and if it is too small, you'll allocate a buffer big enough, a new one. What locations would you put getKeyValueDataStructureSize into place? For example? Replaced manual calculations of infrastructure sizes with calls to the new methods in 'KeyValue'. Tudor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6623 ----------------------------------------------------------- On 2012-04-25 16:01:29, Tudor Scurtu wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-04-25 16:01:29) Review request for hbase. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs ----- src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4607/
        -----------------------------------------------------------

        (Updated 2012-05-02 08:36:45.758830)

        Review request for hbase.

        Changes
        -------

        Modified 'Result' private buffer reallocation to pad to a size equal to the smallest multiple of a configurable constant.

        Summary
        -------

        When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value.

        These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method.

        This addresses bug HBASE-5625.
        https://issues.apache.org/jira/browse/HBASE-5625

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02
        src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df
        src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2

        Diff: https://reviews.apache.org/r/4607/diff

        Testing
        -------

        Added value check to TestResult#testBasic and TestResult.testMultiVersion.

        Thanks,

        Tudor

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ ----------------------------------------------------------- (Updated 2012-05-02 08:36:45.758830) Review request for hbase. Changes ------- Modified 'Result' private buffer reallocation to pad to a size equal to the smallest multiple of a configurable constant. Summary ------- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625 . https://issues.apache.org/jira/browse/HBASE-5625 Diffs (updated) src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing ------- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12525265/5625v8.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525265/5625v8.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1718//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        +1 on patch v8.

        Show
        Ted Yu added a comment - +1 on patch v8.
        Hide
        Ted Yu added a comment -

        Will integrate patch v8 in 3 hours if there is no objection.

        Show
        Ted Yu added a comment - Will integrate patch v8 in 3 hours if there is no objection.
        Hide
        Ted Yu added a comment -

        Patch v8 integrated to trunk.

        Thanks for the patch, Tudor.

        Thanks for the review, Stack.

        Show
        Ted Yu added a comment - Patch v8 integrated to trunk. Thanks for the patch, Tudor. Thanks for the review, Stack.
        Hide
        Tudor Scurtu added a comment -

        Thank you too.

        Tudor

        Show
        Tudor Scurtu added a comment - Thank you too. Tudor
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #190 (See https://builds.apache.org/job/HBase-TRUNK-security/190/)
        HBASE-5625 Avoid byte buffer allocations when reading a value from a Result object (Tudor Scurtu) (Revision 1333159)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK-security #190 (See https://builds.apache.org/job/HBase-TRUNK-security/190/ ) HBASE-5625 Avoid byte buffer allocations when reading a value from a Result object (Tudor Scurtu) (Revision 1333159) Result = SUCCESS tedyu : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java
        Hide
        stack added a comment -

        Marking closed.

        Show
        stack added a comment - Marking closed.

          People

          • Assignee:
            Tudor Scurtu
            Reporter:
            Tudor Scurtu
          • Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development