HBase
  1. HBase
  2. HBASE-3433

Remove the KV copy of every KV in Scan; introduced by HBASE-3232

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0, 0.94.0
    • Component/s: Performance, regionserver
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed

      Description

      Here is offending code from inside in StoreScanner#next:

            // kv is no longer immutable due to KeyOnlyFilter! use copy for safety
            KeyValue copyKv = new KeyValue(kv.getBuffer(), kv.getOffset(), kv.getLength());
      

      This looks wrong given philosophy up to this has been avoidance of garbage-making copies.

      Maybe this has been looked into before and this is the only thing to be done but why is KeyOnlyFilter not making copies rather than mutating originals?

      Making this critical against 0.92.

      1. 3433.txt
        7 kB
        Lars Hofhansl
      2. 3433-v2.txt
        8 kB
        Lars Hofhansl
      3. 3433-v3.txt
        9 kB
        Lars Hofhansl
      4. HBASE-3433-sidenote.patch
        2 kB
        Nicolas Spiegelberg

        Activity

        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #2430 (See https://builds.apache.org/job/HBase-TRUNK/2430/)
        HBASE-3433 Remove the KV copy of every KV in Scan; introduced by HBASE-3232

        larsh :
        Files :

        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2430 (See https://builds.apache.org/job/HBase-TRUNK/2430/ ) HBASE-3433 Remove the KV copy of every KV in Scan; introduced by HBASE-3232 larsh : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/Filter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        Hide
        Hudson added a comment -

        Integrated in HBase-0.92 #125 (See https://builds.apache.org/job/HBase-0.92/125/)
        HBASE-3433 Remove the KV copy of every KV in Scan; introduced by HBASE-3232

        larsh :
        Files :

        • /hbase/branches/0.92/CHANGES.txt
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        • /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        Show
        Hudson added a comment - Integrated in HBase-0.92 #125 (See https://builds.apache.org/job/HBase-0.92/125/ ) HBASE-3433 Remove the KV copy of every KV in Scan; introduced by HBASE-3232 larsh : Files : /hbase/branches/0.92/CHANGES.txt /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/KeyValue.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/Filter.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12503430/3433-v3.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -164 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 50 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/234//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/234//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/234//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12503430/3433-v3.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 50 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/234//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/234//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/234//console This message is automatically generated.
        Hide
        Lars Hofhansl added a comment -

        Committed to 0.92 and trunk.
        Will still write a little test tool, maybe I can quantify changed in GC (although with the amount of garbage that is produced, I doubt that even that will be measurable).

        Show
        Lars Hofhansl added a comment - Committed to 0.92 and trunk. Will still write a little test tool, maybe I can quantify changed in GC (although with the amount of garbage that is produced, I doubt that even that will be measurable).
        Hide
        Lars Hofhansl added a comment -

        I don't think the difference will be measurable, was just a shallow copy. The point of this patch was to formalize how a filter would modify the KVs, rather than doing it as a side-effect of the filtering methods.

        I'll write a little test client, I'm willing to bet that there will no measurable improvement.

        Show
        Lars Hofhansl added a comment - I don't think the difference will be measurable, was just a shallow copy. The point of this patch was to formalize how a filter would modify the KVs, rather than doing it as a side-effect of the filtering methods. I'll write a little test client, I'm willing to bet that there will no measurable improvement.
        Hide
        stack added a comment -

        +1 and +1 for 0.92. I'd be really surprised if you could make a test that showed a perf diff; i'd doubt it show on any yardstick. It does cut down objects created which will help in hard-to-directly-measure ways.

        Show
        stack added a comment - +1 and +1 for 0.92. I'd be really surprised if you could make a test that showed a perf diff; i'd doubt it show on any yardstick. It does cut down objects created which will help in hard-to-directly-measure ways.
        Hide
        Ted Yu added a comment -

        I think this can go to 0.92

        Some performance comparison (with / without this patch) would be helpful.

        Show
        Ted Yu added a comment - I think this can go to 0.92 Some performance comparison (with / without this patch) would be helpful.
        Hide
        Lars Hofhansl added a comment -

        Any thought on 0.92? I'd say let's just add it there too.

        Show
        Lars Hofhansl added a comment - Any thought on 0.92? I'd say let's just add it there too.
        Hide
        Ted Yu added a comment -

        +1 on patch v3.

        Show
        Ted Yu added a comment - +1 on patch v3.
        Hide
        Lars Hofhansl added a comment -

        Version with Stack's and Ted's suggestions.

        Show
        Lars Hofhansl added a comment - Version with Stack's and Ted's suggestions.
        Hide
        Lars Hofhansl added a comment -

        Then we should do this for all the other methods in Filter.java as well. I can do another version for that.

        Show
        Lars Hofhansl added a comment - Then we should do this for all the other methods in Filter.java as well. I can do another version for that.
        Hide
        Ted Yu added a comment -

        That's right.
        It serves as a programmatic hint.

        Show
        Ted Yu added a comment - That's right. It serves as a programmatic hint.
        Hide
        Lars Hofhansl added a comment -

        That does not prevent the passed kv from being changed, though. Right?

        Show
        Lars Hofhansl added a comment - That does not prevent the passed kv from being changed, though. Right?
        Hide
        Ted Yu added a comment -

        Should we add final to the parameter v ?

        public KeyValue transform(final KeyValue v);
        
        Show
        Ted Yu added a comment - Should we add final to the parameter v ? public KeyValue transform( final KeyValue v);
        Hide
        Lars Hofhansl added a comment -

        How's this?

          /**
           * Give the filter a chance to transform the passed KeyValue.
           * If the KeyValue is changed a new KeyValue object must be returned.
           * @see org.apache.hadoop.hbase.KeyValue#shallowCopy()
           *
           * The transformed KeyValue is what is eventually returned to the
           * client. Most filters will return the passed KeyValue unchanged.
           * @see org.apache.hadoop.hbase.filter.KeyOnlyFilter#transform(KeyValue)
           * for an example of a transformation.
           *
           * @param v the KeyValue in question
           * @return the changed KeyValue
           */
          public KeyValue transform(KeyValue v);
        
        Show
        Lars Hofhansl added a comment - How's this? /** * Give the filter a chance to transform the passed KeyValue. * If the KeyValue is changed a new KeyValue object must be returned. * @see org.apache.hadoop.hbase.KeyValue#shallowCopy() * * The transformed KeyValue is what is eventually returned to the * client. Most filters will return the passed KeyValue unchanged. * @see org.apache.hadoop.hbase.filter.KeyOnlyFilter#transform(KeyValue) * for an example of a transformation. * * @param v the KeyValue in question * @ return the changed KeyValue */ public KeyValue transform(KeyValue v);
        Hide
        Lars Hofhansl added a comment -

        Good point about "public", does not have to be. That is actually the part of the patch I like the least. SQM now "leaks" the filter, and StoreScanner has know about filters. I was also thinking about adding a transform() method to SQM, that would basically do the same as the added code in StoreScanner; StoreScanner would then call SQM.transform(kv). I think I'll just make this method package private.

        Will update the Javadoc too and post a new patch soon.

        Could add this to 0.92 as well. As I said, Filter and FilterBase should be considered public API, so it is an API change. That said, I'm fine with this in 0.92. It's a slight change as most folks would use FilterBase anyway.

        Show
        Lars Hofhansl added a comment - Good point about "public", does not have to be. That is actually the part of the patch I like the least. SQM now "leaks" the filter, and StoreScanner has know about filters. I was also thinking about adding a transform() method to SQM, that would basically do the same as the added code in StoreScanner; StoreScanner would then call SQM.transform(kv). I think I'll just make this method package private. Will update the Javadoc too and post a new patch soon. Could add this to 0.92 as well. As I said, Filter and FilterBase should be considered public API, so it is an API change. That said, I'm fine with this in 0.92. It's a slight change as most folks would use FilterBase anyway.
        Hide
        stack added a comment -

        +1 on commit. Some minors below that can be addressed on commit. We can't add this to 0.92?

        Does the below have to be public rather than package private?

        +  public Filter getFilter() {
        

        I like your transform addition to the Filter Interface.

        I think you need to say a little more in Filter#transform javadoc; you should say that the transform is what is actually returned to the client and perhaps point at keyonlyfilter as example usage. Filters are hard. Poor fellas trying to make sense of them need all the pointers possible.

        Show
        stack added a comment - +1 on commit. Some minors below that can be addressed on commit. We can't add this to 0.92? Does the below have to be public rather than package private? + public Filter getFilter() { I like your transform addition to the Filter Interface. I think you need to say a little more in Filter#transform javadoc; you should say that the transform is what is actually returned to the client and perhaps point at keyonlyfilter as example usage. Filters are hard. Poor fellas trying to make sense of them need all the pointers possible.
        Hide
        Ted Yu added a comment -

        +1 on patch v2.

        Show
        Ted Yu added a comment - +1 on patch v2.
        Hide
        Lars Hofhansl added a comment -

        All tests pass. I don't think this needs any new tests.

        It's an incompatible change. If I get a few +1's I will check this in.

        Show
        Lars Hofhansl added a comment - All tests pass. I don't think this needs any new tests. It's an incompatible change. If I get a few +1's I will check this in.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12503321/3433-v2.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -164 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 50 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/231//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/231//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/231//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12503321/3433-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 50 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/231//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/231//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/231//console This message is automatically generated.
        Hide
        Lars Hofhansl added a comment -

        Let's get a test run

        Show
        Lars Hofhansl added a comment - Let's get a test run
        Hide
        Ted Yu added a comment -

        Nice patch, Lars.

        Show
        Ted Yu added a comment - Nice patch, Lars.
        Hide
        Lars Hofhansl added a comment -

        Slightly better patch. (does away with convertToKeyOnly in place KeyValue modification).

        Show
        Lars Hofhansl added a comment - Slightly better patch. (does away with convertToKeyOnly in place KeyValue modification).
        Hide
        Lars Hofhansl added a comment -

        Something like this.
        (TestFilter and TestFilterList still pass)

        Note that this is an API change, and that it subtly changes how FilterList interacts with KeyOnlyFilter (for the better I think, but, still, it is a change)

        Show
        Lars Hofhansl added a comment - Something like this. (TestFilter and TestFilterList still pass) Note that this is an API change, and that it subtly changes how FilterList interacts with KeyOnlyFilter (for the better I think, but, still, it is a change)
        Hide
        Lars Hofhansl added a comment -

        and

        {FilterList|SkipFilter|WhileMatchFilter}.transform() would need to call their the inner Filter's transform.

        So:
        1. add KeyValue transform(final KeyValue) to Filter (or better name?)
        2. add call to it in StoreScanner.next() if sqm.match returned INCLUDE*
        3. add a default implementation to FilterBase that just returns the passed KV
        4. call transform on inner filters in {FilterList|SkipFilter|WhileMatchFilter}

        .transform()
        5. implement transform in KeyOnlyFilter (i.e. KeyOnlyFilter would do the shallow copy)

        Show
        Lars Hofhansl added a comment - and {FilterList|SkipFilter|WhileMatchFilter}.transform() would need to call their the inner Filter's transform. So: 1. add KeyValue transform(final KeyValue) to Filter (or better name?) 2. add call to it in StoreScanner.next() if sqm.match returned INCLUDE* 3. add a default implementation to FilterBase that just returns the passed KV 4. call transform on inner filters in {FilterList|SkipFilter|WhileMatchFilter} .transform() 5. implement transform in KeyOnlyFilter (i.e. KeyOnlyFilter would do the shallow copy)
        Hide
        Lars Hofhansl added a comment -

        Hmm... Would need to call transform in StoreScanner after we called match and only if the result was INCLUDE*.

        Show
        Lars Hofhansl added a comment - Hmm... Would need to call transform in StoreScanner after we called match and only if the result was INCLUDE*.
        Hide
        Ted Yu added a comment -

        +1 on Lars' proposal.

        Show
        Ted Yu added a comment - +1 on Lars' proposal.
        Hide
        Lars Hofhansl added a comment -

        It makes sense for some filters to transform the KVs passed to them.

        Maybe we should just formalize that:
        1. add KeyValue transform(final KeyValue) to Filter (or better name?)
        2. add call to it in ScanQueryMatcher.match()
        3. add a default implementation to FilterBase that just returns the passed KV
        4. implement that transform in KeyOnlyFilter (i.e. KeyOnlyFilter would do the shallow copy)

        Show
        Lars Hofhansl added a comment - It makes sense for some filters to transform the KVs passed to them. Maybe we should just formalize that: 1. add KeyValue transform(final KeyValue) to Filter (or better name?) 2. add call to it in ScanQueryMatcher.match() 3. add a default implementation to FilterBase that just returns the passed KV 4. implement that transform in KeyOnlyFilter (i.e. KeyOnlyFilter would do the shallow copy)
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #1719 (See https://hudson.apache.org/hudson/job/HBase-TRUNK/1719/)

        Show
        Hudson added a comment - Integrated in HBase-TRUNK #1719 (See https://hudson.apache.org/hudson/job/HBase-TRUNK/1719/ )
        Hide
        Nicolas Spiegelberg added a comment -

        minor API refactoring to add a deepCopy and a shallowCopy function to KV for clarity. I guess you guys are the ones to fix this problem, but I really think that, beyond clarifying this section, the rest of the change for this jira is minor pri.

        Show
        Nicolas Spiegelberg added a comment - minor API refactoring to add a deepCopy and a shallowCopy function to KV for clarity. I guess you guys are the ones to fix this problem, but I really think that, beyond clarifying this section, the rest of the change for this jira is minor pri.
        Hide
        Nicolas Spiegelberg added a comment -

        dj_ryan: nspiegelberg: extra copy of all the queried data does make a big difference
        [12:04pm] dj_ryan: now im goign to have to spend an hour or two profiling to find if it is the case or not
        [12:04pm] nspiegelberg: an extra shallow copy?
        [12:04pm] nspiegelberg: I agree that an extra deep copy does
        [12:04pm] dj_ryan: are you sure that's a shallow copy?
        [12:04pm] nspiegelberg: yes
        [12:04pm] dj_ryan: hmm
        [12:04pm] dj_ryan: that might have minimal impact
        [12:05pm] dj_ryan: i thought it was a deep copy actually.
        [12:05pm] dj_ryan: but you'd be surprised at what can affect performance.
        [12:05pm] nspiegelberg: that brings up my next point...
        [12:05pm] nspiegelberg: I want to add a softCopy() function to KeyValue
        [12:05pm] dj_ryan: which does what
        [12:05pm] dj_ryan: and why woudl we use it
        [12:06pm] nspiegelberg: because dj_ryan doesn't know when we do a shallow copy. I oftentimes don't either. it's just easier to look at the function and know for sure what it does
        [12:06pm] dj_ryan: hmm
        [12:06pm] dj_ryan: ha
        [12:06pm] dj_ryan: it would make sense to have a kv.copy
        [12:06pm] dj_ryan: rather than what the code is doing
        [12:06pm] dj_ryan: what with the this and that
        [12:07pm] nspiegelberg: but kv.copy should do a deepCopy, correct?
        [12:07pm] nspiegelberg: I don't mind 2 deepCopy() and softCopy() functions either
        [12:08pm] dj_ryan: well yes
        [12:08pm] dj_ryan: i guess deep/soft
        [12:08pm] dj_ryan: or whatever might be a better nomenclature
        [12:08pm] nspiegelberg: explicit is best. I chose deep and soft because I've written too much python and both words are short

        Show
        Nicolas Spiegelberg added a comment - dj_ryan: nspiegelberg: extra copy of all the queried data does make a big difference [12:04pm] dj_ryan: now im goign to have to spend an hour or two profiling to find if it is the case or not [12:04pm] nspiegelberg: an extra shallow copy? [12:04pm] nspiegelberg: I agree that an extra deep copy does [12:04pm] dj_ryan: are you sure that's a shallow copy? [12:04pm] nspiegelberg: yes [12:04pm] dj_ryan: hmm [12:04pm] dj_ryan: that might have minimal impact [12:05pm] dj_ryan: i thought it was a deep copy actually. [12:05pm] dj_ryan: but you'd be surprised at what can affect performance. [12:05pm] nspiegelberg: that brings up my next point... [12:05pm] nspiegelberg: I want to add a softCopy() function to KeyValue [12:05pm] dj_ryan: which does what [12:05pm] dj_ryan: and why woudl we use it [12:06pm] nspiegelberg: because dj_ryan doesn't know when we do a shallow copy. I oftentimes don't either. it's just easier to look at the function and know for sure what it does [12:06pm] dj_ryan: hmm [12:06pm] dj_ryan: ha [12:06pm] dj_ryan: it would make sense to have a kv.copy [12:06pm] dj_ryan: rather than what the code is doing [12:06pm] dj_ryan: what with the this and that [12:07pm] nspiegelberg: but kv.copy should do a deepCopy, correct? [12:07pm] nspiegelberg: I don't mind 2 deepCopy() and softCopy() functions either [12:08pm] dj_ryan: well yes [12:08pm] dj_ryan: i guess deep/soft [12:08pm] dj_ryan: or whatever might be a better nomenclature [12:08pm] nspiegelberg: explicit is best. I chose deep and soft because I've written too much python and both words are short
        Hide
        stack added a comment -

        Yeah, maybe its not a drag on perf. but its 'dirty' and would just like to clean up the rot before it digs in deeper (Regards immutability, unfortunately, thats going to have to go now server inserts an edit sequence number into the KV on receipt, hbase-2856)

        Show
        stack added a comment - Yeah, maybe its not a drag on perf. but its 'dirty' and would just like to clean up the rot before it digs in deeper (Regards immutability, unfortunately, thats going to have to go now server inserts an edit sequence number into the KV on receipt, hbase-2856)
        Hide
        Nicolas Spiegelberg added a comment -

        I agree with Stack that a soft copy should only be made when a keyonlyfilter runs, but I think it's a little paranoid to mark this as critical. As Jonathan said, this is a soft copy so it's probably contributing a miniscule amount to any perf drop. The biggest worry is that immutability is broken, not that a copy is made. At best, this is a minor issue that should be addressed during a filtering code refactor.

        Show
        Nicolas Spiegelberg added a comment - I agree with Stack that a soft copy should only be made when a keyonlyfilter runs, but I think it's a little paranoid to mark this as critical. As Jonathan said, this is a soft copy so it's probably contributing a miniscule amount to any perf drop. The biggest worry is that immutability is broken, not that a copy is made. At best, this is a minor issue that should be addressed during a filtering code refactor.
        Hide
        ryan rawson added a comment -

        I agree with stack, extra copies is also extra work done for no reason
        99-100% of the time. could be why some claim 89 faster...
        https://issues.apache.org/jira/browse/HBASE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979471#action_12979471]
        rare time keyonlyfilter runs.
        doesn't keyonlyfilter make copies rather than mutate – HBASE-3211)?
        ------------------------------------------------------------------------------------------------------------------------------------------
        kv.getLength());
        garbage-making copies.
        done but why is KeyOnlyFilter not making copies rather than mutating
        originals?

        Show
        ryan rawson added a comment - I agree with stack, extra copies is also extra work done for no reason 99-100% of the time. could be why some claim 89 faster... https://issues.apache.org/jira/browse/HBASE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979471#action_12979471 ] rare time keyonlyfilter runs. doesn't keyonlyfilter make copies rather than mutate – HBASE-3211 )? ------------------------------------------------------------------------------------------------------------------------------------------ kv.getLength()); garbage-making copies. done but why is KeyOnlyFilter not making copies rather than mutating originals?
        Hide
        stack added a comment -

        Why have it at all. We should rejigger stuff so that copies are made the rare time keyonlyfilter runs.

        Show
        stack added a comment - Why have it at all. We should rejigger stuff so that copies are made the rare time keyonlyfilter runs.
        Hide
        Jonathan Gray added a comment -

        This is only a shallow copy.

        Show
        Jonathan Gray added a comment - This is only a shallow copy.

          People

          • Assignee:
            Lars Hofhansl
            Reporter:
            stack
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development