  1. HBase
  2. HBASE-9778

Add hint to ExplicitColumnTracker to avoid seeking

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.96.2, 0.98.1, 0.99.0, 0.94.18
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Introduces a new scan attribute to allow a scan operation with explicit columns (Scan.addColumn) to opportunistically look ahead a few KeyValues (columns/versions) before scheduling a seek operation to seek between columns.

      A seek is efficient when it can seek past 5-10 KeyValues (columns) or 512-1024 bytes. With small rows and few versions, looking ahead is typically more efficient.

      API:
      {code}
          Scan s = new Scan(...);
          s.addColumn(...);
          // instructs the RegionServer to attempt two iterations of next before scheduling a seek
          s.setAttribute(Scan.HINT_LOOKAHEAD, Bytes.toBytes(2));
          table.getScanner(s);
      {code}
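
      The look-ahead behaviour can be illustrated without a cluster. What follows is a minimal sketch under stated assumptions, not the RegionServer implementation: the class LookaheadSketch, its Action enum, and the flat list of column qualifiers are hypothetical stand-ins for the scanner state. It tries up to `lookahead` plain next steps toward the wanted column and reports that a seek is needed only if the column was not reached within that budget.

```java
import java.util.Arrays;
import java.util.List;

public class LookaheadSketch {
    /** How the (hypothetical) scanner reached, or failed to reach, the wanted column. */
    public enum Action { REACHED_BY_NEXT, NEEDS_SEEK }

    /**
     * Starting at index pos in a sorted list of column qualifiers, step forward
     * at most `lookahead` times. Returns REACHED_BY_NEXT if a column at or past
     * `target` is found within that budget, otherwise NEEDS_SEEK.
     */
    public static Action advance(List<String> sortedColumns, int pos, String target, int lookahead) {
        for (int step = 1; step <= lookahead; step++) {
            int next = pos + step;
            if (next >= sortedColumns.size()) {
                break; // ran off the end of the row
            }
            if (sortedColumns.get(next).compareTo(target) >= 0) {
                return Action.REACHED_BY_NEXT; // at or past the target column
            }
        }
        return Action.NEEDS_SEEK;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        System.out.println(advance(row, 0, "c2", 2)); // neighbouring column
        System.out.println(advance(row, 0, "c5", 2)); // four columns away
    }
}
```

      With a look-ahead of 2, a neighbouring column is reached by plain iteration, while a column four positions away still falls back to a seek.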

      Description

      The issue of slow seeking in ExplicitColumnTracker was brought up by Vladimir Rodionov on the dev list.

      My idea here is to avoid the seeking if we know that there aren't many versions to skip.
      How do we know? We'll use the column family's VERSIONS setting as a hint. If VERSIONS is set to 1 (or maybe some value < 10) we'll avoid the seek and call SKIP repeatedly.

      HBASE-9769 has some initial numbers for this approach:
      Interestingly it depends on which column(s) is (are) selected.

      Some numbers: 4m rows, 5 cols each, 1 cf, 10 bytes values, VERSIONS=1, everything filtered at the server with a ValueFilter. Everything measured in seconds.

                     Wildcard  Col 1  Col 2  Col 4  Col 5  Col 2+4
      Without patch  6.4       8.5    14.3   14.6   11.1   20.3
      With patch     6.4       8.4    8.9    9.9    6.4    10.0

      Variation here was +- 0.2s.

      So with this patch scanning is 2x faster than without in some cases, and never slower. No special hint needed, beyond declaring VERSIONS correctly.
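
      The "2x in some cases" figure can be checked directly against the timings quoted above. This is only a small arithmetic sketch; the class and method names (SpeedupCheck, speedup) are made up for illustration, and the numbers are the ones from the measurements in this description.

```java
public class SpeedupCheck {
    /** Ratio of scan time without the patch to scan time with it. */
    public static double speedup(double withoutPatch, double withPatch) {
        return withoutPatch / withPatch;
    }

    public static void main(String[] args) {
        String[] labels = {"Wildcard", "Col 1", "Col 2", "Col 4", "Col 5", "Col 2+4"};
        double[] without = {6.4, 8.5, 14.3, 14.6, 11.1, 20.3}; // seconds
        double[] with = {6.4, 8.4, 8.9, 9.9, 6.4, 10.0};       // seconds
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-8s %.2fx%n", labels[i], speedup(without[i], with[i]));
        }
    }
}
```

      The largest ratio is for Col 2+4 (20.3s vs. 10.0s), and the wildcard scan is unchanged, which matches the "never slower" observation within the stated +- 0.2s variation.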

      1. 9778-0.94.txt
        0.9 kB
        Lars Hofhansl
      2. 9778-0.94-v2.txt
        6 kB
        Lars Hofhansl
      3. 9778-0.94-v3.txt
        11 kB
        Lars Hofhansl
      4. 9778-0.94-v4.txt
        9 kB
        Lars Hofhansl
      5. 9778-0.94-v5.txt
        6 kB
        Lars Hofhansl
      6. 9778-0.94-v6.txt
        12 kB
        Lars Hofhansl
      7. 9778-0.94-v7.txt
        13 kB
        Lars Hofhansl
      8. 9778-0.94-v8.txt
        14 kB
        Lars Hofhansl
      9. 9778-0.94-v9.txt
        14 kB
        Lars Hofhansl
      10. 9778-trunk.txt
        0.9 kB
        Lars Hofhansl
      11. 9778-trunk-v2.txt
        6 kB
        Lars Hofhansl
      12. 9778-trunk-v3.txt
        11 kB
        Lars Hofhansl
      13. 9778-trunk-v6.txt
        12 kB
        Lars Hofhansl
      14. 9778-trunk-v7.txt
        14 kB
        Lars Hofhansl
      15. 9778-trunk-v8.txt
        14 kB
        Lars Hofhansl
      16. 9778-trunk-v9.txt
        14 kB
        Lars Hofhansl

        Issue Links

          Activity

          lhofhansl Lars Hofhansl added a comment -

          Note that this avoids the NEXT_COL but not the NEXT_ROW seeks. Here we have nothing to go by to know whether a row is going to be large (many columns) or not, and hence we'd need to hint the scan explicitly somehow and/or use a filter as Vladimir suggests in HBASE-9769.

          lhofhansl Lars Hofhansl added a comment -

          0.94 patch.

          lhofhansl Lars Hofhansl added a comment -

          And a trunk version for HadoopQA

          stack stack added a comment -

          Nice numbers for such a little code change.

          I don't know this code well.

          Looking at it, doesn't SKIP mean 'do not include this KV in the result'? So if maxVersion == 1, it means we already have a KV and we are saying 'do not include this result' for all subsequently seen columns rather than calling a seek? What if there are many versions of this row? Now that we are not seeking, we could spend a bunch of time having to skip all the values?

          Am I grokking this right?

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12608659/9778-trunk.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestExplicitColumnTracker
          org.apache.hadoop.hbase.regionserver.TestQueryMatcher

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7560//console

          This message is automatically generated.

          jmspaggi Jean-Marc Spaggiari added a comment -

          Can the TestExplicitColumnTracker failure be related to this modification? I would guess so. But most probably it's the test we need to modify, and not ExplicitColumnTracker?

          Passed with 0.94.12 without the patch and failed with the patch.

          lhofhansl Lars Hofhansl added a comment -

          stack, skip means that it'll skip this particular KV (i.e. this version of a column).
          Jean-Marc Spaggiari, yeah, the test failure is probably related. Will take a look today.

          I also realized yesterday that the patch is not quite correct anyway. What the tracker needs to know is the maxVersions of the ColumnFamily, not the maxVersions of this particular scan (which is the min of the scan's and the CF's setting).
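
          To make the distinction concrete, here is a minimal sketch. It is an illustration under assumptions, not the patch itself: the class VersionHint and both method names are hypothetical. The point is that the number of versions returned to the client is bounded by both settings, while the number of versions that may have to be skipped depends only on what the store keeps.

```java
public class VersionHint {
    /** Versions returned to the client: the smaller of the scan's and the CF's limit. */
    public static int effectiveScanVersions(int scanMaxVersions, int cfMaxVersions) {
        return Math.min(scanMaxVersions, cfMaxVersions);
    }

    /**
     * Skip-vs-seek hint. It must look at how many versions may be stored,
     * so it takes the CF setting only, never the scan's own limit.
     */
    public static boolean preferSkip(int cfMaxVersions) {
        return cfMaxVersions == 1;
    }

    public static void main(String[] args) {
        int scanMaxVersions = 1;  // the client asks for the newest version only
        int cfMaxVersions = 1000; // but the CF may hold many versions per column
        System.out.println(effectiveScanVersions(scanMaxVersions, cfMaxVersions));
        System.out.println(preferSkip(cfMaxVersions));
    }
}
```

          Basing the hint on the scan's effective value would wrongly choose SKIP here, even though up to 1000 stored versions might have to be stepped over one by one.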

          lhofhansl Lars Hofhansl added a comment -

          stack, didn't read your comment right the first time.

          • SKIP skips only a single KV (i.e. a single version of a column, not a row), so we'll avoid the reseek at the cost of having to SKIP the next version again if there is one.
          • If there were many versions we'd do extra work in having to SKIP each of them individually. For that reason I am (or will be) using the CF's MAX_VERSIONS as a hint. If that is set to 1 (or we could take a small number), we can assume that we won't see many versions here.
          lhofhansl Lars Hofhansl added a comment -

          New 0.94 version.

          • fixes TestExplicitColumnTracker
          • adds a simple test case, to make sure it works correctly if there are multiple versions in the store
          • Uses the store's (CF's) MAX_VERSIONS setting as the hint
          lhofhansl Lars Hofhansl added a comment -

          Same for trunk.

          Looking around more. Seems there are more seeks that can be avoided if we know there won't be many versions around.

          jesse_yates Jesse Yates added a comment -

          Feels like we should be smarter here:

          + // is interested in, so skip or seek to that column of interest.
          + return storeMaxVersions == 1 ? ScanQueryMatcher.MatchCode.SKIP
          + : ScanQueryMatcher.MatchCode.SEEK_NEXT_COL;

          Why not keep a count of the versions seen and only skip past if approaching (at?) the number of max versions? Shouldn't need any synchronization on that member variable, so access should be fast

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12608760/9778-trunk-v2.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestQueryMatcher

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7569//console

          This message is automatically generated.

          lhofhansl Lars Hofhansl added a comment -

          Jesse Yates, unfortunately we're skipping the column before we know the version count. What we could do is count versions of columns that were not selected, and use that as an estimate for the expected number of versions of the selected columns. This seems easier and more reliable.
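
          A rough sketch of that estimate, purely as an assumption about how it could look (the class VersionEstimator and its methods are hypothetical and not part of any attached patch): keep a running average of the versions seen per unselected column and fall back to a default until something has been observed.

```java
public class VersionEstimator {
    private long columnsSeen = 0;
    private long versionsSeen = 0;

    /** Record how many versions were seen for one column the scan did not select. */
    public void observeUnselectedColumn(int versions) {
        columnsSeen++;
        versionsSeen += versions;
    }

    /** Average versions per unselected column so far, or `fallback` if none were observed. */
    public double estimate(double fallback) {
        return columnsSeen == 0 ? fallback : (double) versionsSeen / columnsSeen;
    }

    public static void main(String[] args) {
        VersionEstimator estimator = new VersionEstimator();
        estimator.observeUnselectedColumn(1);
        estimator.observeUnselectedColumn(3);
        System.out.println(estimator.estimate(1.0));
    }
}
```

          The estimate is only as good as the assumption that selected and unselected columns have similar version counts.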

          lhofhansl Lars Hofhansl added a comment -

          Arggh... Didn't see TestQueryMatcher before. Fixed that test as well, and added a simple test with 2 versions (to verify the existing behavior).

          lhofhansl Lars Hofhansl added a comment -

          And trunk

          stack stack added a comment -

          Lars Hofhansl You think the VERSIONS config is an indicator of the absolute number of versions of a particular column? I can think of pathological cases where hbase is being used to keep, say, a queue where the client is only interested in the most recent cell but many could be writing to the one coordinate. In this case, we'd want to seek to the next column rather than skip, skip, skip, right? Could we do something like the Jesse heuristic where we keep a count and skip the first few but then switch to a seek if it looks like we are on a column of many versions?

          lhofhansl Lars Hofhansl added a comment -

          Can't check the column specifically, only estimate from other columns. Or we could seek once we have SKIP'ed N times. No longer simple or easy to understand, though.

          There could indeed be a pathological case with many versions in the memstore (and only in the memstore, since excess versions are pruned upon flush). Could make it configurable.

          jesse_yates Jesse Yates added a comment -

          I think we would need to have a stats mechanism (yes, yes, I've been too busy to get that jira in), so we can use decent heuristics to decide. The other thing Lars mentioned, offline, was using the counts of the other columns as a heuristic for the number of versions.

          My other thought was updating a counter per-column on INCLUDE (which doesn't get to the checkVersions call) so we can track stats just within the tracker (basic stats keeping that could be generalized).

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12608804/9778-trunk-v3.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12608804/9778-trunk-v3.txt against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 site . The patch appears to cause mvn site goal to fail. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs 
warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7571//console This message is automatically generated.
          lhofhansl Lars Hofhansl added a comment -

          So this looks good.
          As for the many versions in the memstore, it would be limited to the size of the memstore and only happen when the same column is updated very often before the memstore is flushed.
          I'll try to quantify that. (The perf improvement and simplicity would be a shame to give up.)

          lhofhansl Lars Hofhansl added a comment -

          OK... Set up a test with many versions that actually fits into a 256 MB memstore:
          200 rows, 5 cols, 1000 versions, 10-byte values.

          Scan time of one column with patch: 0.22s, without: 0.08s. So in this case it is 2.7x slower.
          Lemme think about how I can get both. What I could do is issue SKIP... SKIP until we reach the CF's MAX_VERSIONS, then start issuing SEEK_NEXT_COL. So in the worst case we'd issue MAX_VERSIONS SKIPs too many (and if MAX_VERSIONS=1, we do one superfluous SKIP). Lemme prototype this.

          lhofhansl Lars Hofhansl added a comment -

          This gets pretty ugly and inscrutable quickly (if we count columns, we need to reset the count at the right points).
          So I am less happy with this idea now.

          lhofhansl Lars Hofhansl added a comment -

          Unscheduling for now.

          lhofhansl Lars Hofhansl added a comment -

          I'd like to do some brainstorming about how we can make reseek faster if the "reseek distance" is expected to be small. Currently the absolute worst case is when reseek just seeks to the next KV, where next is orders of magnitude faster.

          Unfortunately we hit this worst case all the time if there is only one version of each KV.

          zjushch chunhui shen added a comment -

          What if there are a very large number of columns?
          E.g. 10,000 cols per row, where the explicit columns are col1 and col9999.

          Would it be worse with the patch?

          zjushch chunhui shen added a comment -

          Judging by the comments above, I think you have already got this.

          lhofhansl Lars Hofhansl added a comment -

          No, it would only be worse if there are many versions. What this avoids is the SEEK_NEXT_COL, which seeks to the next column anticipating an unknown number of versions. If there are only a few versions per column the reseek is more expensive.
          However, there is no sure way to know the exact number of versions. Using MAX_VERSIONS as a hint works in all cases but the outliers. For example, when a single column is updated frequently, the memstore will be filled with many versions of the same column and we should seek to the next column.

          Your example shows another current shortcoming, though. If we only wanted to select 1 or 2 columns of the 10000, the tracker would go over the 10000 columns and issue 10000 seeks. It should rather seek directly to the next column it cares about.

          lhofhansl Lars Hofhansl added a comment - - edited

          Forget my previous comment, I was wrong. ExplicitColumnTracker does seek to the next column it is interested in, so even with many columns this patch would make it worse.

          lhofhansl Lars Hofhansl added a comment -

          I will use Vladimir Rodionov's idea of a scan hint instead. MAX_VERSIONS is not reliable in all cases.

          vrodionov Vladimir Rodionov added a comment -

          Lars, you can try two hints: NARROW_COLUMNS and NARROW_ROWS. The latter supersedes the former. NARROW_COLUMNS optimizes SEEK_NEXT_COL; NARROW_ROWS optimizes both SEEK_NEXT_COL and SEEK_NEXT_ROW.

          lhofhansl Lars Hofhansl added a comment -

          New sample patch. Using NARROW_ROW_HINT to optimize seeking in both ExplicitColumnTracker and ScanWildcardColumnTracker.

          This makes ExplicitColumnTracker go around one more time for a version (like ScanWildcardColumnTracker), but then allows it to SKIP the following versions, if any.

          lhofhansl Lars Hofhansl added a comment -

          I like the NARROW_COLUMNS (i.e. FEW_VERSIONS) and NARROW_ROWS hinting.

          yuzhihong@gmail.com Ted Yu added a comment -
          -    if (count >= maxVersions || (count >= minVersions && isExpired(timestamp))) {
          +    if (count > maxVersions || (count > minVersions && isExpired(timestamp))) {
          

          Mind adding a comment for the above change?

          lhofhansl Lars Hofhansl added a comment -

          I can explain here. A comment in the code would be misplaced and make it more confusing. ScanWildcardColumnTracker does the same.

          The reason is that we can continue to issue SKIPs once we're past the max/min versions. I.e. as long as we're inside the versions range we can issue INCLUDEs, once out we issue SKIPs or SEEK_NEXT_COLs.
          With >= we have to issue an INCLUDE_AND_SEEK_NEXT_COL and never come back to this column.
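
The difference between the two comparisons can be illustrated with a small sketch. This is not the ExplicitColumnTracker code: the class, the method names and the reduced MatchCode enum are hypothetical, `count` is assumed to be the number of versions of the current column seen so far (including the current KeyValue), and the minVersions/expiry half of the condition is left out.

```java
// Hypothetical sketch of the version check; not the actual HBase implementation.
final class VersionCheckSketch {
    // Reduced stand-in for the matcher's return codes.
    enum MatchCode { INCLUDE, INCLUDE_AND_SEEK_NEXT_COL, SKIP }

    // With ">=": the last wanted version is included and a seek to the next
    // column is scheduled right away, so the tracker never revisits this column.
    static MatchCode checkWithGreaterOrEqual(int count, int maxVersions) {
        return count >= maxVersions ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL : MatchCode.INCLUDE;
    }

    // With ">": every version inside the range is a plain INCLUDE; only a
    // version beyond the range is rejected, here with a cheap SKIP
    // (a tracker could equally answer SEEK_NEXT_COL at this point).
    static MatchCode checkWithGreater(int count, int maxVersions) {
        return count > maxVersions ? MatchCode.SKIP : MatchCode.INCLUDE;
    }

    public static void main(String[] args) {
        int maxVersions = 1;
        // First (and only wanted) version of the column.
        System.out.println(">= on version 1: " + checkWithGreaterOrEqual(1, maxVersions));
        System.out.println(">  on version 1: " + checkWithGreater(1, maxVersions));
        // A second version is only ever seen by the ">" variant.
        System.out.println(">  on version 2: " + checkWithGreater(2, maxVersions));
    }
}
```

With maxVersions = 1 the ">=" variant pays for a seek on every selected column, while the ">" variant pays for one extra next() only when a second version actually exists.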

          lhofhansl Lars Hofhansl added a comment - - edited

          Note that this issue was brought up in HBASE-4433 already (the patch that introduced INCLUDE_AND_SEEK...).

          It seems that for most scenarios we want to undo at least some part of that patch.

          lhofhansl Lars Hofhansl added a comment -

          This needs a bigger discussion. The optimizations put in with HBASE-4433 are counterproductive for many use cases.
          The patch there avoids an additional call to next, but does so at the expense of an extra seek (if there aren't many versions). That pays off in the scenario described in HBASE-4433 (large KVs, where an extra next will likely lead to loading another block), but with small KVs and few versions, the extra seek is way more expensive than the risk of loading another block.
          (In fact, that is exactly the part of the change that Ted requested an extra comment on.)

          And, BTW, ScanWildcardColumnTracker does not have the optimization from HBASE-4433, so this is quite a mess.

          lhofhansl Lars Hofhansl added a comment -

          Some more numbers with other hardcoded improvements indicate that some Phoenix queries can run over 3x as fast (8.8s instead of 27s).
          The challenge now is to keep the improvements from HBASE-4433 while also improving other scenarios. A new config option is probably unavoidable.

          vrodionov Vladimir Rodionov added a comment -

          Good performance increase, indeed. Hinting is not such a bad idea. Advanced users (Phoenix) will definitely find a way to use this optimization.

          lhofhansl Lars Hofhansl added a comment -

          The Phoenix numbers are much better after HBASE-9915. Will redo the tests here, but I would expect (with all the Phoenix defaults) to get a 1.5x improvement from the changes here (that is with FAST_DIFF block encoding).

          apurtell Andrew Purtell added a comment -

          What should we do with this with regard to 0.98?

          lhofhansl Lars Hofhansl added a comment -

          Lemme just unschedule for now.

          lhofhansl Lars Hofhansl added a comment -

          Some further observations.

          When we reseek for a column, we pass a KV that would be located just before the first KV for that column. In the various scanners, we then seek forward in the file until we're past the KV passed in, then we go back one KV, discarding the current KV. So when we seek forward through a file we'll scan every KV twice.

          I'm planning to test passing a special KV so that the scanners can tell when we're on the KV we're looking for. For example, when looking for a column we can scan forward until we see the first KV for that row, fam, col, and then we can stop. No need to scan one more, remember the previous, and then go back. For cases with few versions/columns that should shave off a large portion of the time. Will report back.

          lhofhansl Lars Hofhansl added a comment -

          Jesse Yates,

          Why not keep a count of the versions seen and only skip past if approaching (at?) the number of max versions? Shouldn't need any synchronization on that member variable, so access should be fast

          I found a cheap way of doing this after all. We can keep a counter and reset it every time this.index is changed (i.e. we move on to a new column). That way we can limit the number of skips per selected column. We can even set the number of opportunistic skips per scan operation. Patch coming soon.
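
The counter idea described above can be sketched roughly as follows. This is a minimal illustration, not the committed patch: the class name, the method names and the reduced MatchCode enum are hypothetical, and only the bookkeeping (a skip budget that is reset whenever the column index changes) is modelled.

```java
// Hypothetical sketch of a per-column look-ahead counter; not the committed patch.
final class LookAheadCounterSketch {
    // Reduced stand-in for the matcher's return codes.
    enum MatchCode { SKIP, SEEK_NEXT_COL }

    private final int lookAhead; // opportunistic skips allowed per selected column
    private int skipCount = 0;   // skips already spent on the current column
    private int index = 0;       // position in the list of selected columns

    LookAheadCounterSketch(int lookAhead) {
        this.lookAhead = lookAhead;
    }

    // Called for a KeyValue that sorts before the column we are waiting for.
    MatchCode onUnwantedKeyValue() {
        if (skipCount < lookAhead) {
            skipCount++;
            return MatchCode.SKIP;      // cheap: the scanner just calls next()
        }
        return MatchCode.SEEK_NEXT_COL; // budget used up: fall back to a seek
    }

    // Called whenever the tracker moves on to the next selected column.
    void advanceColumn() {
        index++;
        skipCount = 0; // the single reset point for the counter
    }

    int currentIndex() {
        return index;
    }

    public static void main(String[] args) {
        LookAheadCounterSketch tracker = new LookAheadCounterSketch(2);
        System.out.println(tracker.onUnwantedKeyValue()); // within budget
        System.out.println(tracker.onUnwantedKeyValue()); // within budget
        System.out.println(tracker.onUnwantedKeyValue()); // budget exhausted
        tracker.advanceColumn();
        System.out.println(tracker.onUnwantedKeyValue()); // fresh budget
    }
}
```

Because the budget is a plain int compared on each call, a value of 0 always seeks and a value of Integer.MAX_VALUE effectively never does.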

          lhofhansl Lars Hofhansl added a comment -

          Please have a look.
          This is nice in that it lets a user tune the risk of seeking vs. the risk of performing too many next() calls followed by a seek.

          Somebody please come up with a better name than "eager next".

          lhofhansl Lars Hofhansl added a comment -

          Anoop Sam John, wanna have a look? Related PHOENIX-29.
          Chao Shi, if possible can you run the perf test you did on HBASE-9811 with this patch?
          Vladimir Rodionov, stack, Andrew Purtell, FYI.

          lhofhansl Lars Hofhansl added a comment -

          ramkrishna.s.vasudevan, since we are discussing HBASE-10531, this is (somewhat) related.

          ram_krish ramkrishna.s.vasudevan added a comment -

          Went through the patch; it looks good. If we have 100 cols and the scan has added only col98, I think there is no use in this eager next. The same is the case when I have 2 cols, where col1 has 99 versions and col2 has one version, and only col2 has been added via Scan.addColumn, right?
          This would mostly be useful when the number of versions is going to be 1 or very small for the column added to the scan. So should we add a comment before this eager_next saying that it could be used if the column before the one added via Scan.addColumn has very few versions?
          I don't have a better name for 'eager next'.

          lhofhansl Lars Hofhansl added a comment -

          Thanks Ram. Yes on all fronts.

          When we have 100 cols and select col98 only, a seek is more efficient, so we'd do N next()'s for nothing before we seek anyway.

          The main point of this option would be to get a good improvement when the next column/version of interest can be reached with a few next()'s while limiting the downside. We can always seek (0), or never seek (MAXINT), and can tune everything in between. I assume most folks would set this somewhere between 5 and 10... The upside of saving a seek is large; the downside of a few extra next()'s is not so bad.

          Many seeks that just skip to the next KV are bad, 1000 next()'s are bad, but 5 next()'s + 1 reseek is not much worse than a reseek.
          Clients with more information about their data (such as Phoenix, or certain time series databases, etc.) can use that to set a good value here.

          Of course the absolute worst case would be when this is set to 5 and all columns are 6 KVs away from each other (due to versions of intermediary columns). We'd do 5 next()'s and then a seek.

          Lemme see how I can phrase that better in the Javadoc.

          (I had thought about only doing this at the StoreFile level and issuing next()'s only when the key we're looking for falls into the same block, but by the time we reach the StoreFileScanner we have done most of the work for a seek anyway, so it turned out to be not very helpful.)
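
The trade-off described in this comment can be put into a rough cost model. This is a simplified sketch under stated assumptions, not a measurement: the class and method names are hypothetical, `gap` is assumed to be the number of unwanted KeyValues in front of the next wanted one, and the cost of a seek is expressed as a multiple of a next() call through the illustrative parameter `seekCostInNexts`.

```java
// Hypothetical cost model for the look-ahead setting; not derived from HBase code.
final class LookAheadCostSketch {
    // next() calls spent on unwanted KeyValues before giving up or arriving.
    static long nextCalls(long gap, long lookAhead) {
        return Math.min(gap, lookAhead);
    }

    // A seek is only needed when the gap exceeds the look-ahead budget.
    static long seeks(long gap, long lookAhead) {
        return gap > lookAhead ? 1 : 0;
    }

    // Total cost in units of one next() call; seekCostInNexts is an assumption.
    static long cost(long gap, long lookAhead, long seekCostInNexts) {
        return nextCalls(gap, lookAhead) + seeks(gap, lookAhead) * seekCostInNexts;
    }

    public static void main(String[] args) {
        long seekCostInNexts = 10; // illustrative value only
        System.out.println("gap=1, lookAhead=0: " + cost(1, 0, seekCostInNexts));
        System.out.println("gap=1, lookAhead=5: " + cost(1, 5, seekCostInNexts));
        System.out.println("gap=6, lookAhead=5: " + cost(6, 5, seekCostInNexts));
        System.out.println("gap=1000, lookAhead=MAXINT: "
            + cost(1000, Integer.MAX_VALUE, seekCostInNexts));
    }
}
```

In this model the worst case mentioned above (look-ahead of 5, wanted columns 6 KVs apart) costs five next() calls plus the seek, which stays close to the cost of the seek alone, whereas a gap of 1 with look-ahead 0 pays the full seek for a single KV.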

          apurtell Andrew Purtell added a comment -

          A change that doesn't introduce a perf regression for the default tuning would make sense for 0.98, I agree. Advanced users like Vladimir could retune with the insight they have into their workload.

          mcorgan Matt Corgan added a comment -

          It would be cool eventually to record block-level statistics inside each block (maxVersions in this case) that can be used by readers. PrefixTree may already have that, but we'd want it for all blocks.

          lhofhansl Lars Hofhansl added a comment -

          Yeah. Stats would be better than this guessing. You'd need to expose these stats at the StoreScanner level, though. Once at the StoreFileScanner, much of the cost of a seek has already been incurred.

          lhofhansl Lars Hofhansl added a comment -

          Added some tests to 0.94 patch. If approach is cool, I'll make a trunk patch.

          lhofhansl Lars Hofhansl added a comment -

          Same for trunk.
          Used the same attribute-based approach (should probably protobuf it, if we use this as official API).
          I can also see not changing the Scan API at all, and just allow setting that scan attribute - maybe that'd be best anyway.

          lhofhansl Lars Hofhansl added a comment -

          I think I do not want the Scan API changes; I will just do it with an attribute. Could probably add a default value via configuration.

          Any input on patch/idea? Worth doing?

          stack stack added a comment -

          +1 if nice release note and a little section in refguide – else this stuff just stays hidden and forgotten about. I think as attribute is good for now.

          anoop.hbase Anoop Sam John added a comment -

          +1 for attribute alone.

          ram_krish ramkrishna.s.vasudevan added a comment -

          +1 on patch. Attribute seems good.

          lhofhansl Lars Hofhansl added a comment -

          Trunk patch with doc changes.

          lhofhansl Lars Hofhansl added a comment -

          Same for 0.94.
          Please have a look at the book changes. If no further comments I'll commit tomorrow.

          lhofhansl Lars Hofhansl added a comment -

          I think I have a better name: HINT_LOOK_AHEAD. That is what this does: the scanner will look ahead a few KeyValues before it does a seek.

          lhofhansl Lars Hofhansl added a comment -

          Patches for 0.94 and trunk with the LOOK_AHEAD attribute and fixed documentation.

          lhofhansl Lars Hofhansl added a comment -

          Documentation/release notes look good?

          ram_krish ramkrishna.s.vasudevan added a comment -

          Looks good, the doc part specifically. +1

          lhofhansl Lars Hofhansl added a comment -

          What I am going to commit. (mostly spelling/naming fixes)

          lhofhansl Lars Hofhansl added a comment -

          Committed to all branches. Thanks for taking a look.

          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94-on-Hadoop-2 #46 (See https://builds.apache.org/job/HBase-0.94-on-Hadoop-2/46/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94 #1316 (See https://builds.apache.org/job/HBase-0.94/1316/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94-JDK7 #79 (See https://builds.apache.org/job/HBase-0.94-JDK7/79/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          lhofhansl Lars Hofhansl added a comment -

          Whoa, I broke the build. I built HBase, but not the documentation. Sorry. Addendum coming soon.

          lhofhansl Lars Hofhansl added a comment -

          0.94 only, an extra </section> sneaked in there somehow. Committed addendum.

          hudson Hudson added a comment -

          SUCCESS: Integrated in HBase-0.94-security #437 (See https://builds.apache.org/job/HBase-0.94-security/437/)
          HBASE-9778 Addendum; fix documentation markup (larsh: rev 1576491)

          • /hbase/branches/0.94/src/docbkx/performance.xml

          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94 #1317 (See https://builds.apache.org/job/HBase-0.94/1317/)
          HBASE-9778 Addendum; fix documentation markup (larsh: rev 1576491)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94-JDK7 #80 (See https://builds.apache.org/job/HBase-0.94-JDK7/80/)
          HBASE-9778 Addendum; fix documentation markup (larsh: rev 1576491)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #206 (See https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/206/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576461)

          • /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/branches/0.98/src/main/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.98 #220 (See https://builds.apache.org/job/HBase-0.98/220/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576461)

          • /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/branches/0.98/src/main/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in hbase-0.96 #339 (See https://builds.apache.org/job/hbase-0.96/339/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576462)

          • /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/branches/0.96/src/main/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-0.94-on-Hadoop-2 #47 (See https://builds.apache.org/job/HBase-0.94-on-Hadoop-2/47/)
          HBASE-9778 Addendum; fix documentation markup (larsh: rev 1576491)

          • /hbase/branches/0.94/src/docbkx/performance.xml

          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)

          • /hbase/branches/0.94/src/docbkx/performance.xml
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in hbase-0.96-hadoop2 #235 (See https://builds.apache.org/job/hbase-0.96-hadoop2/235/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576462)

          • /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/branches/0.96/src/main/docbkx/performance.xml
          hudson Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #4999 (See https://builds.apache.org/job/HBase-TRUNK/4999/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576457)

          • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/trunk/src/main/docbkx/performance.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #113 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/113/)
          HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576457)

          • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/trunk/src/main/docbkx/performance.xml

            People

            • Assignee:
              lhofhansl Lars Hofhansl
              Reporter:
              lhofhansl Lars Hofhansl
            • Votes:
              0
              Watchers:
              13


                Development