Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.94.0
    • Component/s: regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions.
      However, if a client deletes a row all version older than the delete tomb stone will be remove at the next major compaction (and even at memstore flush - see HBASE-4241).
      There should be a way to retain those version to guard against software error.

      I see two options here:
      1. Add a new flag HColumnDescriptor. Something like "RETAIN_DELETED".
      2. Folds this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker.

      #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint)

      Comments? Any other options?

      1. 4536-v15.txt
        90 kB
        Lars Hofhansl
      2. 4536-v16.txt
        96 kB
        Lars Hofhansl

        Issue Links

          Activity

          Hide
          Andrew Purtell added a comment -

          2. Folds this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker

          This would produce fewer opportunities for user surprise.

          Show
          Andrew Purtell added a comment - 2. Folds this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker This would produce fewer opportunities for user surprise.
          Hide
          Jonathan Gray added a comment -

          We were also discussion today that there are situations (especially in multi-master situations) where you want to retain the delete markers for some period of time as well.

          I would think this would require both a family-level setting (a new one or the existing one) and also a read-time option, correct? As of now, deletes are never returned to the client. You'd have to return them in this case otherwise the user would have no idea what is actually there? I'm not sure it's fair to ask a user to understand how our delete tombstones work

          Show
          Jonathan Gray added a comment - We were also discussion today that there are situations (especially in multi-master situations) where you want to retain the delete markers for some period of time as well. I would think this would require both a family-level setting (a new one or the existing one) and also a read-time option, correct? As of now, deletes are never returned to the client. You'd have to return them in this case otherwise the user would have no idea what is actually there? I'm not sure it's fair to ask a user to understand how our delete tombstones work
          Hide
          Lars Hofhansl added a comment -

          @Andrew: Agree. There's no point really in setting up #minimumVersion if they get blown away by a delete marker... Could even argue it's a bug in the parent.

          @Jon: Replication already replicates deletes. And (luckily ) it has no optimization to skip versions once it found a delete marker. So I think that should already work.
          Or are you referring to some other method of multi-master replication. In that case we'd have invent a new way to let a client retrieve delete markers. I think that'd be a different jira.

          Show
          Lars Hofhansl added a comment - @Andrew: Agree. There's no point really in setting up #minimumVersion if they get blown away by a delete marker... Could even argue it's a bug in the parent. @Jon: Replication already replicates deletes. And (luckily ) it has no optimization to skip versions once it found a delete marker. So I think that should already work. Or are you referring to some other method of multi-master replication. In that case we'd have invent a new way to let a client retrieve delete markers. I think that'd be a different jira.
          Hide
          Lars Hofhansl added a comment -

          Started to look at the code.

          Scanning now has three different scenarios:

          1. Normal scan. Under all circumstanced filter deleted rows (unless it's a past scan, of course). Do not return delete markers.
          2. Minor compaction. If min versions is not set, follow current behavior (filter deleted rows, keep delete markers). If min versions is set, keep deleted rows, treat delete markers as normal rows (but do not count them as versions). I.e. delete markers are subject to maxversions, expiration, etc.
          3. Major compaction. If min versions is not set, follow current behavior (filter deleted rows, remove delete markers, like #1). If min versions is set, keep deleted rows, treat delete markers as normal rows (again, do not count them as versions).

          In all three cases rows past max versions or expired rows past min versions will be filtered out.

          Jon could reuse scan type #2 or #3 for his needs, I think.

          Show
          Lars Hofhansl added a comment - Started to look at the code. Scanning now has three different scenarios: Normal scan. Under all circumstanced filter deleted rows (unless it's a past scan, of course). Do not return delete markers. Minor compaction. If min versions is not set, follow current behavior (filter deleted rows, keep delete markers). If min versions is set, keep deleted rows, treat delete markers as normal rows (but do not count them as versions). I.e. delete markers are subject to maxversions, expiration, etc. Major compaction. If min versions is not set, follow current behavior (filter deleted rows, remove delete markers, like #1). If min versions is set, keep deleted rows, treat delete markers as normal rows (again, do not count them as versions). In all three cases rows past max versions or expired rows past min versions will be filtered out. Jon could reuse scan type #2 or #3 for his needs, I think.
          Hide
          Lars Hofhansl added a comment -

          There's a fourth scan type:
          4. Memstore flush (see HBASE-4241). This behaves exactly like #2.

          There is a pathetic scenario where CF has minversions set and a client only write delete markers. As they are not counted as versions, they would never be removed (not even in a major compaction).
          I suppose to guard against that case we could remove all trailing delete markers (that are not followed by any surviving kvs).

          Show
          Lars Hofhansl added a comment - There's a fourth scan type: 4. Memstore flush (see HBASE-4241 ). This behaves exactly like #2. There is a pathetic scenario where CF has minversions set and a client only write delete markers. As they are not counted as versions, they would never be removed (not even in a major compaction). I suppose to guard against that case we could remove all trailing delete markers (that are not followed by any surviving kvs).
          Hide
          Lars Hofhansl added a comment -

          Turns out that currently it is not at all possible to query (get or scan) any rows "behind" a delete marker, even setting a timerange on a Get that is before the delete marker will not return any rows.

          Show
          Lars Hofhansl added a comment - Turns out that currently it is not at all possible to query (get or scan) any rows "behind" a delete marker, even setting a timerange on a Get that is before the delete marker will not return any rows.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

          This changes the following:
          1. MinVersions now also means: Keep this number of version even when the row was deleted.
          2. Allow gets/scans to retrieve rows hidden by a delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.

          The change is pretty small, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- Review request for hbase. Summary ------- In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T). This changes the following: 1. MinVersions now also means: Keep this number of version even when the row was deleted. 2. Allow gets/scans to retrieve rows hidden by a delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. The change is pretty small, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891 Diff: https://reviews.apache.org/r/2178/diff Testing ------- TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2323
          -----------------------------------------------------------

          Nice feature.
          Should a new config parameter be introduced ?

          • Ted

          On 2011-10-04 21:27:20, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-04 21:27:20)

          Review request for hbase.

          Summary

          -------

          In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

          This changes the following:

          1. MinVersions now also means: Keep this number of version even when the row was deleted.

          2. Allow gets/scans to retrieve rows hidden by a delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          The change is pretty small, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2323 ----------------------------------------------------------- Nice feature. Should a new config parameter be introduced ? Ted On 2011-10-04 21:27:20, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-04 21:27:20) Review request for hbase. Summary ------- In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T). This changes the following: 1. MinVersions now also means: Keep this number of version even when the row was deleted. 2. Allow gets/scans to retrieve rows hidden by a delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. The change is pretty small, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891 Diff: https://reviews.apache.org/r/2178/diff Testing ------- TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback. Thanks, Lars
          Hide
          Lars Hofhansl added a comment -

          Re: "New config param". Wasn't sure in the beginning, but now I think this should have really been part of the HBASE-4071.
          With config param you mean a new flag on HColumnDescriptor? That would provide for more flexibility, the question is whether it is needed, or would just confuse any users (the minversion feature is confusing as is ).
          I'm not opposed to the separate option for this, though.

          Show
          Lars Hofhansl added a comment - Re: "New config param". Wasn't sure in the beginning, but now I think this should have really been part of the HBASE-4071 . With config param you mean a new flag on HColumnDescriptor? That would provide for more flexibility, the question is whether it is needed, or would just confuse any users (the minversion feature is confusing as is ). I'm not opposed to the separate option for this, though.
          Hide
          Ted Yu added a comment -

          Or maybe this feature should latch onto minversion being > 0 so that we don't introduce additional parameter ?

          Show
          Ted Yu added a comment - Or maybe this feature should latch onto minversion being > 0 so that we don't introduce additional parameter ?
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-05 01:06:33.205905)

          Review request for hbase.

          Changes
          -------

          Version that actually compiles.

          Summary
          -------

          In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

          This changes the following:
          1. MinVersions now also means: Keep this number of version even when the row was deleted.
          2. Allow gets/scans to retrieve rows hidden by a delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.

          The change is pretty small, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-05 01:06:33.205905) Review request for hbase. Changes ------- Version that actually compiles. Summary ------- In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T). This changes the following: 1. MinVersions now also means: Keep this number of version even when the row was deleted. 2. Allow gets/scans to retrieve rows hidden by a delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. The change is pretty small, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891 Diff: https://reviews.apache.org/r/2178/diff Testing ------- TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback. Thanks, Lars
          Hide
          Lars Hofhansl added a comment -

          @Ted, yep... That's what the parent is

          Show
          Lars Hofhansl added a comment - @Ted, yep... That's what the parent is
          Hide
          Lars Hofhansl added a comment -

          This causes a slight incompatible change.
          Queries in the past before a delete marker will return rows before he marker (this is as it should be IMHO, but it is a change from the current behavior).

          Show
          Lars Hofhansl added a comment - This causes a slight incompatible change. Queries in the past before a delete marker will return rows before he marker (this is as it should be IMHO, but it is a change from the current behavior).
          Hide
          Jonathan Gray added a comment -

          This changes default behavior now? I disagree that expected behavior is to ever uncover previously deleted data. I'm okay with this as an option.

          Show
          Jonathan Gray added a comment - This changes default behavior now? I disagree that expected behavior is to ever uncover previously deleted data. I'm okay with this as an option.
          Hide
          Lars Hofhansl added a comment -

          You are right.

          I have a new version of the patch now that only does that when minimum versions is enabled for the CF. As minversions is not released, yet, this should be ok.

          Show
          Lars Hofhansl added a comment - You are right. I have a new version of the patch now that only does that when minimum versions is enabled for the CF. As minversions is not released, yet, this should be ok.
          Hide
          Lars Hofhansl added a comment -

          Now version of patch does not have the incompatible behavior.

          Show
          Lars Hofhansl added a comment - Now version of patch does not have the incompatible behavior.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-05 18:16:19.303065)

          Review request for hbase.

          Changes
          -------

          All tests pass now (except TestAdmin, which fails with or without my changes).

          Summary
          -------

          In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

          This changes the following:
          1. MinVersions now also means: Keep this number of version even when the row was deleted.
          2. Allow gets/scans to retrieve rows hidden by a delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.

          The change is pretty small, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-05 18:16:19.303065) Review request for hbase. Changes ------- All tests pass now (except TestAdmin, which fails with or without my changes). Summary ------- In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T). This changes the following: 1. MinVersions now also means: Keep this number of version even when the row was deleted. 2. Allow gets/scans to retrieve rows hidden by a delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. The change is pretty small, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891 Diff: https://reviews.apache.org/r/2178/diff Testing ------- TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-05 18:17:40.058774)

          Review request for hbase.

          Summary (updated)
          -------

          In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

          This changes the following:
          1. MinVersions now also means: Keep this number of version even when the row was deleted.
          2. Allow gets/scans to retrieve rows hidden by a delete marker (only if minVersions is set for the CF).
          3. Do not unconditionally collect all deleted rows during a compaction.

          The change is pretty small, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891

          Diff: https://reviews.apache.org/r/2178/diff

          Testing (updated)
          -------

          All test (except TestAdmin, which fail with or without my change) pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-05 18:17:40.058774) Review request for hbase. Summary (updated) ------- In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T). This changes the following: 1. MinVersions now also means: Keep this number of version even when the row was deleted. 2. Allow gets/scans to retrieve rows hidden by a delete marker (only if minVersions is set for the CF). 3. Do not unconditionally collect all deleted rows during a compaction. The change is pretty small, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891 Diff: https://reviews.apache.org/r/2178/diff Testing (updated) ------- All test (except TestAdmin, which fail with or without my change) pass now. Thanks, Lars
          Hide
          Lars Hofhansl added a comment -

          Thinking about this a bit more. I think this would better be covered by a separate flag.

          The reason is this: Users might set TTL for a store in order to recreate any state of the store within that TTL. This does not currently work deleted rows.
          MinVersins is somewhat orthogonal to this.
          If KEEP_DELETED_ROWS (or however we want to call this) is set on a CF, it would just mean that deleted rows and the delete markers are subject to the usual MaxVersions, TTL/MinVersions rules.

          Do people think this is not a good idea generally, and this should be rather handled by the application - by simply not deleting anything? (there's fairly little discussion here)

          Show
          Lars Hofhansl added a comment - Thinking about this a bit more. I think this would better be covered by a separate flag. The reason is this: Users might set TTL for a store in order to recreate any state of the store within that TTL. This does not currently work deleted rows. MinVersins is somewhat orthogonal to this. If KEEP_DELETED_ROWS (or however we want to call this) is set on a CF, it would just mean that deleted rows and the delete markers are subject to the usual MaxVersions, TTL/MinVersions rules. Do people think this is not a good idea generally, and this should be rather handled by the application - by simply not deleting anything? (there's fairly little discussion here)
          Hide
          Ted Yu added a comment -

          Introducing new flag would allow maximum flexibility.
          For column family where the flag isn't specified, we can handle deleted rows and the delete markers according to the presence of minVersions.

          Show
          Ted Yu added a comment - Introducing new flag would allow maximum flexibility. For column family where the flag isn't specified, we can handle deleted rows and the delete markers according to the presence of minVersions.
          Hide
          Lars Hofhansl added a comment -

          That might be getting hard to understand. minVersions would have slightly different meaning depending on whether that extra flag is set. Without the flag minVersions is like "maxVersions for deleted rows", not sure who would need that.

          Having just the flag for deleted rows would also make the code easier to follow as the column tracker would no longer need to distinguish between "normal" rows, delete markers, and deleted rows as it does in the current patch; but only between rows (deleted or not) and delete markers.

          Show
          Lars Hofhansl added a comment - That might be getting hard to understand. minVersions would have slightly different meaning depending on whether that extra flag is set. Without the flag minVersions is like "maxVersions for deleted rows", not sure who would need that. Having just the flag for deleted rows would also make the code easier to follow as the column tracker would no longer need to distinguish between "normal" rows, delete markers, and deleted rows as it does in the current patch; but only between rows (deleted or not) and delete markers.
          Hide
          Lars Hofhansl added a comment -

          Turns out this is a bit more complicated than I thought. There are three types of deletes:

          1. version deletes - effective for a specific version of a specific column
          2. column deletes - effective for all versions of a specific column
          3. family deletes - effective for all versions of all columns of a family

          The first two are sorted before the puts they affect based on their resp. timestamps, but after newer puts.
          Family deletes, always sort before all versions of all columns.

          The problems is deciding when the delete rows (the marker rows) themselves can be removed during a major compaction.

          For #1 and #2 I can just do version counting, and newer puts will eventually push out the delete markers from the store.
          With #3 this will never happen as they always sort before all puts of the same family, regardless of any timestamp set on them.
          Here it is necessary to scan all puts for that family and then decide whether the delete needs to be included based on whether the delete had any affect on any of the puts in the same family.

          Because of this, moving out of 0.92 as changes will be bigger. Put back if you think otherwise.

          I still think that timetravel is an important feature of HBase and incomplete if it cannot include deleted rows.

          Show
          Lars Hofhansl added a comment - Turns out this is a bit more complicated than I thought. There are three types of deletes: version deletes - effective for a specific version of a specific column column deletes - effective for all versions of a specific column family deletes - effective for all versions of all columns of a family The first two are sorted before the puts they affect based on their resp. timestamps, but after newer puts. Family deletes, always sort before all versions of all columns. The problems is deciding when the delete rows (the marker rows) themselves can be removed during a major compaction. For #1 and #2 I can just do version counting, and newer puts will eventually push out the delete markers from the store. With #3 this will never happen as they always sort before all puts of the same family, regardless of any timestamp set on them. Here it is necessary to scan all puts for that family and then decide whether the delete needs to be included based on whether the delete had any affect on any of the puts in the same family. Because of this, moving out of 0.92 as changes will be bigger. Put back if you think otherwise. I still think that timetravel is an important feature of HBase and incomplete if it cannot include deleted rows.
          Hide
          Jonathan Gray added a comment -

          Lars, I agree that this is an important feature. Also agree that we should take time and do it right and not push for 0.92.

          Could we just support some kind of "raw scanner" along with a TTKAKV config (Time To Keep All Key Values)?

          Show
          Jonathan Gray added a comment - Lars, I agree that this is an important feature. Also agree that we should take time and do it right and not push for 0.92. Could we just support some kind of "raw scanner" along with a TTKAKV config (Time To Keep All Key Values)?
          Hide
          Lars Hofhansl added a comment -

          A simple option would be to only allow the "KEEP_DELETED" flag set when TTL is also set.
          Then we'd do simple version counting for #1 and #2 type deletes and rely on TTL to expire #3.
          (That means you could have more #3 delete markers than max versions, which would also be the case with TTKAKV. That might be acceptable).

          Show
          Lars Hofhansl added a comment - A simple option would be to only allow the "KEEP_DELETED" flag set when TTL is also set. Then we'd do simple version counting for #1 and #2 type deletes and rely on TTL to expire #3. (That means you could have more #3 delete markers than max versions, which would also be the case with TTKAKV. That might be acceptable).
          Hide
          Lars Hofhansl added a comment -

          Another option for the problem above is, to not write the family delete marker during compaction but to instead translate them to column delete marker as we scan through the kvs following a family delete marker.

          Show
          Lars Hofhansl added a comment - Another option for the problem above is, to not write the family delete marker during compaction but to instead translate them to column delete marker as we scan through the kvs following a family delete marker.
          Hide
          Ted Yu added a comment -

          If the number of columns for the underlying family is not huge, the option of translating family delete marker is attractive.

          Show
          Ted Yu added a comment - If the number of columns for the underlying family is not huge, the option of translating family delete marker is attractive.
          Hide
          Lars Hofhansl added a comment -

          Yet another option is to take the smallest time of any of the store files and remove the family markers if they are older than that. The marker may survive two compactions in that case, but eventually they'll be removed.

          In addition to address Jon's need, I think we can add a raw flag to the Scan object. If true, the scan will retrieve all available rows including deleted rows and delete markers. With the rest of the changes from this patch, that would be really easy to do.
          (I assume I'd have to increment the SCAN_VERSION, correct?)

          Show
          Lars Hofhansl added a comment - Yet another option is to take the smallest time of any of the store files and remove the family markers if they are older than that. The marker may survive two compactions in that case, but eventually they'll be removed. In addition to address Jon's need, I think we can add a raw flag to the Scan object. If true, the scan will retrieve all available rows including deleted rows and delete markers. With the rest of the changes from this patch, that would be really easy to do. (I assume I'd have to increment the SCAN_VERSION, correct?)
          Hide
          Lars Hofhansl added a comment -

          This issue is pretty resistant to all my attempts to solve it.

          Was toying with keeping track of the first key written to a file (we currently keep track of the last key).
          However, since the family delete hides all puts with a TS less than or equal to the delete TS, this won't help as the delete marker itself eventually would be the first key and it still might affect puts later in the file.

          Also KVs are sorted in reverse time order (except family delete markers), so I cannot just look at the put directly following the delete marker, because it'll be the newest put rather then the oldest.

          So there are three options I think:
          1. Only allow the new flag set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all rows types except for family delete markers.
          2. Translate family delete markers to column delete marker at (major) compaction time.
          3. Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that use expire delete marker that are older and hence can't affect any puts in the file.

          None of these are particularly attractive. #1 is limiting, #2 might get expensive if there're many columns, #3 would require the FileWriters to understand KVs.

          Show
          Lars Hofhansl added a comment - This issue is pretty resistant to all my attempts to solve it. Was toying with keeping track of the first key written to a file (we currently keep track of the last key). However, since the family delete hides all puts with a TS less than or equal to the delete TS, this won't help as the delete marker itself eventually would be the first key and it still might affect puts later in the file. Also KVs are sorted in reverse time order (except family delete markers), so I cannot just look at the put directly following the delete marker, because it'll be the newest put rather then the oldest. So there are three options I think: 1. Only allow the new flag set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all rows types except for family delete markers. 2. Translate family delete markers to column delete marker at (major) compaction time. 3. Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that use expire delete marker that are older and hence can't affect any puts in the file. None of these are particularly attractive. #1 is limiting, #2 might get expensive if there're many columns, #3 would require the FileWriters to understand KVs.
          Hide
          Lars Hofhansl added a comment -

          and
          3a. Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs.

          Show
          Lars Hofhansl added a comment - and 3a. Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs.
          Hide
          Ted Yu added a comment -

          3a stands out as first choice so far.

          Show
          Ted Yu added a comment - 3a stands out as first choice so far.
          Hide
          Lars Hofhansl added a comment -

          That's what I concluded shortly after I posted that.
          I made the changes, also added a "raw" option to Scan. All tests pass, including a bunch of new tests.

          Show
          Lars Hofhansl added a comment - That's what I concluded shortly after I posted that. I made the changes, also added a "raw" option to Scan. All tests pass, including a bunch of new tests.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-11 04:02:54.234402)

          Review request for hbase.

          Changes
          -------

          Added a specific flag for HColumnFamilty to enable this.
          Family delete markers are now also collected. The way that is done is a bit more complicated than I had hoped.
          o StoreFile.writer keeps track of the lowest timestamp of any Put seen and stores in the files metadata.
          o Upon major compaction the lowest of all put timestamps of all involved storefiles is passed to the StoreScanner, which can that remove any family delete older than that timestamp.

          Tests:
          ...
          Results :

          Tests run: 1033, Failures: 0, Errors: 0, Skipped: 16

          Summary (updated)
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181286
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181286

          Diff: https://reviews.apache.org/r/2178/diff

          Testing (updated)
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 04:02:54.234402) Review request for hbase. Changes ------- Added a specific flag for HColumnFamilty to enable this. Family delete markers are now also collected. The way that is done is a bit more complicated than I had hoped. o StoreFile.writer keeps track of the lowest timestamp of any Put seen and stores in the files metadata. o Upon major compaction the lowest of all put timestamps of all involved storefiles is passed to the StoreScanner, which can that remove any family delete older than that timestamp. Tests: ... Results : Tests run: 1033, Failures: 0, Errors: 0, Skipped: 16 Summary (updated) ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181286 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181286 Diff: https://reviews.apache.org/r/2178/diff Testing (updated) ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-11 04:21:07.770408)

          Review request for hbase.

          Changes
          -------

          Addressing anticipated nits

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 04:21:07.770408) Review request for hbase. Changes ------- Addressing anticipated nits Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2501
          -----------------------------------------------------------

          Elegant solution.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment5660>

          Is this changed required by the feature ?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          <https://reviews.apache.org/r/2178/#comment5661>

          'the' is redundant.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5662>

          Timerange would be better than timetravel.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          <https://reviews.apache.org/r/2178/#comment5663>

          Would EARLIEST_PUT_TS be a better name ?

          • Ted

          On 2011-10-11 04:21:07, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-11 04:21:07)

          Review request for hbase.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2501 ----------------------------------------------------------- Elegant solution. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment5660 > Is this changed required by the feature ? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java < https://reviews.apache.org/r/2178/#comment5661 > 'the' is redundant. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5662 > Timerange would be better than timetravel. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java < https://reviews.apache.org/r/2178/#comment5663 > Would EARLIEST_PUT_TS be a better name ? Ted On 2011-10-11 04:21:07, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 04:21:07) Review request for hbase. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-11 06:02:38, Ted Yu wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 106

          > <https://reviews.apache.org/r/2178/diff/6/?file=49393#file49393line106>

          >

          > Timerange would be better than timetravel.

          Agreed.

          On 2011-10-11 06:02:38, Ted Yu wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 321

          > <https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321>

          >

          > Is this changed required by the feature ?

          No... But it didn't make sense before. Setting minVersions equal to maxVersions is the same as not having TTL.

          On 2011-10-11 06:02:38, Ted Yu wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 127

          > <https://reviews.apache.org/r/2178/diff/6/?file=49396#file49396line127>

          >

          > Would EARLIEST_PUT_TS be a better name ?

          Probably.

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2501
          -----------------------------------------------------------

          On 2011-10-11 04:21:07, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-11 04:21:07)

          Review request for hbase.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-11 06:02:38, Ted Yu wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 106 > < https://reviews.apache.org/r/2178/diff/6/?file=49393#file49393line106 > > > Timerange would be better than timetravel. Agreed. On 2011-10-11 06:02:38, Ted Yu wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 321 > < https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321 > > > Is this changed required by the feature ? No... But it didn't make sense before. Setting minVersions equal to maxVersions is the same as not having TTL. On 2011-10-11 06:02:38, Ted Yu wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java , line 127 > < https://reviews.apache.org/r/2178/diff/6/?file=49396#file49396line127 > > > Would EARLIEST_PUT_TS be a better name ? Probably. Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2501 ----------------------------------------------------------- On 2011-10-11 04:21:07, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 04:21:07) Review request for hbase. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-11 06:20:01.186435)

          Review request for hbase.

          Changes
          -------

          o Addressed Ted's point.
          o Also added this flag to the shell.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 06:20:01.186435) Review request for hbase. Changes ------- o Addressed Ted's point. o Also added this flag to the shell. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2503
          -----------------------------------------------------------

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment5667>

          So minVersions == maxVersions == 1 would no longer be accepted ...

          • Ted

          On 2011-10-11 06:20:01, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-11 06:20:01)

          Review request for hbase.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2503 ----------------------------------------------------------- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment5667 > So minVersions == maxVersions == 1 would no longer be accepted ... Ted On 2011-10-11 06:20:01, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 06:20:01) Review request for hbase. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-11 06:21:39, Ted Yu wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 321

          > <https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321>

          >

          > So minVersions == maxVersions == 1 would no longer be accepted ...

          Correct. TTL<FOREVER + maxVersions=1 + minVersion=1 would be identical to just setting maxVersions=1 (without TTL or minVersions)

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2503
          -----------------------------------------------------------

          On 2011-10-11 06:20:01, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-11 06:20:01)

          Review request for hbase.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-11 06:21:39, Ted Yu wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 321 > < https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321 > > > So minVersions == maxVersions == 1 would no longer be accepted ... Correct. TTL<FOREVER + maxVersions=1 + minVersion=1 would be identical to just setting maxVersions=1 (without TTL or minVersions) Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2503 ----------------------------------------------------------- On 2011-10-11 06:20:01, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-11 06:20:01) Review request for hbase. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Ted Yu added a comment -

          For the case where minVersions == maxVersions == n, since we know the semantics, shall we change the values for minVersions and log our decision instead of throwing IllegalArgumentException ?

          Show
          Ted Yu added a comment - For the case where minVersions == maxVersions == n, since we know the semantics, shall we change the values for minVersions and log our decision instead of throwing IllegalArgumentException ?
          Hide
          Lars Hofhansl added a comment -

          You mean set TTL to FOREVER and minVersions to 0? Sure that would work.

          On the other hand it indicates that the user did not understand how minVersions works, so I could see an error as well.

          Show
          Lars Hofhansl added a comment - You mean set TTL to FOREVER and minVersions to 0? Sure that would work. On the other hand it indicates that the user did not understand how minVersions works, so I could see an error as well.
          Hide
          Ted Yu added a comment -

          Personally I would love the software which can interpret my expression correctly even though I don't fully know how it works
          It's your call what to do here, Lars.

          Show
          Ted Yu added a comment - Personally I would love the software which can interpret my expression correctly even though I don't fully know how it works It's your call what to do here, Lars.
          Hide
          Lars Hofhansl added a comment -

          I think I'd like to keep the error, to avoid surprises.

          And some more eyes should look at this...
          @jgray, @stack wanna have a look too?

          Show
          Lars Hofhansl added a comment - I think I'd like to keep the error, to avoid surprises. And some more eyes should look at this... @jgray, @stack wanna have a look too?
          Hide
          Lars Hofhansl added a comment -

          Converted to feature.

          Show
          Lars Hofhansl added a comment - Converted to feature.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-12 04:55:53.069011)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-12 04:55:53.069011) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          stack added a comment -

          Shouldn't this facility be on by default?

          Show
          stack added a comment - Shouldn't this facility be on by default?
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2609
          -----------------------------------------------------------

          Great stuff.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment5854>

          Javadoc what the new param means

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment5855>

          So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment5856>

          Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing.

          Not important but if you get to make new patch do

          if (test) block

          or

          if (test)

          { block }

          rather than the

          if (test)
          block

          you have.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          <https://reviews.apache.org/r/2178/#comment5857>

          Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          <https://reviews.apache.org/r/2178/#comment5858>

          Good.

          There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.

          Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).

          I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?

          How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:

          maxversions... delete marker .... maxversions ... delete marker .... maxversions...

          and so on if this facility is enabled?

          I won't have:

          as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker......

          Is that right? (If you follow me).

          Say 'garbage collected' in the comment rather than just 'collected'

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment5859>

          Bad name for a param. This is name you'd give the getter for a param named 'delete'.

          What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment5860>

          good

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment5861>

          good

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5862>

          This stuff needs to make it out into the hbase book. Good stuff.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5863>

          So when I am doing a check for max versions = 3 and I have for a single cell:

          put ... put....delete....put

          If I keep delete markers and do a raw scan, I'll get: p p d

          If I have keep delete markers but not a raw scan I get: p p

          If I do not have keep delete markers and its not an ordinary scan: p p

          Something like that?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5864>

          I'm not sure I get what this one is about.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5866>

          An 'of' should be an 'if' here?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5867>

          These params on an SQM are starting to build up.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5868>

          I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5869>

          Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5870>

          So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment5871>

          This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment5872>

          Again, bad name for param.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment5873>

          Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it).

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          <https://reviews.apache.org/r/2178/#comment5874>

          No biggie... previous formatting was better.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          <https://reviews.apache.org/r/2178/#comment5875>

          This belongs in HCD rather than up here in this global HConstants?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          <https://reviews.apache.org/r/2178/#comment5876>

          Why we move this to the end?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb
          <https://reviews.apache.org/r/2178/#comment5877>

          Very nice.

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
          <https://reviews.apache.org/r/2178/#comment5878>

          Good test

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
          <https://reviews.apache.org/r/2178/#comment5879>

          White space

          • Michael

          On 2011-10-12 04:55:53, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-12 04:55:53)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2609 ----------------------------------------------------------- Great stuff. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment5854 > Javadoc what the new param means http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment5855 > So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment5856 > Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing. Not important but if you get to make new patch do if (test) block or if (test) { block } rather than the if (test) block you have. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java < https://reviews.apache.org/r/2178/#comment5857 > Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java < https://reviews.apache.org/r/2178/#comment5858 > Good. There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan. Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course). I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions? How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having: maxversions... delete marker .... maxversions ... delete marker .... maxversions... and so on if this facility is enabled? I won't have: as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker...... Is that right? (If you follow me). Say 'garbage collected' in the comment rather than just 'collected' http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java < https://reviews.apache.org/r/2178/#comment5859 > Bad name for a param. This is name you'd give the getter for a param named 'delete'. What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java < https://reviews.apache.org/r/2178/#comment5860 > good http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java < https://reviews.apache.org/r/2178/#comment5861 > good http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5862 > This stuff needs to make it out into the hbase book. Good stuff. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5863 > So when I am doing a check for max versions = 3 and I have for a single cell: put ... put....delete....put If I keep delete markers and do a raw scan, I'll get: p p d If I have keep delete markers but not a raw scan I get: p p If I do not have keep delete markers and its not an ordinary scan: p p Something like that? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5864 > I'm not sure I get what this one is about. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5866 > An 'of' should be an 'if' here? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5867 > These params on an SQM are starting to build up. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5868 > I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5869 > Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5870 > So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment5871 > This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java < https://reviews.apache.org/r/2178/#comment5872 > Again, bad name for param. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java < https://reviews.apache.org/r/2178/#comment5873 > Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it). http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java < https://reviews.apache.org/r/2178/#comment5874 > No biggie... previous formatting was better. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java < https://reviews.apache.org/r/2178/#comment5875 > This belongs in HCD rather than up here in this global HConstants? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java < https://reviews.apache.org/r/2178/#comment5876 > Why we move this to the end? http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb < https://reviews.apache.org/r/2178/#comment5877 > Very nice. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java < https://reviews.apache.org/r/2178/#comment5878 > Good test http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java < https://reviews.apache.org/r/2178/#comment5879 > White space Michael On 2011-10-12 04:55:53, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-12 04:55:53) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > Great stuff.

          Thanks for the thorough review. I since wrote a bunch of more tests and that way found some more problems.
          I'll add comments below and also attach a new patch soon.

          What I do not like about this patch is that "entropy" in the ScanQueryMatcher now. There are too many of statements and the logic is very hard to follow. On the hand I have not found a way to make it simpler.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 322

          > <https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line322>

          >

          > So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours?

          Exactly. MinVersion = MaxVersion makes no sense. I can do that in a separate jira and for 0.92 too.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 556

          > <https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line556>

          >

          > Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing.

          >

          > Not important but if you get to make new patch do

          >

          > if (test) block

          >

          > or

          >

          > if (test) { bq. > block bq. > }

          >

          > rather than the

          >

          > if (test)

          > block

          >

          > you have.

          This is only read by the Store.<init>, so should be OK.
          Absolutely agree of the if layout, this somehow sneaked in here.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java, line 86

          > <https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line86>

          >

          > Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.

          I was thinking to document this in the Javadoc of Attributes.java. There's already DEFAULT_CLUSTER_ID in puts, which has an attribute prefixes by on _, I can change that to two _. The trailing underscore does not matter, I guess.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java, line 711

          > <https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line711>

          >

          > Good.

          >

          > There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.

          >

          > Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).

          >

          > I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?

          >

          > How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:

          >

          > maxversions... delete marker .... maxversions ... delete marker .... maxversions...

          >

          > and so on if this facility is enabled?

          >

          > I won't have:

          >

          > as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker......

          >

          > Is that right? (If you follow me).

          >

          > Say 'garbage collected' in the comment rather than just 'collected'

          They way it works now, is that it is guaranteed to see all relevant deletes.
          Say VERSIONS=2 and you have
          put, put, put, delete, delete (all of the same row). You would only get the puts back, because the last put and the deletes are guaranteed to have no effect.

          Maybe also have something like: delete, delete, delete, put, delete. In that case you get everything, as it cannot be shown that any of the deletes is not significant.

          In any case, deletes do not count towards the version number of following puts or deletes.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 59

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line59>

          >

          > So when I am doing a check for max versions = 3 and I have for a single cell:

          >

          >

          > put ... put....delete....put

          >

          >

          > If I keep delete markers and do a raw scan, I'll get: p p d

          >

          > If I have keep delete markers but not a raw scan I get: p p

          >

          > If I do not have keep delete markers and its not an ordinary scan: p p

          >

          > Something like that?

          Exactly. That's why the login here is a bit convoluted now.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 60

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line60>

          >

          > I'm not sure I get what this one is about.

          This is what enabled "ordinary" scan to see rows behind a delete marker. Ted suggested to relabel those to "time range" queries.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 222

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line222>

          >

          > I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set?

          Ah yes. This is why the scanner needs to know whether we want to be able scan past delete markers. The *afterTimeRange will add all delete markers > some TS to the set of deletes and hence hide all earlier puts. In order to allow range queries before a delete marker the marker must only be included if it is inside the time range of the query.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 232

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line232>

          >

          > Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells?

          yes

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 235

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line235>

          >

          > So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good.

          For this to work I added a new meta data item to StoreFiles: the timestamp of the earliest put found. Not pretty, but I cannot think of anything else.
          This is used for family delete marker because they are always before all puts regardless of timestamp

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 243

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line243>

          >

          > This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling?

          All tests pass

          This is because delete markers can now fall through and be subject to this test as well.
          Looking at that now, though, this not necessary. I'll remove it, and rerun my test.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java, line 73

          > <https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line73>

          >

          > Again, bad name for param.

          Changed already.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java, line 77

          > <https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line77>

          >

          > Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it).

          That would certainly make things easier. But would it be right?
          If VERSIONS is 3 and I have
          delete, put, delete, put, then the last put would be forever unreachable.

          If we think that's OK, I can simplify the code.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 487

          > <https://reviews.apache.org/r/2178/diff/7/?file=49409#file49409line487>

          >

          > No biggie... previous formatting was better.

          Heh, OK.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java, line 371

          > <https://reviews.apache.org/r/2178/diff/7/?file=49411#file49411line371>

          >

          > Why we move this to the end?

          Oops... Leftover from other attempts. Will move it back.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java, line 133

          > <https://reviews.apache.org/r/2178/diff/7/?file=49415#file49415line133>

          >

          > Good test

          I add way more tests... new patch coming soon.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 127

          > <https://reviews.apache.org/r/2178/diff/7/?file=49410#file49410line127>

          >

          > This belongs in HCD rather than up here in this global HConstants?

          Should it? It's very store file specific.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java, line 50

          > <https://reviews.apache.org/r/2178/diff/7/?file=49405#file49405line50>

          >

          > Bad name for a param. This is name you'd give the getter for a param named 'delete'.

          >

          > What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc.

          I changed the logic now to actually pass in the type of the kv.

          On 2011-10-15 04:00:08, Michael Stack wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 106

          > <https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line106>

          >

          > These params on an SQM are starting to build up.

          Yes... And I do not like it... At all!

          Maybe we should pass in the Store itself.
          I was also thinking about an enum for the scan type too. There are seven different cases:
          1. major colection, not keeping deleted rows
          2. major colection, keeping deleted rows
          3. minor colection, not keeping deleted rows
          4. minor colection, keeping deleted rows
          5. ordinary scan, with delete peeking
          6. ordinary scan, without delete peeking
          7. raw scan

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2609
          -----------------------------------------------------------

          On 2011-10-12 04:55:53, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-12 04:55:53)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-15 04:00:08, Michael Stack wrote: > Great stuff. Thanks for the thorough review. I since wrote a bunch of more tests and that way found some more problems. I'll add comments below and also attach a new patch soon. What I do not like about this patch is that "entropy" in the ScanQueryMatcher now. There are too many of statements and the logic is very hard to follow. On the hand I have not found a way to make it simpler. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 322 > < https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line322 > > > So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours? Exactly. MinVersion = MaxVersion makes no sense. I can do that in a separate jira and for 0.92 too. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 556 > < https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line556 > > > Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing. > > Not important but if you get to make new patch do > > if (test) block > > or > > if (test) { bq. > block bq. > } > > rather than the > > if (test) > block > > you have. This is only read by the Store.<init>, so should be OK. Absolutely agree of the if layout, this somehow sneaked in here. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java , line 86 > < https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line86 > > > Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward. I was thinking to document this in the Javadoc of Attributes.java. There's already DEFAULT_CLUSTER_ID in puts, which has an attribute prefixes by on _, I can change that to two _. The trailing underscore does not matter, I guess. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java , line 711 > < https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line711 > > > Good. > > There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan. > > Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course). > > I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions? > > How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having: > > maxversions... delete marker .... maxversions ... delete marker .... maxversions... > > and so on if this facility is enabled? > > I won't have: > > as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker...... > > Is that right? (If you follow me). > > Say 'garbage collected' in the comment rather than just 'collected' They way it works now, is that it is guaranteed to see all relevant deletes. Say VERSIONS=2 and you have put, put, put, delete, delete (all of the same row). You would only get the puts back, because the last put and the deletes are guaranteed to have no effect. Maybe also have something like: delete, delete, delete, put, delete. In that case you get everything, as it cannot be shown that any of the deletes is not significant. In any case, deletes do not count towards the version number of following puts or deletes. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 59 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line59 > > > So when I am doing a check for max versions = 3 and I have for a single cell: > > > put ... put....delete....put > > > If I keep delete markers and do a raw scan, I'll get: p p d > > If I have keep delete markers but not a raw scan I get: p p > > If I do not have keep delete markers and its not an ordinary scan: p p > > Something like that? Exactly. That's why the login here is a bit convoluted now. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 60 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line60 > > > I'm not sure I get what this one is about. This is what enabled "ordinary" scan to see rows behind a delete marker. Ted suggested to relabel those to "time range" queries. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 222 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line222 > > > I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set? Ah yes. This is why the scanner needs to know whether we want to be able scan past delete markers. The *afterTimeRange will add all delete markers > some TS to the set of deletes and hence hide all earlier puts. In order to allow range queries before a delete marker the marker must only be included if it is inside the time range of the query. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 232 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line232 > > > Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells? yes On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 235 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line235 > > > So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good. For this to work I added a new meta data item to StoreFiles: the timestamp of the earliest put found. Not pretty, but I cannot think of anything else. This is used for family delete marker because they are always before all puts regardless of timestamp On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 243 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line243 > > > This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling? All tests pass This is because delete markers can now fall through and be subject to this test as well. Looking at that now, though, this not necessary. I'll remove it, and rerun my test. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java , line 73 > < https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line73 > > > Again, bad name for param. Changed already. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java , line 77 > < https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line77 > > > Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it). That would certainly make things easier. But would it be right? If VERSIONS is 3 and I have delete, put, delete, put, then the last put would be forever unreachable. If we think that's OK, I can simplify the code. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java , line 487 > < https://reviews.apache.org/r/2178/diff/7/?file=49409#file49409line487 > > > No biggie... previous formatting was better. Heh, OK. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java , line 371 > < https://reviews.apache.org/r/2178/diff/7/?file=49411#file49411line371 > > > Why we move this to the end? Oops... Leftover from other attempts. Will move it back. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java , line 133 > < https://reviews.apache.org/r/2178/diff/7/?file=49415#file49415line133 > > > Good test I add way more tests... new patch coming soon. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java , line 127 > < https://reviews.apache.org/r/2178/diff/7/?file=49410#file49410line127 > > > This belongs in HCD rather than up here in this global HConstants? Should it? It's very store file specific. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java , line 50 > < https://reviews.apache.org/r/2178/diff/7/?file=49405#file49405line50 > > > Bad name for a param. This is name you'd give the getter for a param named 'delete'. > > What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc. I changed the logic now to actually pass in the type of the kv. On 2011-10-15 04:00:08, Michael Stack wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 106 > < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line106 > > > These params on an SQM are starting to build up. Yes... And I do not like it... At all! Maybe we should pass in the Store itself. I was also thinking about an enum for the scan type too. There are seven different cases: 1. major colection, not keeping deleted rows 2. major colection, keeping deleted rows 3. minor colection, not keeping deleted rows 4. minor colection, keeping deleted rows 5. ordinary scan, with delete peeking 6. ordinary scan, without delete peeking 7. raw scan Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2609 ----------------------------------------------------------- On 2011-10-12 04:55:53, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-12 04:55:53) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-15 05:30:12.128914)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Changes
          -------

          Many more tests. Found some bugs with these tests: deletes and put may have the same TS, the scanner would think it's a duplicate row and remove it.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1183459
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183459

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-15 05:30:12.128914) Review request for hbase, Ted Yu and Jonathan Gray. Changes ------- Many more tests. Found some bugs with these tests: deletes and put may have the same TS, the scanner would think it's a duplicate row and remove it. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1183459 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183459 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Ted Yu added a comment -

          Continuing discussion started 11/Oct/11 06:45, I think the following check at line 321:

                if (minVersions >= maxVersions) {
          

          should at least be lifted above the check at line 318:

                if (timeToLive == HConstants.FOREVER) {
          
          Show
          Ted Yu added a comment - Continuing discussion started 11/Oct/11 06:45, I think the following check at line 321: if (minVersions >= maxVersions) { should at least be lifted above the check at line 318: if (timeToLive == HConstants.FOREVER) {
          Hide
          Ted Yu added a comment -

          I was also thinking about an enum for the scan type too. There are seven different cases:

          Since the cases span multiple dimensions (compaction, keeping deleted rows), enum is not good candidate.

          How about introducing MatcherConfig so that new parameters can be added relatively easily in the future ?

          Show
          Ted Yu added a comment - I was also thinking about an enum for the scan type too. There are seven different cases: Since the cases span multiple dimensions (compaction, keeping deleted rows), enum is not good candidate. How about introducing MatcherConfig so that new parameters can be added relatively easily in the future ?
          Hide
          Lars Hofhansl added a comment -

          @Ted, I can change the order in which the checks are made. MIN_VERSIONS make no sense without TTL, thouhg.

          @Ted and @Stack: If we indeed make all of this the default behavior a lot of the if statements and parameters in ScanQueryMatcher would go away, simplifying the logic a LOT.
          That would be mean that every store would keep deleted rows (until VERSIONS, or TTL removes them), and scanners would always be able to peek behind delete markers.

          Show
          Lars Hofhansl added a comment - @Ted, I can change the order in which the checks are made. MIN_VERSIONS make no sense without TTL, thouhg. @Ted and @Stack: If we indeed make all of this the default behavior a lot of the if statements and parameters in ScanQueryMatcher would go away, simplifying the logic a LOT. That would be mean that every store would keep deleted rows (until VERSIONS, or TTL removes them), and scanners would always be able to peek behind delete markers.
          Hide
          Ted Yu added a comment -

          I think this new feature should not be the default behavior.

          Show
          Ted Yu added a comment - I think this new feature should not be the default behavior.
          Hide
          Jonathan Gray added a comment -

          I think this new feature should not be the default behavior.

          +1

          Show
          Jonathan Gray added a comment - I think this new feature should not be the default behavior. +1
          Hide
          Lars Hofhansl added a comment -

          Fair enough. Was picking up on Stack's suggestion to have this on by default. Just means the code has to distinguish between minor and major compaction scans, raw scans, and normal user scans, all of which can be for a store with keep_delete_cells enabled or not.

          Thinking about a ScanConfig (or ScanInfo) as static inner class of Store. That would capture all immutable scan-relevant information about the Store (min/max version, family name, ttl, keep_deletes, comparator). (A MatcherConfig with all information would need to be mutable and created or changed for every scan.)
          And then maybe a ScanType enum to distinguish between compaction scans and user initiated scans.

          What about Stack's suggested in the review to include delete cells in the version count? (The only strange part with that is that the family markers are always in the beginning).
          Right now a delete cell does not increase the version count and instead "inherits" the version of the last put.

          Show
          Lars Hofhansl added a comment - Fair enough. Was picking up on Stack's suggestion to have this on by default. Just means the code has to distinguish between minor and major compaction scans, raw scans, and normal user scans, all of which can be for a store with keep_delete_cells enabled or not. Thinking about a ScanConfig (or ScanInfo) as static inner class of Store. That would capture all immutable scan-relevant information about the Store (min/max version, family name, ttl, keep_deletes, comparator). (A MatcherConfig with all information would need to be mutable and created or changed for every scan.) And then maybe a ScanType enum to distinguish between compaction scans and user initiated scans. What about Stack's suggested in the review to include delete cells in the version count? (The only strange part with that is that the family markers are always in the beginning). Right now a delete cell does not increase the version count and instead "inherits" the version of the last put.
          Hide
          Ted Yu added a comment -

          Thinking about a ScanConfig (or ScanInfo)

          ScanInfo seems to be a better name.

          And then maybe a ScanType enum

          This is good.

          a delete cell does not increase the version count

          This should be fine.

          Show
          Ted Yu added a comment - Thinking about a ScanConfig (or ScanInfo) ScanInfo seems to be a better name. And then maybe a ScanType enum This is good. a delete cell does not increase the version count This should be fine.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-17 05:32:49.376397)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Changes
          -------

          New day, new version of the patch

          o Added yet more tests.
          o Introduced class Store.ScanInfo and enum StoreScanner.ScanType to make more sense of the options passed to ScanQueryMatcher.

          This is hopefully close.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-17 05:32:49.376397) Review request for hbase, Ted Yu and Jonathan Gray. Changes ------- New day, new version of the patch o Added yet more tests. o Introduced class Store.ScanInfo and enum StoreScanner.ScanType to make more sense of the options passed to ScanQueryMatcher. This is hopefully close. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2616
          -----------------------------------------------------------

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          <https://reviews.apache.org/r/2178/#comment5911>

          Please add 'any' in front of columns and change columns to singular form.
          Maybe columns.size should be checked against 0 as well ?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          <https://reviews.apache.org/r/2178/#comment5912>

          Much shorter and readable now.

          • Ted

          On 2011-10-17 05:32:49, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-17 05:32:49)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2616 ----------------------------------------------------------- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java < https://reviews.apache.org/r/2178/#comment5911 > Please add 'any' in front of columns and change columns to singular form. Maybe columns.size should be checked against 0 as well ? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java < https://reviews.apache.org/r/2178/#comment5912 > Much shorter and readable now. Ted On 2011-10-17 05:32:49, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-17 05:32:49) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-18 19:29:30.328890)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Changes
          -------

          This one passes all tests again. Added some more comments.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 19:29:30.328890) Review request for hbase, Ted Yu and Jonathan Gray. Changes ------- This one passes all tests again. Added some more comments. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Ted Yu added a comment -

          +1 on patch v12.

          For Scan.java, I think the following can be removed:

            // TODO: should have real flag here and check SCAN_VERSION?
          

          Javadoc for setRaw() should mention that raw scan shouldn't specify columns:

            public void setRaw(boolean raw) {
          

          For ScanQueryMatcher.java:

              // keep deleted cells: comaction or raw scan
          

          should read:

              // keep deleted cells: if compaction or raw scan
          

          The following:

              // retain deletes: minor comactions or raw scan
          

          should read:

              // retain deletes: if minor compactions or raw scan
          
          Show
          Ted Yu added a comment - +1 on patch v12. For Scan.java, I think the following can be removed: // TODO: should have real flag here and check SCAN_VERSION? Javadoc for setRaw() should mention that raw scan shouldn't specify columns: public void setRaw( boolean raw) { For ScanQueryMatcher.java: // keep deleted cells: comaction or raw scan should read: // keep deleted cells: if compaction or raw scan The following: // retain deletes: minor comactions or raw scan should read: // retain deletes: if minor compactions or raw scan
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-18 21:29:47.823962)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Changes
          -------

          Addressed all of Ted's comments. Added one more basic test for existing behavior.

          This could be the one

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:29:47.823962) Review request for hbase, Ted Yu and Jonathan Gray. Changes ------- Addressed all of Ted's comments. Added one more basic test for existing behavior. This could be the one Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/
          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38.886408)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Changes
          -------

          Arrghhh... Fixed one more Javadoc typo.

          Summary
          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
          This did not work for deletes, however. Deletes would always mask all puts in the past.
          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:
          1. There is a new flag on HColumnDescriptor enabling that behavior.
          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
          3. Do not unconditionally collect all deleted rows during a compaction.
          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.
          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs (updated)


          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing
          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38.886408) Review request for hbase, Ted Yu and Jonathan Gray. Changes ------- Arrghhh... Fixed one more Javadoc typo. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs (updated) http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Ted Yu added a comment -

          +1 on patch r12.

          Show
          Ted Yu added a comment - +1 on patch r12.
          Hide
          Lars Hofhansl added a comment -

          Accommodate HBASE-4585. (3 line change)

          This is the patch I would like to commit to trunk.

          Show
          Lars Hofhansl added a comment - Accommodate HBASE-4585 . (3 line change) This is the patch I would like to commit to trunk.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2677
          -----------------------------------------------------------

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <https://reviews.apache.org/r/2178/#comment6040>

          If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well?

          (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?)

          Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          <https://reviews.apache.org/r/2178/#comment6038>

          Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment6037>

          Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier.

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment6039>

          If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()?

          • Prakash

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2677 ----------------------------------------------------------- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < https://reviews.apache.org/r/2178/#comment6040 > If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well? (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?) Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java < https://reviews.apache.org/r/2178/#comment6038 > Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right? http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment6037 > Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment6039 > If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()? Prakash On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-19 21:16:42, Prakash Khemani wrote:

          >

          Thanks for the review Prakash.

          On 2011-10-19 21:16:42, Prakash Khemani wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 160

          > <https://reviews.apache.org/r/2178/diff/12/?file=50948#file50948line160>

          >

          > If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well?

          >

          > (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?)

          >

          > Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell?

          Gets and Scans only see deleted rows when they use a time range that ends before the resp. delete marker.
          The idea is that by using timerange [0,T+1), you can query the system for the state it was in at time T.
          If that range includes the delete marker you won't see the deleted rows, if it does not you will.

          On 2011-10-19 21:16:42, Prakash Khemani wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java, line 114

          > <https://reviews.apache.org/r/2178/diff/12/?file=50953#file50953line114>

          >

          > Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?

          During compactions the ScanWildcardColumnTracker is used, always.
          For the new "raw" scans I simply do not want to support ExplicitColumnTracker. There's a check that prevents a "raw" scan from adding any columns.

          Note that a "raw" scan is not the same as a time range scan/get. Jon G suggested that'd be a good thing to have, and it was easy to add. A "raw" scan sees everything. Including the delete markers, and could be used (for example) to do custom replication which includes delete markers and deleted rows.

          ScanWildcardColumnTracker has all the new logic to deal with delete markers and deleted rows.

          On 2011-10-19 21:16:42, Prakash Khemani wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 274

          > <https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line274>

          >

          > Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier.

          Delete marker should still be subject to this, so this is ok.

          On 2011-10-19 21:16:42, Prakash Khemani wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, line 297

          > <https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line297>

          >

          > If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()?

          no, because retainDeletesInOutput is only ever true for: a minor compaction, a memstore flush, and a raw scan. In all cases no columns can be specified.

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2677
          -----------------------------------------------------------

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-19 21:16:42, Prakash Khemani wrote: > Thanks for the review Prakash. On 2011-10-19 21:16:42, Prakash Khemani wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 160 > < https://reviews.apache.org/r/2178/diff/12/?file=50948#file50948line160 > > > If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well? > > (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?) > > Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell? Gets and Scans only see deleted rows when they use a time range that ends before the resp. delete marker. The idea is that by using timerange [0,T+1), you can query the system for the state it was in at time T. If that range includes the delete marker you won't see the deleted rows, if it does not you will. On 2011-10-19 21:16:42, Prakash Khemani wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java , line 114 > < https://reviews.apache.org/r/2178/diff/12/?file=50953#file50953line114 > > > Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right? During compactions the ScanWildcardColumnTracker is used, always. For the new "raw" scans I simply do not want to support ExplicitColumnTracker. There's a check that prevents a "raw" scan from adding any columns. Note that a "raw" scan is not the same as a time range scan/get. Jon G suggested that'd be a good thing to have, and it was easy to add. A "raw" scan sees everything. Including the delete markers, and could be used (for example) to do custom replication which includes delete markers and deleted rows. ScanWildcardColumnTracker has all the new logic to deal with delete markers and deleted rows. On 2011-10-19 21:16:42, Prakash Khemani wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 274 > < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line274 > > > Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier. Delete marker should still be subject to this, so this is ok. On 2011-10-19 21:16:42, Prakash Khemani wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 297 > < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line297 > > > If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()? no, because retainDeletesInOutput is only ever true for: a minor compaction, a memstore flush, and a raw scan. In all cases no columns can be specified. Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2677 ----------------------------------------------------------- On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2684
          -----------------------------------------------------------

          Very awesome stuff, Lars. I like the new scan info/enum stuff and lots of good comments. The code in SQM.match() is a bit scary but you've done a nice job at keeping it clean and documenting the heck out of it. I was able to grok it, mostly, in not much time.

          It would be good to have some documentation somewhere about how exactly this is used. The raw scanner is pretty sweet but it's not exactly clear to me how i set those params without reading code (and converting from the logic in code back to the configs available is tricky).

          I'm +1 (trunk). But maybe wait for someone else who has reviewed this more for the final go ahead.

          • Jonathan

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2684 ----------------------------------------------------------- Very awesome stuff, Lars. I like the new scan info/enum stuff and lots of good comments. The code in SQM.match() is a bit scary but you've done a nice job at keeping it clean and documenting the heck out of it. I was able to grok it, mostly, in not much time. It would be good to have some documentation somewhere about how exactly this is used. The raw scanner is pretty sweet but it's not exactly clear to me how i set those params without reading code (and converting from the logic in code back to the configs available is tricky). I'm +1 (trunk). But maybe wait for someone else who has reviewed this more for the final go ahead. Jonathan On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Lars Hofhansl added a comment -

          So we have three +1's now. Ted, Jonathan... and me

          Show
          Lars Hofhansl added a comment - So we have three +1's now. Ted, Jonathan... and me
          Hide
          Ted Yu added a comment -

          Reminder: document the (minVersions >= maxVersions) check of HColumnDescriptor in the book.

          Show
          Ted Yu added a comment - Reminder: document the (minVersions >= maxVersions) check of HColumnDescriptor in the book.
          Hide
          Jonathan Gray added a comment -

          I'm at +0.5

          Add just a bit more high-level, config-level doc somewhere and I'm a strong +1...

          Show
          Jonathan Gray added a comment - I'm at +0.5 Add just a bit more high-level, config-level doc somewhere and I'm a strong +1...
          Hide
          Lars Hofhansl added a comment -

          @Ted, ah yes... Will do.
          @Jonathan, you mean in the HBase book? Sure.

          I'll add that and post a new patch.

          Show
          Lars Hofhansl added a comment - @Ted, ah yes... Will do. @Jonathan, you mean in the HBase book? Sure. I'll add that and post a new patch.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review2694
          -----------------------------------------------------------

          Thank Lars. Was able to understand the diff after your explanations.

          • Prakash

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review2694 ----------------------------------------------------------- Thank Lars. Was able to understand the diff after your explanations. Prakash On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          Lars Hofhansl added a comment -
          • Added a section to the book explaining keeping deleted rows.
          • Clarified wording in minimum number of versions, and also changed the documented constraints (namely that minVersions must be < maxVersions)
          • Added a simple client side test (that also serves as an example of how one would use a raw scanner from the client side).
          Show
          Lars Hofhansl added a comment - Added a section to the book explaining keeping deleted rows. Clarified wording in minimum number of versions, and also changed the documented constraints (namely that minVersions must be < maxVersions) Added a simple client side test (that also serves as an example of how one would use a raw scanner from the client side).
          Hide
          Jonathan Gray added a comment -

          +1 to v16 for commit to trunk. You are a good man, Lars. Well done. And thanks for being patient.

          Show
          Jonathan Gray added a comment - +1 to v16 for commit to trunk. You are a good man, Lars. Well done. And thanks for being patient.
          Hide
          Lars Hofhansl added a comment -

          Committing v16 to trunk within the next 30 mins. Last chance to speak up

          Show
          Lars Hofhansl added a comment - Committing v16 to trunk within the next 30 mins. Last chance to speak up
          Hide
          Lars Hofhansl added a comment -

          Committed to trunk.

          Show
          Lars Hofhansl added a comment - Committed to trunk.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2353 (See https://builds.apache.org/job/HBase-TRUNK/2353/)
          HBASE-4536 Allow CF to retain deleted rows

          larsh :
          Files :

          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/docbkx/book.xml
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          • /hbase/trunk/src/main/ruby/hbase/admin.rb
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2353 (See https://builds.apache.org/job/HBase-TRUNK/2353/ ) HBASE-4536 Allow CF to retain deleted rows larsh : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/docbkx/book.xml /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java /hbase/trunk/src/main/ruby/hbase/admin.rb /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review3009
          -----------------------------------------------------------

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2178/#comment6726>

          Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode.

          • Prakash

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review3009 ----------------------------------------------------------- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2178/#comment6726 > Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode. Prakash On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-11-02 06:44:54, Prakash Khemani wrote:

          > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java, lines 210-212

          > <https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line210>

          >

          > Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode.

          The early out only happens when miversions is not set. Check out *ColumnTracker

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2178/#review3009
          -----------------------------------------------------------

          On 2011-10-18 21:43:38, Lars Hofhansl wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2178/

          -----------------------------------------------------------

          (Updated 2011-10-18 21:43:38)

          Review request for hbase, Ted Yu and Jonathan Gray.

          Summary

          -------

          HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.

          This did not work for deletes, however. Deletes would always mask all puts in the past.

          This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.

          These rows are still subject to TTL and/or VERSIONS.

          This changes the following:

          1. There is a new flag on HColumnDescriptor enabling that behavior.

          2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.

          3. Do not unconditionally collect all deleted rows during a compaction.

          4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

          The change is small'ish, but the logic is intricate, so please review carefully.

          This addresses bug HBASE-4536.

          https://issues.apache.org/jira/browse/HBASE-4536

          Diffs

          -----

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362

          http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362

          Diff: https://reviews.apache.org/r/2178/diff

          Testing

          -------

          All tests pass now.

          Thanks,

          Lars

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-11-02 06:44:54, Prakash Khemani wrote: > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , lines 210-212 > < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line210 > > > Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode. The early out only happens when miversions is not set. Check out *ColumnTracker Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/#review3009 ----------------------------------------------------------- On 2011-10-18 21:43:38, Lars Hofhansl wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2178/ ----------------------------------------------------------- (Updated 2011-10-18 21:43:38) Review request for hbase, Ted Yu and Jonathan Gray. Summary ------- HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around. This did not work for deletes, however. Deletes would always mask all puts in the past. This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows. These rows are still subject to TTL and/or VERSIONS. This changes the following: 1. There is a new flag on HColumnDescriptor enabling that behavior. 2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker. 3. Do not unconditionally collect all deleted rows during a compaction. 4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows. The change is small'ish, but the logic is intricate, so please review carefully. This addresses bug HBASE-4536 . https://issues.apache.org/jira/browse/HBASE-4536 Diffs ----- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362 Diff: https://reviews.apache.org/r/2178/diff Testing ------- All tests pass now. Thanks, Lars

            People

            • Assignee:
              Lars Hofhansl
              Reporter:
              Lars Hofhansl
            • Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development