Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Delete is incomplete in HBase, and what is there is inconsistent. Deleted records currently persist and are never cleaned up. This issue is about making delete behavior coherent across gets, scans and compaction.

      Below is from a bit of back and forth between Jim and myself, in which Jim takes a stab at outlining a model for delete, taking inspiration from how Digital's versioned file system used to work:

      Let's say you have 5 versions with timestamps T1, T2, ..., T5 where
      timestamps are increasing from T1 to T5 (so T5 is the newest).
      
      Before any deletes occur, if you don't specify a timestamp and request N
      versions, you should get T5 first, then T4, T3, ... until you have
      reached N or you run out of versions.
      
      Now add deletes:
      
      (In the following, timestamp refers to the timestamp associated with
      the delete operation)
      
      1. If no timestamp is specified we are deleting the latest version.
         If a get or scanner specifies that it wants N versions, then it 
         should get T4, T3, ..., until we have N versions or we run out of
         older versions. After compaction, the deletion record and T5 should
         be elided from the HStore.
      
      2. If a timestamp is specified and it exactly matches a version (say
         T4) and a get or scanner requests N versions, then the client
         receives T5, T3, T2, ... until we satisfy N or run out of versions.
         After a compaction, the deletion record and T4 should be elided
         from the HStore.
      
      3. If a timestamp is specified and does not exactly match a version,
         it means delete every version older than this timestamp. If the
         timestamp is greater than T5 all versions are considered to be
         deleted and a get or a scanner will return no results even if
         the get or scanner specifies an older time. This is consistent
         with the concept of delete all versions older than timestamp.
         After a compaction, the delete record and all the values should
         be elided.
      
         If the specified timestamp falls between two older versions (say
         T4 and T3) then T3, T2 and T1 are considered to be deleted (again
         this is all versions older than timestamp). A get or scanner
         that specifies no time but requests N versions can only get T5
         and T4. A get or scanner that requests a time of T3 or earlier
         will get no results because those versions are deleted. After
         a compaction, the deletion record and the deleted versions
         are elided from the HStore.
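
The three rules above can be sketched as a small, self-contained model. This is only an illustration of the proposed semantics for one row/column, not HBase code: the class name `DeleteModelSketch`, the `NO_TIMESTAMP` sentinel and both method names are assumptions made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the three proposed delete rules; not HBase code. */
final class DeleteModelSketch {
    /** Sentinel meaning "no timestamp was given" (an assumption of this sketch). */
    static final long NO_TIMESTAMP = -1L;

    /**
     * Returns the timestamps still visible after a single delete, newest first.
     * The versions list must be sorted ascending (T1 .. T5).
     */
    static List<Long> visibleAfterDelete(List<Long> versions, long deleteTs) {
        List<Long> visible = new ArrayList<>();
        if (versions.isEmpty()) {
            return visible;
        }
        long newest = versions.get(versions.size() - 1);
        boolean exactMatch = versions.contains(deleteTs);
        for (int i = versions.size() - 1; i >= 0; i--) {
            long ts = versions.get(i);
            boolean deleted;
            if (deleteTs == NO_TIMESTAMP) {
                deleted = ts == newest;      // rule 1: delete the latest version
            } else if (exactMatch) {
                deleted = ts == deleteTs;    // rule 2: delete exactly that version
            } else {
                deleted = ts < deleteTs;     // rule 3: delete everything older
            }
            if (!deleted) {
                visible.add(ts);
            }
        }
        return visible;
    }

    /** Applies the "request N versions" limit on top of the delete rules. */
    static List<Long> get(List<Long> versions, long deleteTs, int n) {
        List<Long> visible = visibleAfterDelete(versions, deleteTs);
        return new ArrayList<>(visible.subList(0, Math.min(n, visible.size())));
    }
}
```

With versions T1..T5 modelled as 10, 20, 30, 40, 50, a delete at 35 (between T3 and T4) leaves only 50 and 40 visible, and a delete at 60 leaves nothing.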
      
      1. delete1.patch
        64 kB
        stack
      2. delete2.patch
        84 kB
        stack
      3. delete3.patch
        98 kB
        stack
      4. delete4.patch
        99 kB
        stack

          Activity

          stack added a comment -

          Committed v4. Resolving the issue.

          Hadoop QA added a comment -

          +1

          http://issues.apache.org/jira/secure/attachment/12365442/delete4.patch applied and successfully tested against trunk revision r573777.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/722/testReport/
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/722/console
          stack added a comment -

          Retry against hudson

          stack added a comment -

          Print out the value passed to the assertion, and change the assertion in testCompaction to allow that there may be 3 or 4 versions of the cell instead of asserting that there are exactly 4.

          stack added a comment -

          I suppose it's possible that compaction could start after the cacheflush thread finishes (it doesn't on my old single-processor Linux box).

          Hadoop QA added a comment -

          -1, build or testing failed

          2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12365426/delete3.patch against trunk revision r573777.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/721/testReport/
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/721/console

          Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

          stack added a comment -

          Passes all tests locally. Try Hudson.

          stack added a comment -

          TestScanner2 was failing because, after the rework, a row of all deleted values would stop the scanner (I heart unit tests). Version 3 of this patch includes the fix. Below is an updated commit message; it notes the new fix and edits the previous message.

          HADOOP-1784 delete
          Fix scanners and gets so they work properly in presence of deletes.
          Added a deleteAll to remove all cells equal to or older than passed
          timestamp.  Fixed compaction so deleted cells do not make it out into
          compacted output.  Ensure also that versions > column max are dropped
          compacting.
          
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
              (Loader): Renamed 'Loader' Interface as 'Incommon' -- as in the
              methods HTable and HRegion have in common -- because now does
              more than 'loading'.  Added getters, delete, deleteAll and scanners
              and amended the implementations of Incommon particular for HTable
              and HRegion.
              (createTableDescriptor): Add override so can specify column versions.
              (FlushCache): Added an interface that can be implemented by things
              that flush their cache (e.g. HRegion and HTable).
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
              (flushcache): Added. Flushes all regionserver regions.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
              (testDelete, testTimestampScanning): Refactored so the local tests that
              run against an HRegion can also be run -- via the Incommon interface
              -- from the client side with HTable inside testTimestamps.
              (doTestDelete, assertOnlyLatest, assertVersions, 
                doTestTimestampScanning, assertScanContentTimestamp, put,
                delete): Added.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/MultiRegionTable.java
              Renamed Loader interface as Incommon.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestCompaction.java
              Add assertions that on compaction deleted rows are dropped and that
              versions > column maximum versions are also dropped.
              (setUp, tearDown): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
              javadoc edits
              (getFilesToCompact): Changed so list of files is ordered from newest to
              oldest.  Was doing oldest to newest.
              (compact): Keep running list per row of whats been deleted. Used checking
              later encountered cells.  If key matches deleted cell seen earlier, the
              later cell is not added to compacted output.
              (isDeleted): Added.  Checks running list of deletes found locally but
              also consults memcache in case it has deletes for current cell (and
              therefore we should not return this version of the cell).
              (get): Keep running list per row of whats been deleted. Used checking
              later encountered cells.
              (hasEnoughVersions, getKeys): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
              (batchUpdate): If passed timestamp is LATEST_TIMESTAMP, then all
              puts get the server's current timestamp.  Deletes get special handling.
              We fetch the 'latest' cell of same row and column and using ITS
              timestamp, we write a delete record.  Otherwise, works as previous.
              (deleteAll): Added.
          M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HScannerInterface.java
              javadoc.  Changed next method param from TreeMap to more generic SortedMap
              so could have HInternalScannerInterface extend this Interface.
              A minor inconvenience is that the Close in this base interface throws
              IOException whereas HInternalScannerInterface does not (had to add
              'useless' try/catch in two close locations).  Fix..
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
             (EMPTY_TEXT, ALL_VERSIONS, LATEST_TIMESTAMP): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
              javadoc edit.
              (deleteAll): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
              Removed a few superfluous List allocations.
              (getKeys, isDeleted): Added.  isDeleted is called when looking at
              store cells to see if memcache has a delete to X them out.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInterface.java
              javadoc edit.
              (deleteAll): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HInternalScannerInterface.java
              Made it inherit from HScannerInterface.  Remove next and close (They
              are inherited).
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HAbstractScanner.java
              (next): Changed param from TreeMap to SortedMap.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/ImmutableBytesWritable.java
              (equals): Works if passed bytes too.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchOperation.java
              Redid batch operations as enums. Made constructors cascade.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchUpdate.java
              javadoc edit and fixed eclipse complaints about param names being same
              as data member names..
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
             (get): Fix count of versions across memcache and stores.  We were not
             doing version aggregating counts properly.  Also fix so properly ordered.
             (deleteAll, update, getKeys, deleteMultiple): Added.
             (next): Changed param from TreeMap to SortedMap.  Keep running list of
             deleted cells used looking at later versions.  If a cell is 'deleted',
             set the 'filtered' flag to true else scan would not go past the 'deleted'
             row.  If no results found, do not return true (that there are more 
             possible values).  Made the test of chosenTimestamp >= rather than just
             > when checking to see if more (0 may be a legit timestamp).
             (commit): Added a commit override used in unit tests emulating
             batchUpdate operation in HRegionServer.
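
The compaction behaviour described in the commit message (walk cells newest to oldest, keep a running list of deletes, drop any later cell whose key matches an earlier delete, and drop versions beyond the column maximum) could look roughly like the sketch below. It is a simplified model for a single row/column, not the actual HStore.compact code: `CompactionSketch`, `Cell` and the use of a null value to mark a delete record are all assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical single row/column model of the compaction pass; not HStore code. */
final class CompactionSketch {
    /** A cell version; a null value marks a delete record in this sketch. */
    static final class Cell {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        boolean isDelete() {
            return value == null;
        }
    }

    /**
     * Input is ordered newest to oldest, with each delete record appearing
     * ahead of the cell it X-es out. Returns the cells that survive compaction.
     */
    static List<Cell> compact(List<Cell> cellsNewestFirst, int maxVersions) {
        Set<Long> deleted = new HashSet<>();   // running list of deleted timestamps
        List<Cell> out = new ArrayList<>();
        for (Cell c : cellsNewestFirst) {
            if (c.isDelete()) {
                deleted.add(c.timestamp);      // remember it; do not write it out
                continue;
            }
            if (deleted.contains(c.timestamp)) {
                continue;                      // matches an earlier delete: drop
            }
            if (out.size() >= maxVersions) {
                continue;                      // beyond the column maximum: drop
            }
            out.add(c);
        }
        return out;
    }
}
```

In this model neither the delete records nor the cells they cover reach the compacted output, which is the clean-up the issue asks for.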
          
          stack added a comment -

          Here's a patch to finish the delete work. TestScanner2 is not passing. Need to investigate...

          HADOOP-1784 delete
          Fix scanners and gets so they work properly in presence of deletes.
          Added a deleteAll to remove all cells equal to or older than passed
          timestamp.  Fixed compaction so deleted cells do not make it into
          compacted output.
          
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
              (Loader): Renamed 'Loader' Interface as 'Incommon' because now does
              more than 'loading'.  Added getters, delete, deleteAll and scanners
              and amended implementations for HTable and HRegion.
              (createTableDescriptor): Add override so can specify column versions.
              (FlushCache): Added an interface that can be implemented by things
              that flush their cache (e.g. HRegion and HTable).
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
              (flushcache): Added.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
              (testDelete, testTimestampScanning): Refactored so tests could be
              run from client side inside testTimestamps.
              (doTestDelete, assertOnlyLatest, assertVersions, 
                doTestTimestampScanning, assertScanContentTimestamp, put,
                delete): Added.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/MultiRegionTable.java
              Renamed Loader interface as Incommon.
          M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestCompaction.java
              Add assertions that on compaction deleted rows are dropped and that
              versions > column maximum versions are also dropped.
              (setUp, tearDown): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
              (getFilesToCompact): Changed so list returned newest to oldest.
              Was doing oldest to newest.
              (compact): Keep running list per row of whats been deleted. Used checking
              later encountered cells.  If key matches deleted cell seen earlier, the
              later cell is not added to compacted output.
              (isDeleted): Added.  Used while getting and scanning store files
              in case there is a delete of a specific cell over in memcache.
              (get): Keep running list per row of whats been deleted. Used checking
              later encountered cells.  If later key matches a deleted cell, the cell
              is not returned.  Also, consult memcache.  Memcache could have a delete
              for the record-to-return.
              (hasEnoughVersions, getKeys): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
              (deleteAll): Added.
          M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HScannerInterface.java
              javadoc.  Changed next param TreeMap to SortedMap so could have
              HInternalScannerInterface inherit from this base..
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
             (EMPTY_TEXT): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
              javadoc.
              (deleteAll): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
              (getKeys, isDeleted): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInterface.java
              (deleteAll): Added.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HInternalScannerInterface.java
              Made it inherit from HScannerInterface.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HAbstractScanner.java
              (next): Changed param from TreeMap to SortedMap.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/ImmutableBytesWritable.java
              (equals): Allow passing byte [].
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchOperation.java
              Redid batch operations as enums.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/io/BatchUpdate.java
              javadoc.
          M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
             (get): Fix count of versions across memcache and stores.  We were not
             doing version count properly.  Also fix so properly ordered.
             (next): Changed param from TreeMap to SortedMap.  Keep running list of
             deleted cells used looking at later versions.
          
          stack added a comment -

          Patch of work so far. Still have clean up of delete records on compaction to do.

          stack added a comment -

          After study, 1. and 2. above are straightforward. 3., as described, is an expensive way of deleting everything behind a particular timestamp. Here's why:

          On every get, I need to check the future. That is, I need to read rows in front of the currently specified row/column/timestamp combination to see if there is a delete record with the same row/column but a timestamp ahead of the stipulated one. If one is found, then matching records should not be returned because they have been 'deleted'.

          Instead, let's build delete-all-versions-behind-a-specified-timestamp using the basic delete mechanism, the facility whereby a cell is X-d out by the presence of a delete cell of the exact same row/column/timestamp appearing ahead of the non-null cell. The API currently has a delete that takes a column name; on commit you specify the timestamp. This deletes a single cell value. To facilitate bulk delete operations, we'll add a deleteAll to the API. Internally, this will find all cells that match the specified row/column and delete all cells with timestamps equal to or older than the one passed.

          Later, folks might want to do things like delete only the X oldest revisions, or delete only cells of exactly the same timestamp, but I'll wait until it's asked for before attempting an implementation.
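
As a rough illustration of the idea above, deleteAll can be modelled as a loop that issues one basic delete per existing cell, so that reads only ever need an exact row/column/timestamp match and never have to look ahead for a covering delete. The sketch below models a single row/column in memory; `DeleteAllSketch` and its fields are hypothetical names, and only the `delete` and `deleteAll` names come from the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory model of one row/column; not the HBase API. */
final class DeleteAllSketch {
    /** timestamp -> value for the stored versions. */
    private final TreeMap<Long, String> cells = new TreeMap<>();
    /** Timestamps that carry a delete record. */
    private final TreeSet<Long> deleteMarkers = new TreeSet<>();

    void put(long timestamp, String value) {
        cells.put(timestamp, value);
    }

    /** Basic delete: X out the cell with exactly this timestamp. */
    void delete(long timestamp) {
        deleteMarkers.add(timestamp);
    }

    /** Bulk delete: one basic delete per existing cell at or before the timestamp. */
    void deleteAll(long timestamp) {
        for (Long cellTs : cells.headMap(timestamp, true).keySet()) {
            delete(cellTs);
        }
    }

    /** Returns up to n visible values, newest first, using exact-match checks only. */
    List<String> get(int n) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : cells.descendingMap().entrySet()) {
            if (out.size() >= n) {
                break;
            }
            if (!deleteMarkers.contains(e.getKey())) {
                out.add(e.getValue());
            }
        }
        return out;
    }
}
```

The cost moves from read time to delete time: deleteAll has to find the matching cells once, but each get stays a simple per-cell check.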


            People

            • Assignee:
              Unassigned
              Reporter:
              stack