Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Fix Version/s: 0.8 beta 1
    • Component/s: Core
    • Labels:
      None

      Description

      Obey timestampOfLastDelete during reconciliation.

        Issue Links

          Activity

          Hide
          Kelvin Kakugawa added a comment -

          timestampOfLastDelete was added and 90% implemented. Missed this last logic.

          Show
          Kelvin Kakugawa added a comment - timestampOfLastDelete was added and 90% implemented. Missed this last logic.
          Hide
          Kelvin Kakugawa added a comment -

          add logic to reconcile; test logic.

          Show
          Kelvin Kakugawa added a comment - add logic to reconcile; test logic.
          Hide
          Kelvin Kakugawa added a comment -

          modified patch w/o dependencies on expiring counter column code.

          Show
          Kelvin Kakugawa added a comment - modified patch w/o dependencies on expiring counter column code.
          Hide
          Sylvain Lebresne added a comment -

          Before commenting on the patch itself, I want to use this ticket to recall that counter deletes are intrinsically broken. It has been said already but I'll use this comment to explain in more depth how so and keep a trace of this for the record.

          First, I'll use the following notation:

            c(x, 3)@[4, 2] - for a counter column of name x, value 3, timestamp 4 and timestampOfLastDelete 2 (I'll use -1 as the min timestampOfLastDelete).
          

          and

            d(x)@[5] - for a tombstone of name x and timestamp 5
          

          And now suppose that the following inserts are done (in that order):

             c(x, 1)@[1, -1]
             d(x)@[2]
             c(x, 1)@[3, -1]
          

          If these inserts are resolved in that order, everything is fine:

             c(x, 1)@[1, -1]
           + d(x)@[2]
          => d(x)@[2]
           + c(x, 1)@[3, -1]
          => c(x, 1)@[3, 2]
          

          However, some reordering don't work. Namely, if you merge the two counts together, before you merge one of the count with the delete:

             c(x, 1)@[1, -1]
           + c(x, 1)@[3, -1]
          => c(x, 2)@[3, -1]
           + d(x)@[2]
          => c(x, 2)@[3, 2]
          

          The problem is, the resolve operation is not commutative when you consider counter columns and tombstones. But Cassandra rely heavily on resolve being commutative (as a side note, I never understood the reason of the CommutativeType terminology in the code. It suggest that regular columns are not commutative, while they are as far as resolve is concerned. Resolve is not idempotent on counters however).

          Not only is there no guarantee on which order the insert will be received by each node, but even if they are in the right order, there is no guarantee that (minor) compaction won't screw up this.

          Hence I think that there is not much guarantee we can give on deletes. The only one I can think of is that when on issue a delete, you must wait to issue any following update that the delete have reach all the nodes and all of them have been fully compacted.

          That being said, we can keep counter deletes. It's at least useful for cases where you know that you won't reuse a counter ever and want to get rid of the disk space. But I would add a very strong warning to its documentation.

          Lastly, the deletion of full counter rows or super columns suffers the same problem for the same reason.

          Show
          Sylvain Lebresne added a comment - Before commenting on the patch itself, I want to use this ticket to recall that counter deletes are intrinsically broken. It has been said already but I'll use this comment to explain in more depth how so and keep a trace of this for the record. First, I'll use the following notation: c(x, 3)@[4, 2] - for a counter column of name x, value 3, timestamp 4 and timestampOfLastDelete 2 (I'll use -1 as the min timestampOfLastDelete). and d(x)@[5] - for a tombstone of name x and timestamp 5 And now suppose that the following inserts are done (in that order): c(x, 1)@[1, -1] d(x)@[2] c(x, 1)@[3, -1] If these inserts are resolved in that order, everything is fine: c(x, 1)@[1, -1] + d(x)@[2] => d(x)@[2] + c(x, 1)@[3, -1] => c(x, 1)@[3, 2] However, some reordering don't work. Namely, if you merge the two counts together, before you merge one of the count with the delete: c(x, 1)@[1, -1] + c(x, 1)@[3, -1] => c(x, 2)@[3, -1] + d(x)@[2] => c(x, 2)@[3, 2] The problem is, the resolve operation is not commutative when you consider counter columns and tombstones. But Cassandra rely heavily on resolve being commutative (as a side note, I never understood the reason of the CommutativeType terminology in the code. It suggest that regular columns are not commutative, while they are as far as resolve is concerned. Resolve is not idempotent on counters however). Not only is there no guarantee on which order the insert will be received by each node, but even if they are in the right order, there is no guarantee that (minor) compaction won't screw up this. Hence I think that there is not much guarantee we can give on deletes. The only one I can think of is that when on issue a delete, you must wait to issue any following update that the delete have reach all the nodes and all of them have been fully compacted. That being said, we can keep counter deletes. It's at least useful for cases where you know that you won't reuse a counter ever and want to get rid of the disk space. But I would add a very strong warning to its documentation. Lastly, the deletion of full counter rows or super columns suffers the same problem for the same reason.
          Hide
          Sylvain Lebresne added a comment -

          +1 on the patch in the spirit of "let's make delete work as often as we can".

          However, I realized that the first 'if' of CounterColumn.reconcile() is dead-code since a CounterColumn is never markedForDelete(). It would be nice to remove it while submitting this

          Show
          Sylvain Lebresne added a comment - +1 on the patch in the spirit of "let's make delete work as often as we can". However, I realized that the first 'if' of CounterColumn.reconcile() is dead-code since a CounterColumn is never markedForDelete(). It would be nice to remove it while submitting this
          Hide
          Kelvin Kakugawa added a comment -

          Good point, Sylvain!

          Attached a new patch w/ the irrelevant code removed.

          Show
          Kelvin Kakugawa added a comment - Good point, Sylvain! Attached a new patch w/ the irrelevant code removed.
          Hide
          Sylvain Lebresne added a comment -

          So +1 on the new patch

          Show
          Sylvain Lebresne added a comment - So +1 on the new patch
          Hide
          Jonathan Ellis added a comment -

          committed

          Show
          Jonathan Ellis added a comment - committed
          Hide
          Hudson added a comment -

          Integrated in Cassandra #709 (See https://hudson.apache.org/hudson/job/Cassandra/709/)
          add delete support for counters
          patch by Kelvin Kakugawa; reviewed by slebresne for CASSANDRA-2101

          Show
          Hudson added a comment - Integrated in Cassandra #709 (See https://hudson.apache.org/hudson/job/Cassandra/709/ ) add delete support for counters patch by Kelvin Kakugawa; reviewed by slebresne for CASSANDRA-2101

            People

            • Assignee:
              Kelvin Kakugawa
              Reporter:
              Kelvin Kakugawa
              Reviewer:
              Sylvain Lebresne
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development