Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 1
    • Component/s: Core
    • Labels:
      None

      Description

      In the latest CASSANDRA-580 patch the timestamp used for conflict resolution has been replaced by a Clock object that can contain either timestamp or a context byte array. That change allows for other conflict resolution techniques to be used.
      The change can be broken out and submitted here in order to make CASSANDRA-580 more manageable.

      1. 0004-v9-Update-of-thrift-generated-files.patch
        336 kB
        Sylvain Lebresne
      2. 0003-v9-Update-system-tests.patch
        46 kB
        Sylvain Lebresne
      3. 0002-v9-Update-unit-tests.patch
        72 kB
        Sylvain Lebresne
      4. 0001-v9-Introduces-wrapper-for-clock-types-generalizing-time.patch
        112 kB
        Sylvain Lebresne
      5. 0004-Update-of-thrift-generated-files.patch
        346 kB
        Sylvain Lebresne
      6. 0001-v8-Introduces-wrapper-for-clock-types-generalizing-time.patch
        112 kB
        Sylvain Lebresne
      7. 0003-v6-Update-System-tests.patch
        44 kB
        Johan Oskarsson
      8. 0002-v6-Update-unit-tests.patch
        68 kB
        Johan Oskarsson
      9. 0001-v6-Introduces-wrapper-for-clock-types-generalizing-time.patch
        105 kB
        Johan Oskarsson
      10. 0001-v5-Introduces-wrapper-for-clock-types-generalizing-time.patch
        115 kB
        Sylvain Lebresne
      11. 0001-v4-Introduces-wrapper-for-clock-types-generalizing-time.patch
        115 kB
        Kelvin Kakugawa
      12. 0001-v2-Introduces-wrapper-for-clock-types-generalizing-time.patch
        115 kB
        Sylvain Lebresne
      13. 0003-Update-System-tests.patch
        46 kB
        Sylvain Lebresne
      14. 0002-Update-unit-tests.patch
        75 kB
        Sylvain Lebresne
      15. 0001-Introduces-wrapper-for-clock-types-generalizing-time.patch
        112 kB
        Sylvain Lebresne

        Activity

        Hide
        Sylvain Lebresne added a comment -

        Attached patches introduces the clock wrapper but only introduces the TimestampClock (which is the point
        of this particular ticket). I've mainly extracted the relevant parts from 580 (from a version more recent than
        the one attached to jira though), rebased against trunk and adapted post-1058 (but with a few other
        modifications, in particular to make avro happy).
        The merit goes to the author of 580, but the eventual screw up may well be mine.

        Show
        Sylvain Lebresne added a comment - Attached patches introduces the clock wrapper but only introduces the TimestampClock (which is the point of this particular ticket). I've mainly extracted the relevant parts from 580 (from a version more recent than the one attached to jira though), rebased against trunk and adapted post-1058 (but with a few other modifications, in particular to make avro happy). The merit goes to the author of 580, but the eventual screw up may well be mine.
        Hide
        Johan Oskarsson added a comment -

        Had a quick look at the patch, overall it looks good. Couple of minor points:

        cassandra.thrift still has a few references to timestamp instead of clock. Would also be good to add some more user friendly documentation to the Clock struct.

        Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us?

        A class level javadoc would be good for the ClockType and TimestampClock classes, explaining what they are for. Esp in the timestamp one it might help users when trying to pick the correct type of clock to use.

        The TimestampClock compareTo unnecessarily creates two Long objects.

        In FBUtilities it would be good to have some javadoc for atomicSetMax. I'm not a big fan of the "goto" in there either, but perhaps that's just me.

        Show
        Johan Oskarsson added a comment - Had a quick look at the patch, overall it looks good. Couple of minor points: cassandra.thrift still has a few references to timestamp instead of clock. Would also be good to add some more user friendly documentation to the Clock struct. Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us? A class level javadoc would be good for the ClockType and TimestampClock classes, explaining what they are for. Esp in the timestamp one it might help users when trying to pick the correct type of clock to use. The TimestampClock compareTo unnecessarily creates two Long objects. In FBUtilities it would be good to have some javadoc for atomicSetMax. I'm not a big fan of the "goto" in there either, but perhaps that's just me.
        Hide
        Sylvain Lebresne added a comment -

        cassandra.thrift still has a few references to timestamp instead of clock. Would also be good to add some more user friendly documentation to the Clock struct.

        A class level javadoc would be good for the ClockType and TimestampClock classes, explaining what they are for. Esp in the timestamp one it might help users when trying to pick the correct type of clock to use.

        In FBUtilities it would be good to have some javadoc for atomicSetMax. I'm not a big fan of the "goto" in there either, but perhaps that's just me.

        Attached v2 patch try to improve on those a bit. But as of this ticket, the clock generalization is more boilerplate
        than anything else, and I think we will have a better story to tell in the comments when we have more than one type
        of clock.
        As for where the user is concerned, I think we will need a wiki page that explains the clock choice in details, but again, it feels
        like it will make more sense with the actual patch for version (and/or counters) clocks.

        Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us?

        I bet I don't know much more than you do. But everything seems to work fine without the genavro file (the avro system test (as small as
        they are) pass). Feels like it's not necessary to me but I may be wrong.

        The TimestampClock compareTo unnecessarily creates two Long objects.

        Yes, that was stupid of me. In it's patch, Kelvin was doing the conversation by converting to double. I'm unsure this was
        necessary (but maybe Kelvin has a good reason). Anyway, I don't know what I was smoking when I wrote that but the
        attached patch correct that as well.

        Thanks for the comments

        Show
        Sylvain Lebresne added a comment - cassandra.thrift still has a few references to timestamp instead of clock. Would also be good to add some more user friendly documentation to the Clock struct. A class level javadoc would be good for the ClockType and TimestampClock classes, explaining what they are for. Esp in the timestamp one it might help users when trying to pick the correct type of clock to use. In FBUtilities it would be good to have some javadoc for atomicSetMax. I'm not a big fan of the "goto" in there either, but perhaps that's just me. Attached v2 patch try to improve on those a bit. But as of this ticket, the clock generalization is more boilerplate than anything else, and I think we will have a better story to tell in the comments when we have more than one type of clock. As for where the user is concerned, I think we will need a wiki page that explains the clock choice in details, but again, it feels like it will make more sense with the actual patch for version (and/or counters) clocks. Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us? I bet I don't know much more than you do. But everything seems to work fine without the genavro file (the avro system test (as small as they are) pass). Feels like it's not necessary to me but I may be wrong. The TimestampClock compareTo unnecessarily creates two Long objects. Yes, that was stupid of me. In it's patch, Kelvin was doing the conversation by converting to double. I'm unsure this was necessary (but maybe Kelvin has a good reason). Anyway, I don't know what I was smoking when I wrote that but the attached patch correct that as well. Thanks for the comments
        Hide
        Eric Evans added a comment -

        Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us?

        I have a few i's to dot and t's to cross before making cassandra.genavro the canonical source for the avro protocol. Until then, you can safely ignore it; I'll migrate over any changes made to cassandra.avpr.

        Show
        Eric Evans added a comment - Only the cassandra.avpr file is changed, the genavro is not. I don't if we use the genavro file though, can Eric Evans enlighten us? I have a few i's to dot and t's to cross before making cassandra.genavro the canonical source for the avro protocol. Until then, you can safely ignore it; I'll migrate over any changes made to cassandra.avpr.
        Hide
        Kelvin Kakugawa added a comment -

        update patch to reflect ttl

        Show
        Kelvin Kakugawa added a comment - update patch to reflect ttl
        Hide
        Kelvin Kakugawa added a comment -

        slightly modified atomicSetMax's docstring

        Thanks Johan and Sylvain for pointing out the improvement to TimestampClock.compare() and making the fix.

        Show
        Kelvin Kakugawa added a comment - slightly modified atomicSetMax's docstring Thanks Johan and Sylvain for pointing out the improvement to TimestampClock.compare() and making the fix.
        Hide
        Sylvain Lebresne added a comment -

        Attached v5 patch correct a bug that Johan pointed out to me.
        When initializing the system table, a few column were wrongly inserted
        with a timestamp of Long.MIN_VALUE instead of 0.

        I've also propagated the ttl fix to avro. It made me realize that it was missing
        from the patch for #1109 btw. Do you want me to extract that specific part
        and attach a patch to #1109 (I don't know if I can reopen it as it was commited).

        Show
        Sylvain Lebresne added a comment - Attached v5 patch correct a bug that Johan pointed out to me. When initializing the system table, a few column were wrongly inserted with a timestamp of Long.MIN_VALUE instead of 0. I've also propagated the ttl fix to avro. It made me realize that it was missing from the patch for #1109 btw. Do you want me to extract that specific part and attach a patch to #1109 (I don't know if I can reopen it as it was commited).
        Hide
        Johan Oskarsson added a comment -

        The v5 of the patch now fails to apply cleanly to trunk, can you cut a new patch for us Sylvain?

        Show
        Johan Oskarsson added a comment - The v5 of the patch now fails to apply cleanly to trunk, can you cut a new patch for us Sylvain?
        Hide
        Johan Oskarsson added a comment -

        Updated patches to latest trunk

        Show
        Johan Oskarsson added a comment - Updated patches to latest trunk
        Hide
        Jonathan Ellis added a comment -

        what is the point of messing with the sample KS/CF defs? is that just search and replace out of control a bit?

        Show
        Jonathan Ellis added a comment - what is the point of messing with the sample KS/CF defs? is that just search and replace out of control a bit?
        Hide
        Jonathan Ellis added a comment -

        using "fluent" thrift api is considered more idiomatic now:

        + Clock thrift_clock = new Clock().setTimestamp(timestampMicros());

        instead of

        + Clock thrift_clock = new Clock();
        + thrift_clock.setTimestamp(timestampMicros());

        Show
        Jonathan Ellis added a comment - using "fluent" thrift api is considered more idiomatic now: + Clock thrift_clock = new Clock().setTimestamp(timestampMicros()); instead of + Clock thrift_clock = new Clock(); + thrift_clock.setTimestamp(timestampMicros());
        Hide
        Jonathan Ellis added a comment -

        avoiding NPE by pushing it up to the caller is not preferred, e.g.

        + if (cfMetaData == null)
        + return null;
        + return cfMetaData.clockType;

        better:

        assert cfMetaData != null;

        Show
        Jonathan Ellis added a comment - avoiding NPE by pushing it up to the caller is not preferred, e.g. + if (cfMetaData == null) + return null; + return cfMetaData.clockType; better: assert cfMetaData != null;
        Hide
        Jonathan Ellis added a comment -

        + public static ColumnSerializer serializer(ClockType clockType)

        the convention everywhere is that serializer() takes no arguments, the serialization op should inspect the clocktype instead

        Show
        Jonathan Ellis added a comment - + public static ColumnSerializer serializer(ClockType clockType) the convention everywhere is that serializer() takes no arguments, the serialization op should inspect the clocktype instead
        Hide
        Jonathan Ellis added a comment -

        why does IClock not implement Comparable instead of adding ClockRelationship?

        Show
        Jonathan Ellis added a comment - why does IClock not implement Comparable instead of adding ClockRelationship?
        Hide
        Jonathan Ellis added a comment -

        SuperColumn has some minor whitespace issues.

        Show
        Jonathan Ellis added a comment - SuperColumn has some minor whitespace issues.
        Hide
        Jonathan Ellis added a comment -

        List<IClock> clocks = new LinkedList<IClock>();
        clocks.add(newClock);
        newClock = oldClock.getSuperset(clocks);

        is better written as

        newClock = oldClock.getSuperset(Arrays.asList(oldClock));

        ... but more importantly doesn't

        IClock oldClock = atomic.getAndSet(newClock);

        temporarily set the reference to the new value (even it is smaller than the old), before fixing it to be the max? that would be a bug.

        Show
        Jonathan Ellis added a comment - List<IClock> clocks = new LinkedList<IClock>(); clocks.add(newClock); newClock = oldClock.getSuperset(clocks); is better written as newClock = oldClock.getSuperset(Arrays.asList(oldClock)); ... but more importantly doesn't IClock oldClock = atomic.getAndSet(newClock); temporarily set the reference to the new value (even it is smaller than the old), before fixing it to be the max? that would be a bug.
        Hide
        Sylvain Lebresne added a comment -

        I'll a pass over the patch with you suggestions, but a few comments first.

        why does IClock not implement Comparable instead of adding ClockRelationship?

        That's because version clocks can be disjoint (but timestamp clocks don't).
        I admit, without version clock in the patch, it may feel weird. This is one of those
        "this patch is a step before the next patch from 580".
        And I've decided to remove the DISJOINT case from ClockRelationship enum from
        this patch because having it will require the addition of the notion of reconcilier, but this
        seems beyond this patch to me.

        ... but more importantly doesn't

        IClock oldClock = atomic.getAndSet(newClock);

        temporarily set the reference to the new value (even it is smaller than the old), before fixing it to be the max? that would be a bug.

        It do look like a bug (but maybe Kelvin could also comment on this one). Just
        wanted to point out that the two previous atomicSetMax (for interger and long)
        seem to be hit by the same curse. I can fix it for this patch, but should we
        open a ticket for the other two ?

        + public static ColumnSerializer serializer(ClockType clockType)

        the convention everywhere is that serializer() takes no arguments, the serialization op should inspect the clocktype instead

        That not true, SuperColumn.serializer takes an AbstractType
        But more seriously, I'll admit I'm not sure what's the best way to do that.
        For serialization it's easy, we have the clock of the column. But for
        deserialization, we would need somehow to write that on disk. I see two
        options (at least two simple):
        1) we use the 2 bytes used for the deletion & expiration flags. Feels a bit
        like a hack though.
        2) we add a new field in the serialized form of the column.
        And the superColumn serializer would also need to know the clock type to
        deserialize the markedDeletedAt field. Here we can't use any existing flags.
        Overall, it seems to me like a waste of perfectly good cpu cycles and space
        just to avoid having an argument to a function (not a lot, granted, but still).
        Unless ... I'm not completely awaken yet and I miss something much easier.

        Show
        Sylvain Lebresne added a comment - I'll a pass over the patch with you suggestions, but a few comments first. why does IClock not implement Comparable instead of adding ClockRelationship? That's because version clocks can be disjoint (but timestamp clocks don't). I admit, without version clock in the patch, it may feel weird. This is one of those "this patch is a step before the next patch from 580". And I've decided to remove the DISJOINT case from ClockRelationship enum from this patch because having it will require the addition of the notion of reconcilier, but this seems beyond this patch to me. ... but more importantly doesn't IClock oldClock = atomic.getAndSet(newClock); temporarily set the reference to the new value (even it is smaller than the old), before fixing it to be the max? that would be a bug. It do look like a bug (but maybe Kelvin could also comment on this one). Just wanted to point out that the two previous atomicSetMax (for interger and long) seem to be hit by the same curse. I can fix it for this patch, but should we open a ticket for the other two ? + public static ColumnSerializer serializer(ClockType clockType) the convention everywhere is that serializer() takes no arguments, the serialization op should inspect the clocktype instead That not true, SuperColumn.serializer takes an AbstractType But more seriously, I'll admit I'm not sure what's the best way to do that. For serialization it's easy, we have the clock of the column. But for deserialization, we would need somehow to write that on disk. I see two options (at least two simple): 1) we use the 2 bytes used for the deletion & expiration flags. Feels a bit like a hack though. 2) we add a new field in the serialized form of the column. And the superColumn serializer would also need to know the clock type to deserialize the markedDeletedAt field. Here we can't use any existing flags. Overall, it seems to me like a waste of perfectly good cpu cycles and space just to avoid having an argument to a function (not a lot, granted, but still). Unless ... I'm not completely awaken yet and I miss something much easier.
        Hide
        Jonathan Ellis added a comment -

        > the two previous atomicSetMax (for interger and long) seem to be hit by the same curse

        you're right, let's fix those too. separate ticket or this one is fine.

        > SuperColumn.serializer takes an AbstractType

        you're right. i withdraw my objection, if that's the most natural way to do it.

        [this is why having serializer() be a class method instead of having an instance serialize(DataOutput) method sucks...]

        Show
        Jonathan Ellis added a comment - > the two previous atomicSetMax (for interger and long) seem to be hit by the same curse you're right, let's fix those too. separate ticket or this one is fine. > SuperColumn.serializer takes an AbstractType you're right. i withdraw my objection, if that's the most natural way to do it. [this is why having serializer() be a class method instead of having an instance serialize(DataOutput) method sucks...]
        Hide
        Sylvain Lebresne added a comment -

        Attached patch change the atomicSetMax (all of them) and
        correct the other thing pointed by Jonathan (thanks to Johan).

        The other patch is just the thrift generated file.

        Show
        Sylvain Lebresne added a comment - Attached patch change the atomicSetMax (all of them) and correct the other thing pointed by Jonathan (thanks to Johan). The other patch is just the thrift generated file.
        Hide
        Jonathan Ellis added a comment -

        I think atomicSetMax needs to use compareAndSet to really be correct, or it's still possible that another thread sets the value to a larger one after our compare but before our getAndSet, resulting in us still temporarily decreasing the value

        Show
        Jonathan Ellis added a comment - I think atomicSetMax needs to use compareAndSet to really be correct, or it's still possible that another thread sets the value to a larger one after our compare but before our getAndSet, resulting in us still temporarily decreasing the value
        Hide
        Sylvain Lebresne added a comment -

        I think atomicSetMax needs to use compareAndSet to really be correct, or it's still possible that another thread sets the value to a larger one after our compare but before our getAndSet, resulting in us still temporarily decreasing the value

        You're absolutely right. Hopefully new attached patch get it right

        Show
        Sylvain Lebresne added a comment - I think atomicSetMax needs to use compareAndSet to really be correct, or it's still possible that another thread sets the value to a larger one after our compare but before our getAndSet, resulting in us still temporarily decreasing the value You're absolutely right. Hopefully new attached patch get it right
        Hide
        Jonathan Ellis added a comment -

        atomicSetMax looks good now.

        Getting this test failure:

        form:git-trunk jonathan$ nosetests -x test/system/test_thrift_server.py
        ...............................................E
        ======================================================================
        ERROR: Test that column ttled do expires using batch_mutate
        ----------------------------------------------------------------------
        ...
        AttributeError: 'int' object has no attribute 'timestamp'
        
        Show
        Jonathan Ellis added a comment - atomicSetMax looks good now. Getting this test failure: form:git-trunk jonathan$ nosetests -x test/system/test_thrift_server.py ...............................................E ====================================================================== ERROR: Test that column ttled do expires using batch_mutate ---------------------------------------------------------------------- ... AttributeError: ' int ' object has no attribute 'timestamp'
        Hide
        Sylvain Lebresne added a comment -

        Don't know what I had smoked earlier on but my patch wasn't even compiling.
        Removing and adding a new v9 version of the patches. I've rebased the
        system tests, so that unit and system tests passes.

        Show
        Sylvain Lebresne added a comment - Don't know what I had smoked earlier on but my patch wasn't even compiling. Removing and adding a new v9 version of the patches. I've rebased the system tests, so that unit and system tests passes.
        Hide
        Jonathan Ellis added a comment -

        committed!

        Show
        Jonathan Ellis added a comment - committed!
        Hide
        Hudson added a comment -

        Integrated in Cassandra #446 (See http://hudson.zones.apache.org/hudson/job/Cassandra/446/)
        add Clock structure as a prelude to full vector clock support. patch by Sylvain Lebresne and Kelvin Kakugawa for CASSANDRA-1070; reviewed by johano and jbellis

        Show
        Hudson added a comment - Integrated in Cassandra #446 (See http://hudson.zones.apache.org/hudson/job/Cassandra/446/ ) add Clock structure as a prelude to full vector clock support. patch by Sylvain Lebresne and Kelvin Kakugawa for CASSANDRA-1070 ; reviewed by johano and jbellis

          People

          • Assignee:
            Sylvain Lebresne
            Reporter:
            Johan Oskarsson
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development