Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 2.0 beta 1
    • Component/s: None
    • Labels: None

      Description

      Asynchronous triggers are a basic mechanism for implementing various use cases that require asynchronous execution of application code on the database side, for example supporting indexes and materialized views, online analytics, and push-based data propagation.

      The motivation, a description of triggers, and a list of applications can be found at:
      http://maxgrinev.com/2010/07/23/extending-cassandra-with-asynchronous-triggers/

      An example of using triggers for indexing:
      http://maxgrinev.com/2010/07/23/managing-indexes-in-cassandra-using-async-triggers/

      Implementation details are attached.

      1. 0001-1311-v3.patch
        37 kB
        Vijay
      2. trunk-984391-update2.txt
        67 kB
        Martin Hentschel
      3. HOWTO-PatchAndRunTriggerExample-update1.txt
        0.6 kB
        Martin Hentschel
      4. ImplementationDetails-update1.pdf
        71 kB
        Martin Hentschel
      5. trunk-984391-update1.txt
        59 kB
        Martin Hentschel
      6. HOWTO-PatchAndRunTriggerExample.txt
        0.5 kB
        Martin Hentschel
      7. ImplementationDetails.pdf
        72 kB
        Maxim Grinev
      8. trunk-967053.txt
        62 kB
        Maxim Grinev

        Issue Links

          Activity

          Patrick McFadin added a comment -

          I'm going to have to object one more time to storing a jar file in the file system. With large scale deployments, this is going to be a disaster waiting to happen. One last plea for https://issues.apache.org/jira/browse/CASSANDRA-4954 ?

          Vijay added a comment -

          Committed, and created CASSANDRA-5574 to add examples. Thanks!

          Jonathan Ellis added a comment -

          Nit: would like to see javadoc for TriggerExecutor methods.

          Otherwise LGTM!

           I posted the sample to https://github.com/Vijay2win/inverted-index, I am really happy to move it to contrib

          (I meant "examples," not "contrib." Old memories...)

          Vijay added a comment -

          Hi Jonathan,

          Removed LinkedList allocation in v3 and pushed to https://github.com/Vijay2win/cassandra/commits/1311-v3

          Let's also create follow-up tickets for loading new triggers from the directory at runtime

           We already have JMX to reload from the filesystem, and CASSANDRA-4949.

           I posted the sample to https://github.com/Vijay2win/inverted-index; I am really happy to move it to contrib, and will address that in a different ticket.

           If everything else is fine I will get this committed to 2.0.
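
           A minimal, hedged sketch of the "no allocation on the no-triggers path" idea discussed above; the real TriggerExecutor in the branch may be shaped differently, and the generic type M simply stands in for RowMutation:

           import java.util.ArrayList;
           import java.util.Collection;
           import java.util.List;

           // Sketch only: lazily allocate the augmented collection, so a write that
           // fires no triggers returns the caller's collection untouched.
           public abstract class NoAllocationFastPath<M>
           {
               // Per-mutation trigger hook; returns extra mutations, or null/empty for none.
               protected abstract Collection<M> augment(M update);

               public Collection<M> execute(Collection<M> updates)
               {
                   List<M> augmented = null;                       // allocated only if a trigger adds something
                   for (M update : updates)
                   {
                       Collection<M> extra = augment(update);
                       if (extra == null || extra.isEmpty())
                           continue;
                       if (augmented == null)
                           augmented = new ArrayList<M>(updates);  // first hit: copy the originals once
                       augmented.addAll(extra);
                   }
                   return augmented == null ? updates : augmented; // no-triggers path: zero extra allocation
               }
           }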

          Jonathan Ellis added a comment -

          Looks reasonable. Let's make sure we're not doing any additional allocation on the "no triggers" path though. Let's also create follow-up tickets for loading new triggers from the directory at runtime, and for creating an example trigger in contrib/.

          Vijay added a comment -

           Hi Jonathan, as suggested offline I have added set<triggers> support in v3, rebased, and pushed it to https://github.com/Vijay2win/cassandra/commits/1311-v3

          Thanks!

          Abhijit Dhariya added a comment -

           I'd like to know when this trigger feature will be available.

           Jonathan Ellis added a comment -

           Some good background on classloaders:
           http://zeroturnaround.com/labs/reloading-objects-classes-classloaders/
           http://zeroturnaround.com/labs/rjc201/

          Vijay added a comment -

           Pushed v2 to https://github.com/Vijay2win/cassandra/tree/1311-v2.
           Also pushed the example which I used for testing to https://github.com/Vijay2win/inverted-index. Thanks!

          Deepak Nulu added a comment -

          I agree with rektide de la fey and B. Todd Burruss. A generic notification mechanism would be a lot more useful. Hopefully the notification mechanism being used for triggers can be exposed in a generic way.

          B. Todd Burruss added a comment -

           In response to the previous comment, I added CASSANDRA-5173, regarding a feature that I needed at one time.

          rektide de la fey added a comment -

          Hi, this ticket seems well underway and I apologize if chiming in like this is inappropriate.

          Rather than trying to define work to do as a part of the trigger (as conventional databases often do), my chief interest would be simply noting or recording certain conditions as they occur, in a way that external systems can themselves trigger responsive actions to be taken (hopefully in a low-latency fashion).

           Decoupling the detection from the follow-up processing will make for a far more flexible system, and allows for general reactivity in the Cassandra world, rather than trying to define a good-enough suite of actions that can be taken in the DB layer proper. Providing hooks which external applications can use to do asynchronous trigger actions will also be critical for responsive/reactive system design, freeing systems from having to create their own event logs and allowing them to simply monitor what's going on inside the database.

           The topic of deciding what execution engine to build and what to do when a trigger fires is a huge undertaking which will certainly not be done right. Please decouple the problem into its component pieces as I am requesting: a reactive trigger, which can detect changes and raise notices, and secondarily, an execution engine inside Cassandra which can subscribe to event sources and perform operations when changes are detected. It's vital that external applications have a means to detect and react to changes; doing triggers internally only takes on too small and narcissistic a problem, and carries with it the weight of having to build an execution engine to do the work of the trigger.

          Ahmet AKYOL added a comment -

           How about stored procedures? Triggers can't be without them. Indeed, I don't believe they are a good fit for C*'s distributed nature (even triggers?). I think the magic words here are once again "eventually consistent".

           Anyway, IMHO, Redis's scripting is also a way to go, maybe as a parallel path alongside stored procedures. Redis (2.6.x) has a built-in Lua interpreter; C* could use Rhino. Redis's eval, evalsha and script load commands could be adapted for CQL.

           You can also check The Little Redis Book, chapter 5.

          Tupshin Harper added a comment -

           +0.5 on explicit Groovy support, but I think the first priority should be good general-purpose support for any JVM language. Commenting on that over in #4954.

          Edward Capriolo added a comment -

           What do you guys think about triggers in Groovy? We do that here in our infrastructure. You can support multiple backends for dynamic classloading: "jar", "groovy", "closure", etc. Groovy is fairly efficient and it does not involve lugging jars around; Groovy has a @Grab annotation, for example. I still like Patrick's idea, but whenever you have physical jars somewhere you really have the same operational overhead. Arguably, getting the jars into Cassandra is more complex than just putting jars into a folder across a group of servers, since getting jars into a folder is easily accomplished.

          Patrick McFadin added a comment -

           Done: https://issues.apache.org/jira/browse/CASSANDRA-4954 Carry on.

          Jonathan Ellis added a comment -

           +1 to a follow-on ticket. Let's keep this to getting the core functionality working and then we can deal with operations separately.

          Tupshin Harper added a comment -

          +1 to exploring deploying within a CF. Had been thinking something similar myself. I would like to keep the scope of this ticket reasonable, though. How about another ticket for that as a follow-on feature?

          Matthew Brown added a comment - - edited

           Or don't implement triggers with code that runs within Cassandra. Specify a network endpoint to call with a predefined message containing all details of the triggering event. Define the trigger with connect and read timeouts and retry instructions (retry on connect failure, don't retry on read timeout, retry x times, random back-off between x-y seconds, etc.).

          Patrick McFadin added a comment -

          I love the idea of having triggers but I'm less than enthusiastic about adding operation overhead in deploying jar files to every node. When you are talking about a cluster with 100s of nodes, that's going to be a lot of files to copy around.

          Here's a radical idea. Why not store the jar file in a CF?

          • The jars will be distributed and available to all nodes in the cluster.
          • When backing up and restoring a node, this won't add any extra steps.
          • When new nodes come online, everything for the trigger will be a part of the bootstrap.

          I'm saying this strictly from an operations standpoint.

          Jonathan Ellis added a comment -

          The same as any other atomic batch when you've disabled commitlog for any of the CFs involved. (Atomic batch guarantees that the writes are sent to the replicas, but if you've disabled commitlog then it cannot guarantee that it is durable.)

          Edward Capriolo added a comment -

          What are the semantics for triggers when the commit log is being skipped?

          Nate McCall added a comment - - edited

           Currently there is a JMX call to load new jars, and we also watch the triggers directory every minute looking for new JARs; I am inclined to remove the watch part for safety and let the user call JMX to reload the jars.

          In the use cases I see for this, a timer would not give me enough control in orchestrating an update to the trigger code. I would much prefer JMX - could be more easily hooked into nodetool for all at once as well.

          (Edit) thought about this update orchestration for a minute, created CASSANDRA-4949 for using nodetool for such.

          Vijay added a comment - - edited

          I pushed the initial version of triggers to https://github.com/Vijay2win/cassandra/tree/1311 for a review...

           • Users can implement ITrigger and drop the jar into $CASSANDRA_HOME/triggers.
           • The patch implements a custom class loader which loads classes in a fixed order: it first looks for trigger classes in the triggers directory, and if it cannot find the classes needed to complete the operation (ITrigger.augment) it falls back to the parent class loader (a minimal sketch of this child-first loading follows this comment).
             • This buys us two things: the user can drop all of his dependencies into the directory (kind of sandboxed).
             • Every time we want to load a new jar, a new custom class loader is created and the old one is left for GC (so classes associated with the old class loader can be freed up).
             • This should help a bit in avoiding OOM in the perm gen.
           • Batches with both RowMutations and Counters will throw an exception, because mutateAtomic is not allowed on counters anyway...
           • Currently there is a JMX call to load new jars, and we also watch the triggers directory every minute looking for new JARs; I am inclined to remove the watch part for safety and let the user call JMX to reload the jars.

          TODO: Need to write more test cases.... Working on it.
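
           A minimal sketch of the child-first loading described above, using only a plain URLClassLoader over the jars found in the triggers directory; the actual CustomClassLoader in the branch may differ in detail:

           import java.net.URL;
           import java.net.URLClassLoader;

           // Sketch only: try the trigger jars first, then fall back to the parent loader.
           // Dropping the whole loader on reload lets its classes be garbage collected.
           public class TriggerClassLoader extends URLClassLoader
           {
               public TriggerClassLoader(URL[] triggerJars, ClassLoader parent)
               {
                   super(triggerJars, parent);
               }

               @Override
               protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException
               {
                   Class<?> c = findLoadedClass(name);
                   if (c == null)
                   {
                       try
                       {
                           c = findClass(name);               // look in the triggers directory first
                       }
                       catch (ClassNotFoundException e)
                       {
                           c = super.loadClass(name, false);  // fall back to the parent class loader
                       }
                   }
                   if (resolve)
                       resolveClass(c);
                   return c;
               }
           }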

          Jonathan Ellis added a comment -

          Probably makes more sense to keep the trigger at the table level and pass it key + CF instance, then.

          Vijay added a comment -

          CREATE TRIGGER ON table EXECUTE 'com.my.company.TriggerClass'

           The problem is that a RowMutation can span multiple column families, so it would be better to support triggers at the keyspace level.

          Jonathan Ellis added a comment -

          it might be necessary to have a separate directory that the users can drop their jars into

          That sounds totally reasonable.

          DROP TRIGGER is good though

          +1

          Edward Capriolo added a comment -

           Yes, whenever you dynamically load classes, perm gen can be an issue. But the loading and unloading is likely not going to be very common (not as often as a JSP compile, for sure), so loading and unloading would likely not be a perm gen issue.

           Yes, DROP TRIGGER is good though.

          Nate McCall added a comment -

          That brings up a point - what about DROP TRIGGER semantics? I'd call that 'minimum essential' as well.

          We should bring up the trigger class in a separate class loader (or some other method) so they can be reloaded.

          Appealing, but two non-trivial issues with dynamic classloading, IME:

           • leaks that keep the classloader from getting collected and end up with an OOM in PermGen that takes down the whole JVM (the number of 'public static final' refs in the Cassandra code base that could/will be linked to will exacerbate this)
          • coordinating such in a cluster == hard (tomcat folks never really got it right)

          You could get all OSGi, but that's a mess unto itself.

          Otherwise, super glad to see this issue resurrected

          Brian ONeill added a comment -

          Ed, you took the words right out of my mouth. If possible, the classloading should provide isolation of dependencies. We're struggling with this right now w/ Storm. (See: https://github.com/nathanmarz/storm/issues/115)

           It would be great if our solution accounted for this from the get-go. To do this, it might be necessary to have a separate directory that users can drop their jars into. If it complicates things too much, however, or delays the implementation, I'd be fine with an initial version that didn't provide isolation between dependencies.

          (BTW – great to see progress on this)

          Edward Capriolo added a comment -

           We should bring up the trigger class in a separate class loader (or some other method) so they can be reloaded. I have done this before in Java and in Groovy using http://groovy.codehaus.org/api/groovy/lang/GroovyClassLoader.html. I say this because if a trigger needs to be redeployed we should not have to stop the entire cluster to do so.
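
           For illustration, a minimal sketch of that approach (assumes the Groovy runtime on the classpath; GroovyClassLoader.parseClass is the real API, the surrounding class and method names are made up):

           import java.io.File;

           import groovy.lang.GroovyClassLoader;

           // Sketch only: compile a trigger from Groovy source with a fresh loader each time,
           // so that discarding the old loader lets its classes be collected on redeploy.
           public class GroovyTriggerLoader
           {
               public Object loadTrigger(File groovySource) throws Exception
               {
                   GroovyClassLoader loader = new GroovyClassLoader(getClass().getClassLoader());
                   Class<?> triggerClass = loader.parseClass(groovySource);
                   return triggerClass.getDeclaredConstructor().newInstance();
               }
           }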

          Jonathan Ellis added a comment - - edited

          Update: CASSANDRA-4285 is done for Cassandra 1.2. This takes care of the main architectural obstacle for coordinator-based triggers as I outlined above:

          Triggers would be allowed to turn a write into a batch, or a batch into a modified batch.

          E.g., in the twissandra example, when a user adds a tweet, you could read his followers list and add an insert into each of their timelines to the batch.

          So here's what I think could be accomplished fairly easily:

          public interface ITrigger
          {
              public Collection<RowMutation> augment(RowMutation update);
          }
          

          Hook this in to the StorageProxy mutation path; if it returns more than one row, switch to mutateAtomic if we're not already part of an atomic batch. If it returns an empty collection, skip it.

          Commentary:

          • Each row of a batch would be augmented individually, but all the trigger modifications together would be part of the same final batch.
          • Returning the row unmodified is expected to be common
          • Splitting this up into a "should we augment" method first is tempting but I suspect it would result in inefficiency between the "should we" and "make it so" calls. Hacking state in with threadlocals would be clunky, better to leave it a single method.
          • All triggers are "BEFORE" triggers, "AFTER" is tougher because of the batchlog semantics
          • Ultimately it would be nice to have a "real" trigger definition language (possibly pluggable, like postgresql's), but I think that should be a separate ticket; our minimum viable product is, CREATE TRIGGER ON table EXECUTE 'com.my.company.TriggerClass'
          • This ticket should include that for CQL and an equivalent method for Thrift.
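
           For concreteness, a hedged sketch of the most minimal trigger against the interface above; RowMutation is Cassandra's internal mutation class, and a real trigger (say, an index maintainer) would add further mutations to the returned collection:

           import java.util.Collection;
           import java.util.Collections;

           // Sketch only: pass the update through unchanged, the case the commentary above
           // expects to be common; any extra RowMutations returned here would be folded
           // into the same atomic batch by the coordinator.
           public class PassThroughTrigger implements ITrigger
           {
               public Collection<RowMutation> augment(RowMutation update)
               {
                   return Collections.singletonList(update);
               }
           }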
          Jeremy Hanna added a comment -

          I think that's CASSANDRA-4285

          Jonathan Ellis added a comment -

           Pulling out the distributed commitlog idea to CASSANDRA-4825

          Vijay added a comment -

           Pardon my ignorance: I just wanted to throw in my view on triggers (feel free to ignore it).

           Today: when a write happens, the coordinator does n mutations (n being the number of replicas); if any one of the writes fails we store a hint for the mutation and try again on that node when it comes up.

           With triggers in the mix, it might be simple to just extend the number of mutations to n+1, where n is the number of replicas and the additional mutation is for the trigger. The trigger implementation should take care of the execution and return success or failure. If there is a failure we will store a hint for the mutation and retry after some interval.

           This way we will have an eventually consistent trigger. We don't guarantee the order, but we can guarantee the execution, which is what we guarantee in most of our operations anyway (example: we write to n replicas, n-1 fail but the other one succeeds; we will eventually propagate the data even though the client got an error because of the quorum operation).

          We might also add one more consistency level, if we really need to know if the trigger was executed or not.

          Brian ONeill added a comment - - edited

          Agreed. I don't think we should include REST in the formal API either, just offering that up as a design pattern for those that need to do more than you can fit in a little javascript snippet.

           We are heavy in performance/stress testing right now. And we now have two models working: one where we use synchronous triggers (prior to write), and another where triggers execute asynchronously after write. Both are useful for different things. (Asynch where we can't slow down the actual write – e.g. user interactions, and synch when we need integrity.)

          Additionally, we see a need for two levels of guarantees. For some of the triggers, we don't really care if the trigger failed, because we can rely on a regular map/reduce job to "cleanup" any failed trigger executions. We'd rather not have the overhead of a CSCL even. The system just needs to execute the trigger for us (if it can). If it fails, oh well.

          For other jobs, (synchronous or asynchronous) we need to know when we are in a bad state. i.e. we need to know if the data is ever out of synch with a side-effect of a trigger. For these scenarios, the overhead of the CSCL is acceptable. We can see failed trigger executions even in the event of a crash. (e.g. those log entries left in a PENDING state > some acceptable time period are considered failed and we need to go rectify the situation).

          Unless there are transactional semantics, I think it suffices to have three interception points:

          1. Pre-mutation synchronous (blocking until trigger execution completes)
            • Trigger can add additional mutations
              • (additional columns to a row "in-transaction" seems useful)
            • Trigger can fail the operation
              • (quality/integrity checks)
          2. Post-mutation synchronous
            • Upon failure, we can signal "trigger failure" to the client suggesting retry, but it doesn't fail the actual operation
              • (since its already happened, and we don't want to add rollback)
          3. Post-mutation asynchronous
            • No influence on write (obviously), but need to be guaranteed trigger executes, or know when it has not.

          For each of these, I think there are two levels of guarantees, either:

          1. You don't necessarily care if ALL executions were successful, you'd rather be fast
            • (e.g. statistics / analytics that need to be "close-enough")
          2. You absolutely need to know if data changed and a trigger was unsuccessful in processing that mutation.

          random thoughts,
          -brian

          Jonathan Ellis added a comment -

          all of our triggers we implemented simply make REST posts out to services that actually do the work

           I don't think REST calls should be a first-class citizen in the final API, since a main goal of triggers is to push the code closer to the data; calling out over the network cuts that off at the knees. But obviously a JS- or jar-based trigger could call out to a REST service if that's how you want to roll.

          Brian ONeill added a comment -

           Great questions/points. A quick two cents... I'll try to put more thought into it later...

           We had the same concerns re: classpaths, restarts, etc.
           To keep things simple, all of the triggers we implemented simply make REST posts out to services that actually do the work. This lets us get around classpath issues and still have more "power" than a simple script. Not sure this works for everyone though; we're Dropwizard happy. It seems like we should provide a JS ability as well.

          In our case, we don't need the pre-write CF. I'd prefer having that lightweight option in place. I think the trigger needs the ability to cancel the update. You may be trying to enforce constraints and/or data quality.

           I'm not sure we need to provide the ability to create additional row mutations in that same batch unless we have near-term plans to support transactions. E.g. in the case where the user is keeping an index up to date, they can write to the index themselves in the trigger. As long as they have the ability to fail the insert, this seems reasonable.

           Likewise, altering the actual row mutation might be a nice-to-have. It doesn't seem critical because the user is free to write additional information in their trigger – again, except for the transactional guarantees that a single row write gets you.

          Jonathan Ellis added a comment -

          Here's some brainstorming about things to think through to get this into core:

           • What guarantees can we make about durability? Once a mutation is in any replica of the CSCL it can be read for replay, so it should be considered a success in that respect. But, we can't call it a success for the purposes of the client's request for CL.X yet. In the extreme case we could have a successful CL write but all replicas down. One simple approach that does the right thing most of the time would be to perform the same availability checks on the CSCL replicas as for the data replicas. But, this doesn't address corner cases (nodes going down after the check but before the write), overload situations (nodes being technically up, but timing out), and also makes the write path more fragile (now we rely on CSCL replicas being up, not just the data nodes).
          • How do we handle replay? We can't simply replay immediately on startup since the CSCL is (probably) on other machines. Do we wait for one CSCL replica? All of them? Do we need to be worried about performance impact of every node in the cluster hammering each other with CSCL requests after a full cluster restart?
          • Do we expose the CSCL to non-trigger uses, e.g. atomic batches?
          • What API do we provide to trigger authors? What points in the write path do we allow hooks into, and what do we allow them to do? (E.g.: cancel the update, modify the RowMutation, create additional RowMutations? Do we provide the pre-write CF row to the trigger? If so do we provide a "lightweight" alternative that doesn't force read-before-write?)
          • What about implementation? "Here's an interface, implement it in whatever JVM language you like and give us a class name?" Appealing, but "now restart your server to get your trigger jar on the classpath" is not. Neither am I thrilled with the thought of implementing some kind of jar manager that stores triggers in Cassandra itself. "Triggers are always implemented in javascript?" Maybe a good lowest-common denominator but many developers are not fond of js and Rhino is a bit of a dog. (Nashorn is due in Java 8, however.)
          Brian ONeill added a comment -

          Jonathan,

          FYI – we implemented both of your suggestions: a single CSCL per host, and switched to use a row per hour approach.

          We released those enhancements today in version 0.15.1.
          https://github.com/hmsonline/cassandra-triggers

          -brian

          Brian ONeill added a comment -

          One last thought... (re: more than once processing)
           Thinking a bit more. You're right. I'm assuming we'd get a doInsert() call on each replica. Then, we'd get a trigger invocation for each host managing that keyspace segment. We'd have to incorporate that into the commit log model, or go with the CSCL per node.

          Brian ONeill added a comment -

          re: more than once processing

           Agreed. We have a low write volume right now, which makes it likely that a node can process all of its log entries within N seconds. We were also more concerned that log entries would sit in the commit log unprocessed in the event of a node failure. Thinking about it a little more, maybe we could have written the hash of the row key as a column and queried for log entries based on the keyspace segment for which the node is responsible.

          re: passing the row mutation

           Also agreed. In fact, the initial implementation wrote the serialized object, but that seemed heavy in the event that large blobs are being written (duplicating the data in the commit log). Additionally, we wanted to discourage developers from acting on the contents of the row mutation (e.g. updating an index or view), because the data contained in the mutation may be out of date (due to the fact that the mutation could have been received out of order). In the second refactoring, we just wrote the column names that were mutated. That seemed sufficient. In its current state, even that is configurable. We just write the keyspace, column family, row and operation that was performed. That way, the commit log is lightweight. It is really just a notification that something changed. The rest is up to the trigger.

          Jonathan Ellis added a comment -

          Asynchronously, the commit log is polled and triggers are executed. Upon successful trigger execution, the log entry is removed from the commit log.

          Implementation detail: we probably need to switch between CSCL rows periodically to avoid tombstone pollution.

          Hosts process their own log entries first, then also process any log entries that are older than 5 seconds.

           Doesn't that make "more than once" processing significantly more likely? It also opens up the possibility for a trigger to operate on pre-update data, if the mutation does get delayed longer than N seconds. I think I prefer the CSCL-per-node approach. (Which also reduces contention on the CSCL rows – with CASSANDRA-2893 done that matters more than it did in <= 1.0.)

          each trigger is idempotent, and acts directly on the data in the column family (rather than data stored in the log entry in the commit log)

          I'm starting to think that's a better design, although it's rather less performant in an already fairly heavyweight design. We should probably pass the CL mutation to the trigger in case that's "good enough" to save it from having to look up the row.

          Brian ONeill added a comment - - edited

           FYI – we've released our Cassandra Trigger functionality, which used the crack-smoking commitlog (CSCL) idea as a launch point for design.
          https://github.com/boneill42/cassandra-triggers

          It maintains a column family as a commit log. For each call to .doInsert(), it writes a log entry into the commit log, first with a status of PREPARING. Then when the insert completes, it writes status of COMMITTED. Asynchronously, the commit log is polled and triggers are executed. Upon successful trigger execution, the log entry is removed from the commit log.

          To support distributed processing, each log entry has a host id written along with it. Hosts process their own log entries first, then also process any log entries that are older than 5 seconds. So, order of processing is not guaranteed. This is okay in our situation, because each trigger is idempotent, and acts directly on the data in the column family (rather than data stored in the log entry in the commit log)

          Right now, this is written as AOP, but we'd be happy to refactor to remove the AOP and contribute it as a patch. The jar file is available out in the central repos:
          http://mvnrepository.com/artifact/com.hmsonline/hms-cassandra-triggers
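
           A hedged sketch of the log-entry lifecycle described above: the status values, the per-host id, and the ~5-second pickup window come from this comment, while the class shape and column layout are assumptions rather than the actual cassandra-triggers code:

           import java.util.UUID;

           // Sketch only: an entry is written as PREPARING before the insert, flipped to
           // COMMITTED after it, and deleted once every trigger has fired for it.
           public class TriggerLogEntry
           {
               public enum Status { PREPARING, COMMITTED }

               public final UUID id = UUID.randomUUID();
               public final String hostId;         // which node wrote the entry
               public final String keyspace;
               public final String columnFamily;
               public final byte[] rowKey;
               public final long writtenAt = System.currentTimeMillis();
               public Status status = Status.PREPARING;

               public TriggerLogEntry(String hostId, String keyspace, String columnFamily, byte[] rowKey)
               {
                   this.hostId = hostId;
                   this.keyspace = keyspace;
                   this.columnFamily = columnFamily;
                   this.rowKey = rowKey;
               }

               // An entry becomes eligible for pickup by other hosts once it is older than ~5 seconds.
               public boolean isStale(long nowMillis)
               {
                   return nowMillis - writtenAt > 5000;
               }
           }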

          Jonathan Ellis added a comment -

          We have published a technical report about the idea of integrating asynchronous triggers into Cassandra [1]

          As described above in my reaction to the patch, the problem with this is that it relies on client-side retries to avoid durable inconsistencies:

           Given the protocol in Figure 4, the failure handling of the master is straight-forward. On reception of a write request, if the master fails before sending the acknowledgement to the application (anywhere before step M5 in Figure 4), the task might or might not have been executed. Because the application will not get a response, the application retries the write request (at a different node now). Therefore, the trigger will be executed at least once.

          This is a problem for several reasons. First, it pushes part of the complexity back to the client; second, because relying on the client to never crash itself is a poor assumption, and third, because there is no separate client when using the StorageProxy API, as some users do to avoid the extra latency and overhead of Thrift.

          Our solution is to make the storage of "dangling" trigger at replicas also part of log replay. Whenever a replica (slave) node receives a write request, it will durably log that it has to fire a trigger in case of a failure of the update coordinator (master). In case this slave node fails, it will come back up replaying the logs, installing any data item, and also firing triggers.

          I'm less a fan of this than of my crack-smoking commitlog (CSCL for short). Partly because it spreads the complexity outside the coordinator, and partly because it's a less-general tool – the CSCL also solves the inconsistency problem for batches that do not involve triggers.

          To guarantee exactly-once semantics of trigger execution, we have built a system that departs significantly from this patch [2].

          I stopped reading when I got to "We implemented a two-phase commit protocol in order to execute distributed transactions." 2PC has well-known problems with coordinator failure. (http://en.wikipedia.org/wiki/Two-phase_commit_protocol#Disadvantages)

          Brian ONeill added a comment -

          Martin, Great stuff. Thanks. I'll read through those reports.

          Because we didn't hear back from the crew, we went ahead and implemented a lightweight trigger mechanism. Using AOP around CassandraServer.doInsert, we write to an event log. Each log entry contains the keyspace, column family, row and columns that were mutated and the operation (isDelete). We then poll that log and execute the triggers off of that, deleting the log entry upon successful invocation of the triggers.

          We are going to release the code as part of Virgil. Still a work in progress, but you can see things here:
          http://code.google.com/a/apache-extras.org/p/virgil/source/browse/#svn%2Ftrunk%2Ftriggers%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fvirgil%2Ftriggers

          Martin Hentschel added a comment -

          Current status from our side: Maxim and I have done research on implementing asynchronous triggers in Cassandra. Based on this, we have provided the patch of this issue. In the discussion that followed, a different implementation of triggers in Cassandra has been suggested. Since then, Maxim and I have not continued to work on our patch. I guess it's up to Jonathan Ellis, T Jake Luciani, and Stu Hood to make a decision on which implementation of triggers to pursue and who does it

          Here is a brief summary of our research on asynchronous triggers in the last 1.5 years:

          We have published a technical report about the idea of integrating asynchronous triggers into Cassandra [1]. In this technical report we provide three things: (1) a comparison of different approaches to execute triggers on top of Cassandra, (2) a set of new protocols for executing triggers inside Cassandra, and (3) results of performance experiments comparing all of these approaches. The proposed protocols ensure at-least-once semantics of trigger execution. That is, upon a change to a table in Cassandra, a registered trigger is guaranteed to be executed at least once – even in case of node failures. The experimental results show that our "integrated approach" performs just as well as the state-of-the-art approach of executing triggers outside of Cassandra. However, our approach utilizes system resources more efficiently; that is, fewer machines are needed to scale the execution of triggers.

          The patch we have attached to this issue implements the proposed protocols and has been used in our performance experiments. Thus, this patch is fully functional and well-tested. The implementation extends the replication mechanism of Cassandra to execute triggers with at-least-once semantics.

          In the discussion above, another approach has been suggested to implement triggers inside Cassandra. The idea is to use the log record to execute triggers. This seems to be a clever way to implement triggers because it would require fewer extensions to the existing Cassandra infrastructure, I assume. However, this does not increase the guarantees of trigger execution. It won't give you exactly-once semantics.

          To guarantee exactly-once semantics of trigger execution, we have built a system that departs significantly from this patch [2]. In short, the system extends the patch with transactions and synchronization mechanisms to execute triggers. It also provides an easier programming model, which is more like MapReduce than a trigger-like programming model.

          References:

          [1] Martin Hentschel, Maxim Grinev, Donald Kossmann: Building Data Flows Using Distributed Key-Value Stores. Technical Report 742, ETH Zurich, ftp://ftp.inf.ethz.ch/pub/publications/tech-reports/7xx/742.pdf

          [2] Maxim Grinev, Maria Grineva, Martin Hentschel, Donald Kossmann: Analytics for the RealTime Web. Demo Paper, VLDB Conference 2011, http://www.vldb.org/pvldb/vol4/p1391-grinev.pdf

          Brian ONeill added a comment -

          Ditto? Any progress on this?

          I wouldn't mind contributing some time towards it if we have identified a direction.
          (as Jonathan points out, either we do it in Cassandra or we all need to build it on top in that app layer)

          David Matter added a comment -

          What is the status of integrating async triggers in Cassandra?

          Edward Capriolo added a comment -

          I think 'crack triggers' sounds great for street cred. I like the approach of writing the trigger to disk. This would allow us to separate the logging of the action from the trigger action, which I am guessing would be its own stage. This seems like a good tradeoff: space to store the events, and a background thread to replay them. I am thinking this trigger CF would be ordered so we can replay triggers by time and page through them without too much disk cost.

          Martin Hentschel added a comment -

          I like the idea of triggers that are based on the commit log. I haven't fully understood Jonathan's crack smoking above, but basing fault tolerance guarantees on the log certainly is a good idea. Yet it is completely different from the patch we have submitted. The question is how much help the patch is now, and what the next steps are.

          Jonathan Ellis added a comment -

          removetoken could do that automatically

          T Jake Luciani added a comment -

          Potential crack smoking ahead:

          I think as long as we keep the original timestamps this should work out...
          The cost of storing the batch in a CF is def prohibitive but I can see at least how it can recover.

          keyed by coordinator node id+; column name some kind of uuid

          If the coordinator dies who will complete the batch? Would you manually need to re-assign the node id to another node?

          Jonathan Ellis added a comment - - edited

          Potential crack smoking ahead:

          What if we had a "distributed commitlog" for coordinator-based triggers? Triggers would be allowed to turn a write into a batch, or a batch into a modified batch. Triggers would not be allowed to do chains of non-idempotent logic (i.e. anything that can't be represented as a batch). Then coordinator would:

          • write the expanded batch (as a serialized blob) to the distributed commitlog (keyed by coordinator node id+; column name some kind of uuid) with CL equal to that requested for the original update
          • apply the batch normally
          • delete the batch DCL blob

          After restarting, the node (or its replacement) should check its commitlog row and re-apply any unfinished batches.

          + In practice this means something like id-per-hour, since Cassandra deals poorly with rows with an unbounded number of tombstones
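
          To make the proposal a bit more concrete, a rough sketch of the coordinator path could look like the following, assuming a hypothetical DistributedCommitLog keyed by coordinator id plus an hour bucket, with one uuid column per batch; this is only an illustration of the idea, not code from any attached patch:

          import java.util.UUID;

          public class CoordinatorBatchSketch
          {
              // Hypothetical store for the "distributed commitlog" row described above.
              interface DistributedCommitLog
              {
                  // Key: coordinator node id + hour bucket; column name: a uuid; value: serialized batch.
                  void write(String dclKey, UUID batchId, byte[] serializedBatch, int consistencyLevel);
                  void delete(String dclKey, UUID batchId);
                  Iterable<byte[]> unfinishedBatches(String dclKey);
              }

              interface BatchApplier { void apply(byte[] serializedBatch, int consistencyLevel); }

              private final DistributedCommitLog dcl;
              private final BatchApplier applier;
              private final String nodeId;

              public CoordinatorBatchSketch(DistributedCommitLog dcl, BatchApplier applier, String nodeId)
              {
                  this.dcl = dcl;
                  this.applier = applier;
                  this.nodeId = nodeId;
              }

              private String dclKey()
              {
                  // id-per-hour, so tombstones in any one row stay bounded
                  return nodeId + ":" + (System.currentTimeMillis() / 3_600_000L);
              }

              // Triggers have already expanded the original write into this batch.
              public void mutate(byte[] expandedBatch, int consistencyLevel)
              {
                  UUID batchId = UUID.randomUUID();
                  dcl.write(dclKey(), batchId, expandedBatch, consistencyLevel);  // 1. log the expanded batch
                  applier.apply(expandedBatch, consistencyLevel);                 // 2. apply it normally
                  dcl.delete(dclKey(), batchId);                                  // 3. clear the DCL entry
              }

              // On restart, the node (or its replacement) checks its commitlog row and
              // re-applies anything it never finished (in practice it would scan the
              // recent hour buckets, not just the current one).
              public void replayUnfinished()
              {
                  for (byte[] batch : dcl.unfinishedBatches(dclKey()))
                      applier.apply(batch, /* consistencyLevel */ 1);
              }
          }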

          Jonathan Ellis added a comment -

          The big minus for the replica level triggers is that no one really wants to get N triggers

          That's because, as I've said before, this only really makes sense to me for user-level data once you have entity groups.

          Coordinator-level triggers are an ugly ball of corner cases that add no functionality over what you can do with a well-designed app-level storage layer. It's an idea that is superficially attractive but is a non-starter once you dig deeper.

          Edward Capriolo added a comment - - edited

          IRC: it's fundamentally different to have triggers that leave things permanently inconsistent, vs triggers that are as exactly as eventually consistent as the write they are part of

          Agreed. I still see pluses and minuses for both approaches.
          The big minus for the replica level triggers is that no one really wants to get N triggers. That is why this ticket is stuck in the mud. (See "that gets ugly" below.)

          I see triggers as an event, not a storage acknowledgement. For example, if I receive the coordinator-trigger "WRITE QUORUM KEY X COLUMN Y VALUE Z SUCCESS", I can now attempt to READ KEY X COLUMN Y and get VALUE Z (or later). This is enormously useful.

          Now if I receive one replica-trigger, "WRITE KEY X COLUMN Y VALUE Z SUCCESS" – now what? First, I do not know the CL of the write. I really do not even know the replication factor unless the trigger has that intelligence built in. Without this information I cannot be sure that I can read KEY X COLUMN Y and get VALUE Z. Not useful. (Well, since I probably know which server the trigger came from, I can connect to that specific node and get that data, but this hurts the symmetry of Cassandra's read-from-any-node capability.)

          (that gets ugly)
          The replica-trigger system is going to be its own fairly complex state machine with complex requirements.
          1) It has to be able to accept writes as fast as Cassandra – maybe replication_factor times as fast, because all three triggers have to land on the same server to make any assessment about the state of the system.
          2) It needs to buffer and hold triggers until it receives replication_factor of them
          3) This just feels beastly.

          Jonathan Ellis added a comment -

          Fundamentally there is zero value to an app developer from coordinator-run triggers vs running the same code from a well-designed app-side storage layer. This is why I don't think it's worth adding a bunch of fragile scaffolding around it to try to make it kinda halfway work.

          But replica-level triggers are not simply moving logic from the client to the coordinator. That makes it more interesting, as well as trivially sound via the commitlog.

          Edward Capriolo added a comment -

          I think the work to pick a trigger master is really awesome. Great job, hard problem to solve. I think different people want different things from triggers.

          The 'internal developers' might want to be able to use triggers to build secondary indexes. They want triggers to happen close to the storage layer in most cases.

          The 'external users' might want to be able to use triggers to replicate data to another system. They want triggers to be on the coordinating node.

          What I am proposing is we recognize this and implement two types of triggers. The trigger for 'external users' is the easiest and should be done first.

          ExternalTriggers have a PreHook and PostHook. External triggers happen on the coordinating node. For the post hook, if an operation succeeds at the client-specified consistency level, a success_trigger fires. If it does not succeed, a failed_trigger fires.

          The weirdness in external triggers comes from a write timing out on some but not all nodes. That write could succeed on some nodes but not on all. Read repair and hinted handoff could eventually fix this data. This would result in the failed trigger firing but the write eventually succeeding. I argue that this behaviour is undefined. Clients are "supposed" to replay failed writes until success.

          Now the fun part, since we can fire a trigger on both success and failure we could theoretically deal with the above weirdness by implementing a guaranteed hinted handoff trigger. This would be done by using the PostHook failed_trigger to delay and then retry the write.
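
          To make this concrete, the external-trigger contract might look roughly like the sketch below; the interface and method names are invented here for illustration and are not an existing Cassandra API:

          // Hypothetical coordinator-side trigger contract: a pre hook before the write
          // is dispatched, and post hooks that report whether the requested consistency
          // level was met.
          public interface ExternalTrigger
          {
              // Runs on the coordinating node before the mutation is sent to replicas.
              void preHook(String keyspace, String columnFamily, byte[] key, byte[] mutation);

              // Fired if the write succeeded at the client-specified consistency level.
              void successTrigger(String keyspace, String columnFamily, byte[] key, byte[] mutation, int consistencyLevel);

              // Fired if it did not, e.g. on timeout; note the write may still succeed
              // eventually via hinted handoff or read repair, as discussed above.
              void failedTrigger(String keyspace, String columnFamily, byte[] key, byte[] mutation, int consistencyLevel);
          }

          A "guaranteed hinted handoff" trigger of the kind described above could then be a failedTrigger implementation that schedules a delayed retry of the same mutation.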

          T Jake Luciani added a comment -

          Hi Martin,

          I'm interested in learning more about your new approach. I think the concerns over the current approach are summed up by:

          This sounds really, really fragile. Grafting a pseudo-master onto Cassandra replication is a bad idea.

          I think for version 1 of triggers we need to go with the simplest approach. I believe this is to make the triggers synchronous and run RF times. Another possibility is to make the coordinator node responsible for running/delegating the trigger before acknowledging a successful write...

          As we bake out new ideas we can look at improving this, but I do think a simple trigger mechanism is needed for 1.0.

          Martin Hentschel added a comment -

          Hi Steve,

          Yes, this thread has been silent for quite some time though we still believe in our ideas. For research purposes, we evaluated our approach of asynchronous triggers against the state-of-the-art approach of buffering work items in some queue outside of Cassandra. Our main findings were the following:

          • Our approach requires 15% fewer machines to perform the same tasks and reduces overall network traffic by 30%. That is because all work is performed inside Cassandra rather than outside of it, where additional machines are needed to host the queue and the worker threads that execute work items.
          • Our approach gives you the programming interface of triggers, which makes it easier to program applications than writing code to store work items in the queue and to execute work items with at-least-once semantics.

          We now enhanced our ideas towards exactly-once semantics (which were also favored by Jonathan Ellis in this thread) and an even easier MapReduce-like programming model. Though, for research purposes, our prototype is not based on Cassandra anymore. If there is enough interest in our new ideas, we think it can be easily ported to Cassandra. An in-depth description will be published soon at the VLDB conference ("Analytics for the Real-Time Web", http://www.vldb.org/2011/?q=node/23).

          Thanks,
          Martin

          steve selfors added a comment -

          Hello, this thread looks quiet, but it's interesting for me. I'm looking at Cassandra as a DB option for a new application. I'm the product owner in agile parlance (and business owner) and I could not agree more with Maxim's view that there are critical use cases for asynchronous triggers. We need to push notifications of changes to column families (tables) from one node to other nodes, not poll for them. Polling is a disaster. I do not understand all of the back and forth on this trail, but I can tell you that there are clear use cases a) for triggers and b) for low latency! Latency is the enemy, so anything we can do to mitigate it is of high interest. It appears that Maxim's goal was to implement triggers with ultra low latency. Putting this on the shelf is a bummer for me.

          Jonathan Ellis added a comment -

          Our solution is to make the storage of "dangling" trigger at replicas also part of log replay. Whenever a replica (slave) node receives a write request, it will durably log that it has to fire a trigger in case of a failure of the update coordinator (master). In case this slave node fails, it will come back up replaying the logs, installing any data item, and also firing triggers.

          This sounds really, really fragile. Grafting a pseudo-master onto Cassandra replication is a bad idea.

          Maxim Grinev added a comment -

          Jonathan, are we right that you are mainly concerned about the inconsistency problem we were discussing earlier?

          To repeat, the problem was that if a write has not been acknowledged, it may still have succeeded in the base columnfamily (Table.apply) but the trigger is stored durably (TriggerSlave.storeDanglingTrigger). It is because trigger execution is not part of log replay. So we can end up with data that never had the trigger fire.

          As Stu said, using entity groups to solve this problem is suboptimal. We want to propose an improvement of our approach that also solves this problem but does not introduce the restrictions of entity groups.

          Our solution is to make the storage of "dangling" trigger at replicas also part of log replay. Whenever a replica (slave) node receives a write request, it will durably log that it has to fire a trigger in case of a failure of the update coordinator (master). In case this slave node fails, it will come back up replaying the logs, installing any data item, and also firing triggers.

          Once again, our point is to avoid entity groups whenever possible, for the following two reasons. (1) Entity groups are restrictive because they require co-location of data. This is hard to achieve in practice, and not all applications can be implemented this way. For example, it is not possible to implement Twitter this way, as it would require co-locating users with their followers, ending up with a single, big entity group. (2) Even in cases where you can use entity groups, you usually need to update redundant data stored in other entity groups as well. In this case, triggers may be used to keep redundant data consistent across entity groups.
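
          In terms of the patch's own vocabulary (TriggerSlave.storeDanglingTrigger is mentioned elsewhere in this thread), the proposed change could be sketched roughly as follows; the types here are hypothetical illustrations of the idea, not code from the attached patch:

          // Sketch: on a replica ("slave"), make the dangling-trigger record part of the
          // same durable write path as the data, so commitlog replay restores both.
          public class ReplicaWriteSketch
          {
              interface DurableStore
              {
                  void appendToCommitLog(byte[] dataMutation, byte[] danglingTriggerRecord); // one durable append
                  void applyData(byte[] dataMutation);
                  void storeDanglingTrigger(byte[] danglingTriggerRecord);
              }

              private final DurableStore store;

              public ReplicaWriteSketch(DurableStore store) { this.store = store; }

              public void onWriteRequest(byte[] dataMutation, byte[] danglingTriggerRecord)
              {
                  // Both the data and the "fire this trigger if the coordinator dies" record
                  // go through the commitlog, so a restarting replica replays both.
                  store.appendToCommitLog(dataMutation, danglingTriggerRecord);
                  store.applyData(dataMutation);
                  store.storeDanglingTrigger(danglingTriggerRecord);
              }
          }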

          Jonathan Ellis added a comment -

          To be clear: I am -1 on coordinator-based triggers.

          Stu Hood added a comment -

          I believe that the entity group functionality can actually be implemented as sugar on top of our existing secondary indexes and column nesting (I'll describe what I'm thinking of on that ticket).

          This ticket, on the other hand, provides novel functionality, so my vote is strongly in favor of getting it in mostly-as-is, with slight adjustments:

          • ITriggers should be configured to match a filter, as described on CASSANDRA-1601. Rather than triggering for any change to a row, a trigger should fire only for changes to a set of names, or a slice of names. For example: setting a trigger for columns 'age' and 'state' would only fire the trigger if either of those columns changed (see the sketch after this comment)
          • The ITrigger contract should not guarantee to give the user all changed columns: only the columns matching the trigger configuration.

          I'm hoping that in future tickets we can unify these "at-least-once" distributed triggers with our "exactly-once" local indexes (with UDFs) around common configuration: see CASSANDRA-1601.
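
          To make the filtering adjustment concrete, a trigger column filter along these lines might look like the sketch below; the class and method names are hypothetical and only illustrate "fire only when a configured set of column names changes":

          import java.util.Map;
          import java.util.Set;

          // Hypothetical filter: the trigger fires only if the mutation touches one of the
          // configured column names, and the trigger is handed only those columns.
          public class ColumnFilterSketch
          {
              private final Set<String> watchedColumns;   // e.g. {"age", "state"}

              public ColumnFilterSketch(Set<String> watchedColumns)
              {
                  this.watchedColumns = watchedColumns;
              }

              // Returns the subset of changed columns the trigger should see, or an empty
              // map if the trigger should not fire at all for this mutation.
              public Map<String, byte[]> matching(Map<String, byte[]> changedColumns)
              {
                  Map<String, byte[]> matched = new java.util.HashMap<>();
                  for (Map.Entry<String, byte[]> e : changedColumns.entrySet())
                      if (watchedColumns.contains(e.getKey()))
                          matched.put(e.getKey(), e.getValue());
                  return matched;
              }
          }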

          Stu Hood added a comment -

          I haven't really seen a good example of where entity groups are not just a pain in the user's rear end: essentially, we are breaking down and asking users to perform their own partitioning so that we can give them strong consistency between two dimensions that are normally eventually consistent.

          Not a strong feeling, just something that's been nagging at me.

          Jonathan Ellis added a comment -

          It's not the synchronicity per se that's interesting, so much as the part about making them per-replica rather than per-coordinator so we can guarantee consistency via the commitlog.

          Maxim Grinev added a comment -

          Jonathan, Jake, if you just mean synchronous triggers, we can easily add an option to make it synchronous. To make it synchronous we just need to switch off the failover mechanism and not reply to the client until the trigger is executed (it will be executed asynchronously, but we wait for the results, as Jake wrote). Is that what you mean, or do you mean something more?

          T Jake Luciani added a comment -

          I think Jonathan is referring to the server-side ITrigger impl as the application. The ITrigger is called synchronously for every write, but the implementation of the trigger can be asynchronous, e.g. it can use a linked blocking queue and a thread pool to do the work asynchronously.
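
          A minimal sketch of that pattern, assuming the trigger is invoked synchronously on the write path and hands the work to a local queue; the mutation handling and the method name here are simplified and hypothetical:

          import java.util.concurrent.BlockingQueue;
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;
          import java.util.concurrent.LinkedBlockingQueue;

          // The trigger call itself returns quickly; a worker thread drains the queue and
          // does the real (possibly slow) work asynchronously.
          public class AsyncTriggerSketch
          {
              private final BlockingQueue<byte[]> work = new LinkedBlockingQueue<>();
              private final ExecutorService worker = Executors.newSingleThreadExecutor();

              public AsyncTriggerSketch()
              {
                  worker.submit(() -> {
                      while (!Thread.currentThread().isInterrupted())
                      {
                          try
                          {
                              byte[] mutation = work.take();
                              process(mutation);              // the actual trigger logic
                          }
                          catch (InterruptedException e)
                          {
                              Thread.currentThread().interrupt();
                          }
                      }
                  });
              }

              // Called synchronously on the write path; just enqueue and return.
              public void onWrite(byte[] serializedMutation)
              {
                  work.offer(serializedMutation);
              }

              private void process(byte[] serializedMutation)
              {
                  // e.g. update an inverted index, notify an external system, etc.
              }
          }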

          Martin Hentschel added a comment -

          Thanks for your comments. I don't want to advertise our approach and implementation but I want to give some hints on why we chose this pattern.

          T Jake Luciani said that clients want to know when triggers are executed. In our approach the client can be sure that the trigger will be executed (at-least-once behavior) but not when. There are lots of use cases where a client doesn't need to know when the trigger was executed. For example, when you do a bank transaction, you usually submit the transaction and it is executed some time later. On Twitter, when you submit a tweet you usually do not care when your tweet is forwarded to your followers. In both cases, executing the bank transaction and forwarding the tweet to your followers would be implemented by triggers.

          Jonathan Ellis said that asynchronous behavior can be implemented on the client side. While this is true, it requires some effort by the programmer. The programmer would need to set up a queue, implement communication with the queue, ensure at-least-once behavior etc. Having all of this already present within Cassandra could help in the use cases mentioned above.

          In my opinion, having synchronous triggers has no advantages over implementing it synchronously at the client. Synchronously updating a table + an index can be done easily at the client side (without queues and fancy implementation). Therefore we wouldn't need triggers at all.

          I guess I will join the dev list again; this seems to be turning into a good discussion.

          Jonathan Ellis added a comment -

          Right. "Asynchronous triggers in the coordinator" has very little to recommend it over "just do it in your app with the StorageProxy API" (in which case your app is the coordinator, and you can easily run whatever coordinator-side logic you like).

          T Jake Luciani added a comment -

          "enforce trigger consistency via the commitlog the way 2ary indexes currently do"

          I agree we should mirror the 2ary index impl, which means it becomes a synchronous operation. Frankly I think that's ok, since it's the only way the client will know the data was written AND the trigger was executed. If a user wants asynchronous behavior, then the trigger implementation can simply schedule an action in a local queue.

          Jonathan Ellis added a comment -

          Of course, there is a window in which triggers have been fired but the actual data is not present. If the client ceases to issue the write request, then the inconsistency will be durable, which is bad

          Right. This is the biggest valid technical complaint about Cassandra right now – so this design isn't worse than the status quo, but I'd like it to be better.

          If we said "triggers have to happen within an entity group" (CASSANDRA-1684) then we could

          • have each replica node process triggers independently, w/o the coordinator being involved
          • enforce trigger consistency via the commitlog the way 2ary indexes currently do
          Martin Hentschel added a comment -

          In StorageProxy.mutate (was mutateBlocking in the older version, I guess), triggers are only executed once a write quorum has been reached. Therefore, at the master, only acknowledged writes will also fire triggers. If the acknowledgement gets lost, then the writes have been performed and the triggers have been executed, so there should not be a consistency issue.

          Concerning your point about slave nodes, we assume that clients retry write requests that have not been acknowledged by Cassandra. Triggers at slave nodes are buffered (storeDanglingTrigger) and are only executed in case the master node goes down. If the master fails before sending the acknowledgement to the client, the trigger might have been executed already. I think that was exactly your point. But because the client did not get a response, it should retry its write request, thus establishing consistency again. Of course, there is a window in which triggers have been fired but the actual data is not present. If the client ceases to issue the write request, then the inconsistency will be durable, which is bad. If this is too serious an issue, we should come up with a solution.

          (For the record:
          Master node = Cassandra's update coordinator
          Slave nodes = All replica nodes minus the update coordinator)

          Jonathan Ellis added a comment -

          if the write has been acknowledged, the client can be sure the trigger will be executed (at least once)

          the problem is with the inverse scenario:

          if a write has not been acknowledged, it may still have succeeded in the base columnfamily (Table.apply) but not the trigger (TriggerSlave.storeDanglingTrigger). So, because [from my look at the code at least] trigger execution is not part of log replay, you can end up with data that never had the trigger fire.

          In this situation, the client will see a TimedOutException, which it is important to remember is NOT the same as failure; it means instead "we don't know what happened."

          Martin Hentschel added a comment -

          No, wait, there seems to be a misconception here.

          Whenever there is a write to a column family for which a trigger has been defined, that trigger is guaranteed to be executed at least once. For that, we use the same mechanics as any replicated write to a column family. In Cassandra, a write is not acknowledged if the write quorum cannot be met. If a write is acknowledged though, the client can be sure the write is durable. The same holds for our trigger framework: if the write has been acknowledged, the client can be sure the trigger will be executed (at least once). As we see it, there is no window for permanent inconsistency. If you could provide us with an example of such a window, we would be more than happy to think about it and make changes to our code as necessary.

          Having at-least-once semantics requires triggers to be implemented idempotently. Exactly-once semantics of trigger execution would merely allow triggers that are not implemented idempotently. This, of course, has advantages, but it does not mean that triggers are executed in a more reliable/guaranteed way.

          Jonathan Ellis added a comment - - edited

          The replicas that are updated will take care of that

          Relying on replication + repair leaves a window for permanent inconsistency when power failure occurs, or even shutdown of a coordinator node at the right [wrong] time, where the base table has been written but the replication messages haven't made it onto the network and the triggers have not been fired. So it may be appropriate for some use cases but it is definitely less powerful than guaranteed trigger invocation.

          Maxim Grinev added a comment -

          The implementation guarantees that triggers will be executed at least once even if the update is only partially executed. The replicas that are updated will take care of that. It means that if a write updates some replicas and the write coordinator crashes before executing triggers and acknowledging the client, the triggers will still be executed (as many times as the number of replicas that were updated). So missing an update is not a problem. Triggers are a good solution for indexing. The only thing triggers are not good for is cases where the trigger procedure is not idempotent. For example, when a trigger increments a counter, the counter will be incremented with the same value more than once if the write coordinator fails.

          David Erickson added a comment - - edited

          I haven't dug into this implementation of triggers, but a use case could be using Cassandra as a shared communication bus amongst distributed nodes. If node 1 makes a change to its Cassandra instance, the data then propagates to the other Cassandra instances, and the triggers alert other nodes that a change has been made and they need to do some processing, which is better than polling for the same changes. Alternatively the nodes would have to have their own protocol to alert each other outside of the database layer that changes have been made.

          Jonathan Ellis added a comment -

          I take it this is best-suited for triggers that are

          • invoked frequently
          • idempotent

          so that missing an update once in a while (because of a server restart after row update but before trigger processing) is not a big deal?

          What use cases do we have for this? Indexing actually does not fit this description, since if you miss updating an index row when the value changes from 4 to 6, firing the trigger for the change from 6 to 8 will not fix the invalid index entry for 4.
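
          A toy sketch of that failure mode (plain Java maps standing in for the base table and the index; the method names are invented for illustration, this is not the patch's code):

          ```java
          import java.util.HashMap;
          import java.util.Map;

          // Toy model: the index maps value -> row key; the trigger must delete the
          // entry for the *old* value before inserting the new one.
          public class StaleIndexDemo {
              static final Map<String, Integer> base = new HashMap<>();   // row -> value
              static final Map<Integer, String> index = new HashMap<>();  // value -> row

              static void writeWithTrigger(String row, int newValue) {
                  Integer old = base.put(row, newValue);
                  if (old != null) index.remove(old); // remove the stale entry
                  index.put(newValue, row);
              }

              static void writeTriggerMissed(String row, int newValue) {
                  base.put(row, newValue); // trigger never fired: index untouched
              }

              public static void main(String[] args) {
                  writeWithTrigger("row1", 4);    // index: {4=row1}
                  writeTriggerMissed("row1", 6);  // missed update: index still {4=row1}
                  writeWithTrigger("row1", 8);    // removes 6 (absent), adds 8
                  System.out.println(index);      // {4=row1, 8=row1} -- 4 is permanently stale
              }
          }
          ```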

          Martin Hentschel added a comment -

          I fixed the points Stu mentioned:

          • untangled findTriggers
          • using MD5 instead of Arrays.hashCode
          • storing dangling triggers in the System Table instead of Java collections

          I want to note that these fixes have introduced some performance penalties. In our benchmarks we observed a performance loss of 22%. According to Maxim, who discussed this point with Stu at the summit, this is still acceptable, since the System Table provides additional durability guarantees in return.

          Maxim Grinev added a comment -

          > An interesting benefit provided by CASSANDRA-1016 is that it has a pre-execution step, allowing the Plugin/Trigger to perform a read if it needs to (for example, to update a compound index which the mutation does not contain all columns for).

          Nothing prevents you from querying Cassandra from within a trigger. It is not atomic, as you mentioned. Having all the information ready in the mutation would be atomic (and faster, because you avoid the query). But once again, you can still issue queries from within a trigger in the same way that you update data from it.

          Maxim Grinev added a comment -

          >> Faster storage of "dangling" triggers at a slave node

          > I mentioned to Maxim at the summit that I think persisting the dangling triggers to the system tables is a good idea...

          Thanks for your comments. We agree with all the points and will send a new patch soon.

          Jonathan Ellis added a comment - - edited

          A quick glance at the implementation suggests it's doing slave hashing based only on the IP address; would this cause problems with hosts running multiple Cassandra instances on a single IP address?

          You already can't have multiple Cassandra instances on a single IP.

          [the original comment I replied to was deleted, so to avoid losing context, I've edited mine to include it.]

          Stu Hood added a comment -

          An interesting benefit provided by CASSANDRA-1016 is that it has a pre-execution step, allowing the Plugin/Trigger to perform a read if it needs to (for example, to update a compound index which the mutation does not contain all columns for).

          Since making the read-then-write step atomic is out of the question, the benefit is questionable (something I think we missed when we were considering 1016). Requiring the user to use mutations containing all fields the trigger might need may be reasonable. Example: for updating the zipcode in an index of "country-zipcode", it would be up to the user to include both fields in the mutation, meaning that they might have to perform a read from their base data first.
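
          To make that concrete, a minimal sketch of the convention (plain Java, with hypothetical column names; not the patch's actual trigger API): the client re-sends the unchanged country column alongside the changed zipcode, so the trigger can assemble the composite key without doing its own non-atomic read.

          ```java
          import java.util.HashMap;
          import java.util.Map;

          // Sketch: the mutation carries both "country" and "zipcode", even if only
          // the zipcode changed, so the trigger can build the composite index key.
          public class CompoundIndexMutationDemo {
              static final Map<String, String> countryZipIndex = new HashMap<>();

              // Hypothetical trigger body: sees only the columns present in the mutation.
              static void compoundIndexTrigger(String rowKey, Map<String, String> mutation) {
                  String country = mutation.get("country");
                  String zipcode = mutation.get("zipcode");
                  if (country == null || zipcode == null)
                      throw new IllegalStateException("mutation must carry both index columns");
                  countryZipIndex.put(country + "-" + zipcode, rowKey);
              }

              public static void main(String[] args) {
                  Map<String, String> mutation = new HashMap<>();
                  mutation.put("country", "CH");   // unchanged value, re-sent by the client
                  mutation.put("zipcode", "8092"); // the column that actually changed
                  compoundIndexTrigger("user42", mutation);
                  System.out.println(countryZipIndex); // {CH-8092=user42}
              }
          }
          ```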

          Stu Hood added a comment -

          > Faster storage of "dangling" triggers at a slave node
          I mentioned to Maxim at the summit that I think persisting the dangling triggers to the system tables is a good idea, because it preserves the guarantee that a full power loss to the cluster does not lose data. I imagine that it should be possible to gain the performance back somehow, since persisting to disk only requires additional writes (and should only require reads in node-failure cases).

          Otherwise, I really like the architecture of this change, and only have minor suggestions:

          • findTriggers is much too deeply nested: also, I'd like to see the bufferSize logic contained there moved into a maybeDrainNotificationBuffer method, or something.
          • I'm not sure Arrays.hashCode is a strong enough indicator of uniqueness: you might want to switch to MD5 (see the sketch after this list)
          • TriggerSlave.deletedTriggers looks fragile (needs to expire old deletions), but switching back to using the system table would resolve that
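
          On the MD5 suggestion above, a minimal sketch using java.security.MessageDigest as a stand-in for Arrays.hashCode (the notification payload shown is invented for illustration):

          ```java
          import java.nio.charset.StandardCharsets;
          import java.security.MessageDigest;
          import java.security.NoSuchAlgorithmException;
          import java.util.Arrays;

          // Sketch: a 128-bit MD5 digest of the notification bytes is a much stronger
          // uniqueness indicator than the 32-bit value returned by Arrays.hashCode.
          public class NotificationDigest {
              static byte[] md5(byte[] payload) {
                  try {
                      return MessageDigest.getInstance("MD5").digest(payload);
                  } catch (NoSuchAlgorithmException e) {
                      throw new AssertionError("MD5 is always available in the JRE", e);
                  }
              }

              public static void main(String[] args) {
                  byte[] payload = "Keyspace1/Standard1/row42".getBytes(StandardCharsets.UTF_8);
                  System.out.println("hashCode: " + Arrays.hashCode(payload));
                  System.out.println("md5 bytes: " + Arrays.toString(md5(payload)));
              }
          }
          ```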
          Martin Hentschel added a comment - - edited

          Uploaded a new patch, improvements to the last release:

          • Updated to Cassandra revision 984391
          • Reduced network traffic of notification messages (by bundling notifications and sending only hash values)
          • Faster storage of "dangling" triggers at a slave node (using Java collections instead of Cassandra system tables because we don't need additional durability guarantees)
          Maxim Grinev added a comment -

          We studied the discussion of CASSANDRA-749

          1) CASSANDRA-749 is about local vs. distributed secondary indexes.
          You decided to start with local indexes but stated that both approaches have their advantages and disadvantages.
          Our triggers make it possible to implement distributed secondary indexes. So, as far as indexing is concerned, triggers complement CASSANDRA-749 with distributed indexes.

          2) Stu Hood proposed to support view (https://issues.apache.org/jira/browse/CASSANDRA-749?focusedCommentId=12829403&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12829403) as a general mechanism for 'advanced' indexing.

          Our triggers are a mechanism to implement exactly this view approach.

          Stu put the advantages really nicely, so I just cite him:

          • "But views are considerably more powerful, since you can store any item in the key or value for the view."
          • "Also, a view is more conducive to duplication of data, which we prefer in Cassandra: rather than having secondary indexes pointing to the one true copy of the data, you can duplicate that data in a view if you'd like, and have it be lazily/eagerly updated serverside."

          Moreover, support for duplicated data makes it possible to map basic SQL operations to Cassandra's data model, as described in http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/

          Also, as a general mechanism, triggers/views can be used for other applications such as online analytics or workflow-like (push) data propagation.

          Martin Hentschel added a comment -

          Thanks for referring to CASSANDRA-1016. We were not aware of it and it is similar to our work. We think the main differences of our work to CASSANDRA-1016 are the following:

          • Triggers are executed asynchronously (Stu Hood mentioned it as lazy execution).
          • We provide a fail-over mechanism. If a node responsible for executing a trigger goes down, the replicas will take over to ensure execution. We implemented at-least-once semantics for triggers.
          • Triggers are set on column families, instead of the whole database (which might have been implemented by 1016 as well now).

          Asynchronous means that triggers will be executed after the client has received the acknowledgment. Hence the response time of a user request is improved (as opposed to CASSANDRA-1016, which we think implements synchronous execution of triggers). The downside of it is that indexes might be out of sync with the base data sometimes. We think that this is fine for many use cases where Cassandra is the database of choice (as explained in our blog posts).
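
          As a rough editorial illustration of "acknowledge first, run the trigger afterwards" (a plain ExecutorService sketch, not the patch's actual scheduling code):

          ```java
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;

          // Sketch: the write path acknowledges the client as soon as the base mutation
          // is durable, then hands trigger execution to a background pool. The client's
          // latency does not include the trigger, at the cost of briefly stale indexes.
          public class AsyncTriggerSketch {
              private static final ExecutorService triggerPool = Executors.newFixedThreadPool(4);

              static void applyWrite(String rowKey, String value, Runnable trigger) {
                  // 1. apply the base mutation (stubbed here) and acknowledge the client
                  System.out.println("ack write " + rowKey + "=" + value);
                  // 2. fire the trigger asynchronously, after the ack
                  triggerPool.submit(trigger);
              }

              public static void main(String[] args) {
                  applyWrite("user42", "alice",
                          () -> System.out.println("index updated for user42"));
                  triggerPool.shutdown();
              }
          }
          ```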

          Jonathan Ellis added a comment -

          Also, the specific application of indexing is better dealt with by CASSANDRA-749 – pushing it down into "native" logic means we can update multiple indexes from a single commitlog entry, ensuring the indexes are consistent with the "master" row.

          Jonathan Ellis added a comment -

          Is this different from CASSANDRA-1016?

          Maxim Grinev added a comment -

          Patch for SVN revision 967053 (July 23)


            People

            • Assignee: Vijay
            • Reporter: Maxim Grinev
            • Reviewer: Jonathan Ellis
            • Votes: 44
            • Watchers: 70
