Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Fix Version/s: 0.8 beta 1
    • Component/s: Core
    • Labels:
      None

      Description

      As discussed at the Digg-hosted hackathon.

      First off, this needs a better name; the idea isn't exactly like coprocessors from BigTable. This entry should be considered a stub for now (Stu and Marius should be able to provide more details).

      The idea is that for mutation operations, we should allow the user to run a routine that has access to the "old" version of the data and the "new" version, and can take action.

      At a bare minimum, this should be capable of implementing distributed secondary indexes.

      1. CASSANDRA-1016.patch
        7 kB
        Jeff Hodges
      2. CASSANDRA-1016-2.patch
        21 kB
        Jeff Hodges

        Issue Links

          Activity

          Jeff Hodges added a comment - edited

          Named them Callbacks. WriteCallbacks, specifically.

          This is a 0.0.1 implementation of them. The ArrayList of WriteCallbacks is iterated before the write is done (calling WriteCallback#beforeWrite(RowMutation, Table) on each) and after (calling WriteCallback#afterWrite(RowMutation, Table)).

          This does no filtering so even system table writes will be caught by it.

          A singleton DatabaseDescriptor.callbackRunner is created and used.

          No effort has been made toward an easy API for new Message creation (e.g. making it easy to send out new, different writes to the cluster).

          No effort has been made to avoid slowdowns in the write path. ThreadPoolExecutors would probably be a good idea. The CallbackRunner is a good place to put such work.

          It would probably be preferable that the WriteCallback and CallbackRunner were in a package other than org.apache.cassandra.db. This will have to be evaluated as we add features and other classes.

          The WriteCallback class is just a simple class; it should probably be an interface. My concern with that is that we assume the default no-args constructor works correctly when we load them from the config file.
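          The hook points described above might look roughly like the following self-contained sketch. This is an assumption-laden illustration, not the patch itself: RowMutation and Table are stubbed stand-ins for the Cassandra classes, and CallbackRunner's register/runWrite methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

// Stub stand-ins for Cassandra's RowMutation and Table (hypothetical).
class RowMutation { final String key; RowMutation(String key) { this.key = key; } }
class Table { final String name; Table(String name) { this.name = name; } }

// The described hook points: called before and after each write.
interface WriteCallback {
    void beforeWrite(RowMutation rm, Table table);
    void afterWrite(RowMutation rm, Table table);
}

// Minimal CallbackRunner: iterates the registered callbacks around a write.
class CallbackRunner {
    private final List<WriteCallback> callbacks = new ArrayList<>();

    void register(WriteCallback cb) { callbacks.add(cb); }

    void runWrite(RowMutation rm, Table table, Runnable actualWrite) {
        for (WriteCallback cb : callbacks) cb.beforeWrite(rm, table);
        actualWrite.run();  // the real write path would go here
        for (WriteCallback cb : callbacks) cb.afterWrite(rm, table);
    }
}
```

          As described, nothing here filters by table or CF, so every mutation (including system-table writes) would pass through every registered callback.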

          Stu Hood added a comment -

          Thanks for taking the initiative here!

          Would it be better to configure Callbacks per CF? Otherwise each Callback would need to have conditionals to ignore CFs they weren't interested in, and they would need to store whatever state they were concerned with on a per-cf basis.

          Also, as jbellis would say: this needs a raison d'être. Even a really simple example would be helpful, but perhaps the best way to determine exactly what the interface needs would be to open another ticket to begin implementing toy distributed indexes.

          Stu Hood added a comment -

          Also, I'm fairly certain Callbacks will need parameters. If distributed indexes are going to be exposed as column families, specifying a name would be reasonable, and you'd need to indicate the indexed field somehow.

          Since adding CFs via the config file is deprecated anyway, parameters to a Callback could be a string->bytes map, to be passed in via Thrift?
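          A string->bytes parameter map as suggested could look like this minimal sketch. Everything here is hypothetical: IndexPlugin and the "index_name"/"indexed_column" keys are illustrative names, not anything defined in the patch or in Thrift.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical parameterized plugin configured from a string->bytes map,
// as it might arrive via Thrift.
class IndexPlugin {
    private final String indexName;
    private final String indexedColumn;

    IndexPlugin(Map<String, byte[]> params) {
        // Decode the raw bytes for the parameters this plugin cares about.
        this.indexName = asString(params.get("index_name"));
        this.indexedColumn = asString(params.get("indexed_column"));
    }

    private static String asString(byte[] value) {
        return value == null ? null : new String(value, StandardCharsets.UTF_8);
    }

    String indexName() { return indexName; }
    String indexedColumn() { return indexedColumn; }
}
```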

          Jonathan Ellis added a comment -

          Yes, at the very least we need something that uses this before committing. Hand-waving that "we should be able to implement X with this" doesn't cut it.

          (To be more specific: we need an X, and we also need a good reason that we're not committing premature generalization, if X could be done more simply without the "framework" bits.)

          Stu Hood added a comment -

          I'm extremely curious about the progress here: are you guys working on this in a public branch anywhere?

          marius a. eriksen added a comment -

          Hey Stu - Jeff and I got together today and reworked the patch a bit, including writing a trivial distributed indexer.

          I think a better name for this is simply "Plugins" (and we've named it as such in the latest patch).

          The code is at:

          http://github.com/mariusaeriksen/cassandra

          The current toy indexer simply inverts the mutation in the same CF:

          [default@foo] set cf1['row3']['col'] = 'HEYOYO'
          Value inserted.
          [default@foo] get cf1['HEYOYO']['index']
          => (column=696e646578, value=row3, timestamp=0)

          Jan Kantert added a comment - edited

          Hi Jeff,

          Why did you implement this in db.Table? Why didn't you implement it in service.StorageProxy?

          Correct me if I'm wrong: with replication factor N=3 and a quorum write, the "coprocessor" will get executed redundantly on every write (once on each replica node). Where is the advantage of this solution?

          In my index implementation I hook into service.StorageProxy's mutate and mutateBlocking for writes. Are there any disadvantages to adding it there?

          Regards,
          Jan

          Jeff Hodges added a comment -

          Plugin patch. Includes contrib naive distributed indexer example.

          Jeff Hodges added a comment -

          Renaming Coprocessors to Plugins.

          Jeff Hodges added a comment -

          The assumption that we want to drop anything coming into an overloaded Plugin TPE is probably sound, but I am writing this comment solely to spark discussion in case others think otherwise.

          I certainly think that using a caller-runs policy for it is a bad idea. I would prefer not to slow down the write path on a Plugin overload, and drop data instead. We're all going to have to run consistency checking programs for these kinds of things, anyhow.

          Of course, I would accept patches that make it configurable.
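          The drop-on-overload behavior described here corresponds to java.util.concurrent's DiscardPolicy (versus the CallerRunsPolicy being rejected above, which would push work back onto the write path). A minimal sketch, assuming nothing about the patch's actual executor configuration; PluginExecutorDemo and its sizes are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// One worker thread, a one-slot queue, and DiscardPolicy: when the pool is
// saturated, extra tasks are silently dropped instead of running on the
// caller's (write-path) thread.
class PluginExecutorDemo {
    static int runSaturated(int submitted) throws Exception {
        AtomicInteger executed = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.DiscardPolicy());
        for (int i = 0; i < submitted; i++) {
            pool.execute(() -> {
                try {
                    release.await();  // hold the worker so the pool saturates
                } catch (InterruptedException ignored) { }
                executed.incrementAndGet();
            });
        }
        release.countDown();  // let the accepted tasks finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return executed.get();  // one running + one queued; the rest were dropped
    }
}
```

          With this configuration, submitting ten tasks while the worker is blocked executes only two: the one handed to the worker and the one queued slot, with the other eight discarded silently.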

          Jeff Hodges added a comment -

          Jan, is your index implementation up somewhere?

          Edward Ribeiro added a comment - edited

          Jeff,

          What about using a CopyOnWriteArrayList<WriteCallback> instead of an ArrayList?

          If the writeCallbacks list is read-heavy, then CopyOnWriteArrayList is better because 1) you get a concurrent list cheaply, and 2) you avoid ConcurrentModificationException ever being thrown. Just a suggestion.

          Regards,
          Ed
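          The suggestion can be demonstrated directly. In this sketch (CowDemo is an illustrative name, unrelated to the patch), mutating a plain ArrayList during iteration fails fast, while CopyOnWriteArrayList iterates over an immutable snapshot:

```java
import java.util.ConcurrentModificationException;
import java.util.List;

// Returns true if adding to the list while iterating it throws
// ConcurrentModificationException. ArrayList's iterator fails fast;
// CopyOnWriteArrayList's iterator walks a snapshot and never throws.
class CowDemo {
    static boolean throwsCme(List<String> list) {
        list.add("cb1");
        list.add("cb2");
        try {
            for (String cb : list) {
                list.add("cb3");  // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

          The trade-off is that CopyOnWriteArrayList copies the backing array on every mutation, which is cheap here because callbacks are registered rarely and read on every write.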

          Stu Hood added a comment -

          Resolving as a dupe of CASSANDRA-1311, which is much further along. Thanks for the initial work here!


            People

            • Assignee:
              Jeff Hodges
            • Reporter:
              Ryan King
            • Votes:
              6
            • Watchers:
              20
