CASSANDRA-876

Support session (read-after-write) consistency

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Fix Version/s: None
    • Component/s: Core
    • Labels:

      Description

      In http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html and http://www.allthingsdistributed.com/2008/12/eventually_consistent.html Amazon discusses the concept of "eventual consistency." Cassandra uses eventual consistency in a design similar to Dynamo.

      Supporting session consistency would be useful and relatively easy to add: we already have the concept of a Memtable (see http://wiki.apache.org/cassandra/MemtableSSTable ) to "stage" updates in before flushing to disk; if we applied mutations to a session-level memtable on the coordinator machine (that is, the machine the client is connected to), and then did a final merge from that table against query results before handing them to the client, we'd get it almost for free.
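The merge described above can be sketched roughly as follows. This is a minimal illustration, not the code from either attached patch: `Cell`, `SessionMemtable`, `apply`, and `mergeInto` are hypothetical names, columns are reduced to string keys, and conflicts are resolved by "highest timestamp wins", with a null value standing in for a tombstone.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical column value with a timestamp; a null value marks a delete (tombstone). */
final class Cell {
    final String value;
    final long timestamp;

    Cell(String value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }
}

/** Hypothetical per-session staging area for writes made through one client connection. */
final class SessionMemtable {
    private final Map<String, Cell> cells = new HashMap<>();

    /** Record a write (or a delete, when value is null), keeping the newest timestamp. */
    void apply(String column, String value, long timestamp) {
        Cell existing = cells.get(column);
        if (existing == null || timestamp >= existing.timestamp) {
            cells.put(column, new Cell(value, timestamp));
        }
    }

    /** Overlay session writes on the replica result; the higher timestamp wins per column. */
    Map<String, Cell> mergeInto(Map<String, Cell> replicaResult) {
        Map<String, Cell> merged = new HashMap<>(replicaResult);
        for (Map.Entry<String, Cell> entry : cells.entrySet()) {
            Cell fromReplica = merged.get(entry.getKey());
            if (fromReplica == null || entry.getValue().timestamp >= fromReplica.timestamp) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }
}
```

In this sketch the coordinator would call `apply` for every mutation it forwards and `mergeInto` on each read result before returning it. Tombstones are kept in the merged map, so a caller would still need to filter out cells whose value is null.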

      Of course, the devil is in the details; thrift doesn't provide any hooks for session-level data out of the box, but we could do this with a threadlocal approach fairly easily. CASSANDRA-569 has some (probably out of date now) code that might be useful here.
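A threadlocal approach along these lines might look like the sketch below. It assumes that the server dedicates one thread to each client connection for the life of that connection. `SessionState` and its methods are hypothetical names for illustration, and mutations are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for per-connection session state, keyed by the serving thread. */
final class SessionState {
    // Each thread sees its own list; this only works if one thread serves one connection.
    private static final ThreadLocal<List<String>> PENDING_WRITES =
            ThreadLocal.withInitial(ArrayList::new);

    private SessionState() {
    }

    /** Remember a mutation issued on the current thread's connection. */
    static void recordWrite(String mutation) {
        PENDING_WRITES.get().add(mutation);
    }

    /** Return a copy of the writes recorded on the current thread. */
    static List<String> pendingWrites() {
        return new ArrayList<>(PENDING_WRITES.get());
    }

    /** Drop the state, e.g. when the connection closes and the thread is reused. */
    static void clear() {
        PENDING_WRITES.remove();
    }
}
```

Because nothing in this sketch ever trims the list, it grows with every write until `clear()` is called, and a pooled thread that is reused without `clear()` would expose one client's writes to the next.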

      1. 876-v2.txt
        47 kB
        Jonathan Ellis
      2. CASSANDRA-876.patch
        53 kB
        Brian Palmer

        Activity

        Jonathan Ellis added a comment -

        (We're now preserving keyspace information between calls in a threadlocal.)

        Jonathan Ellis added a comment -

        Memtables are currently dealt with from Table.apply (for writes) and ColumnFamilyStore.getColumnFamily (for reads).

        Brian Palmer added a comment -

        Here's my first take on session consistency. I've tested it manually, and I'm currently digging through the unit and system tests to work out the best way to write automated tests for the new functionality.

        I'm pretty sure this doesn't work properly against range queries yet, since I didn't touch StorageProxy.getRangeSlice(). I'm also not positive it works properly with deletes (tombstones) in the session Memtable; I'll make sure one of the tests exercises that scenario.

        Let me know if I'm on the right track. It took me a bit of time to work out how all the various layers in the o.a.c.db package interact.

        Jonathan Ellis added a comment -

        Rebased and committed with minor changes.

        Jonathan Ellis added a comment -

        Forgot about range slice support; that should be added before committing. Reopening, and I will attach my edits.

        Hudson added a comment -

        Integrated in Cassandra #518 (See https://hudson.apache.org/hudson/job/Cassandra/518/)
        per-connection read-your-writes "session" consistency. patch by Brian Palmer; reviewed by jbellis for CASSANDRA-876

        Muhammad Adel added a comment (edited) -

        Is this issue still open for the latest version of Cassandra? As far as I understand from reading the documentation and articles about Memtables, they are already searched for data before the SSTables when performing a query.

        Jonathan Ellis added a comment -

        Going to close as wontfix; I don't see "be careful not to do too many writes in your session or you'll OOM because we're saving them for CL.RYW" as a fun explanation to give people.


          People

          • Assignee: Unassigned
          • Reporter: Jonathan Ellis
          • Reviewer: Jonathan Ellis
          • Votes: 8
          • Watchers: 12
