CASSANDRA-2897: Secondary indexes without read-before-write

      Description

      Currently, secondary index updates require a read-before-write to maintain index consistency. Keeping the index consistent at all times is not necessary, however. We could let the (secondary) index get inconsistent on writes and repair it on reads. This would be easy because on reads we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time.

      This does trade work on writes for work on reads. However, read-before-write is sufficiently costly that it will likely be a win overall.

      There are (at least) two small technical difficulties here, though:

      1. If we repair on read, this will be racy with writes, so we'll probably have to synchronize there.
      2. We probably shouldn't rely only on reads to repair; we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low-impact, though.
      Attachments

      1. 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt (36 kB, Sam Tunnicliffe)
      2. 41ec9fc-2897.txt (35 kB, Philip Jenvey)
      3. 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt (53 kB, Sam Tunnicliffe)
      4. 2897-apply-cleanup.txt (14 kB, Jonathan Ellis)
      5. 0003-CASSANDRA-2897.txt (47 kB, Sam Tunnicliffe)
      6. 2897-v4.txt (83 kB, Jonathan Ellis)


          Activity

          Jonathan Ellis added a comment -

          failing test in SSTableReaderTest

          reopened CASSANDRA-4567 for that.

          fixed javadoc typo and committed.

          Jonathan Ellis added a comment - edited

          the test was definitely failing until I got dummyColumn figured out

          FTR, the problem was

          IColumn dummyColumn = new Column(liveColumn.name(), column.value(), column.timestamp());

          Here, column is the column from the index row, so column.value() is always empty. To actually delete the old index entry we need to use the column value that was indexed:

          IColumn dummyColumn = new Column(baseColumnName, indexedValue, column.timestamp());

          There was a corresponding bug for the non-composite case as well.

          (Edit: liveColumn.name() == baseColumnName, just thought the latter was a bit more clear.)

          Sam Tunnicliffe added a comment -

          Looks like 0525ae25 introduced that test failure.

          Sam Tunnicliffe added a comment -

          LGTM so far, except I have a failing test in SSTableReaderTest. testPersistantStatisticsWithSecondaryIndex errors with the message:

          12/08/30 14:18:00 ERROR sstable.SSTableReader: Cannot open build/test/cassandra/data/Keyspace1/Indexed1/Keyspace1-Indexed1.626972746864617465-ia-2 because partitioner does not match org.apache.cassandra.dht.LocalPartitioner != org.apache.cassandra.dht.ByteOrderedPartitioner
          

          Although it looks like that's failing in the same way against trunk (I'll investigate)

          I note, for the record, that composite indexes make my head hurt (CASSANDRA-4586).

          Good, me too.

          I further note that finding the wrong column value being used to create dummyColumn in the index-stale block was a bitch. Not sure how your new tests passed with that. Two bugs cancelling out, I guess.

          Hmm, odd, because that had me stumped for quite a while too, and the test was definitely failing until I got dummyColumn figured out (or thought I had).

          One insignificant nit - the Javadoc comment for PerRowSecondaryIndex.index has a typo.

          Jonathan Ellis added a comment -

          v4 pushes all index updates into the "helper" closure, renamed to SecondaryIndexManager.Updater. This cleans up Table.apply even more (no more looping to create a redundant Map of updated columns), and allows index maintenance during compaction relatively cleanly – this is added for the first time here.
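
           As a rough illustration (method names assumed, not lifted from the patch), the Updater shape is essentially a per-column callback, so the write and compaction paths never need to build an intermediate collection of updated columns:

           import org.apache.cassandra.db.IColumn;

           // Hypothetical sketch of the Updater callback; the real interface lives in the patch.
           public interface Updater
           {
               // a brand-new column was written: add an index entry for it
               void insert(IColumn column);

               // an existing column was replaced: drop the stale entry, index the new value
               void update(IColumn oldColumn, IColumn newColumn);

               // a column was discarded (e.g. merged away during compaction): remove its entry
               void remove(IColumn column);
           }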

          I note, for the record, that composite indexes make my head hurt (CASSANDRA-4586).

          I further note that finding the wrong column value being used to create dummyColumn in the index-stale block was a bitch. Not sure how your new tests passed with that. Two bugs cancelling out, I guess. (Similarly, dummyColumn needed to be introduced in KeysSearcher since just using the index column is wrong even for non-composites, since delete expects a base-data column.)

          I await news of the new bugs I've introduced.

          Sam Tunnicliffe added a comment -

          Fair enough, though it should be noted that a new instance of SecondaryIndexRepairHelper is created in Memtable.resolve. In this latest patch (0003-CASSANDRA-2897.txt) we avoid this when no indexes are defined for the CF. The latest patch adds @jbellis' cleanup to my original (0001) patch and includes some refactorings from 0002.

          Jonathan Ellis added a comment - edited

          I prefer Philip's method of pushing down the resolution of indexed values from Table.apply

          Hmm, I disagree. The problem is that Phil's method does a lot of extra allocation, even when no indexed columns are updated. (And even when we just need the size delta, we move from a no-allocation long to a Pair.)

          So I think we need to push the index management down into ACC.

          Somewhat orthogonally, attached is a patch (that applies on top of your latest) to clean out unnecessary code from the apply path; we don't need obsolete indexed columns anymore, and deleting an indexed range doesn't need special casing either.

          Philip Jenvey added a comment -

          Also, we can consider removing addAllWithSizeDelta entirely if backwards compatibility isn't a concern.

          Sam Tunnicliffe added a comment -

          I prefer Philip's method of pushing down the resolution of indexed values from Table.apply; IMHO the changes required in his version are less intrusive and result in a cleaner and clearer API. I've added a third patch which merges the two previous ones; it should apply to trunk (trunk is broken at the moment, so I can't verify for sure) and incorporates Philip's pushdown code with my changes to the IndexSearcher implementations. It also has my changes to SchemaLoader & CFMetaDataTest and merges both versions of ColumnFamilyStoreTest.

          Philip Jenvey added a comment -

          Here's an alternative patch that also tackles just the non-compaction changes (it's a little stale, against 41ec9fc)

          Briefly looking at Sam's version, I'll note that:

          o Mine handles entire row deletions in Memtable

          o but it lacks changes to CompositesSearcher/SchemaLoader/CFMetaDataTest (though I'm not familiar with these code paths, either)

          o in KeysSearcher, I very likely should be using the compare method from getValueValidator to check for staleness (instead of naively just calling equals)

          Sam Tunnicliffe added a comment -

          Initial patch to remove the synchronized read-before-write in Table.apply for per-column secondary indexes (it's still required for per-row indexes). As far as repairing an out-of-date index goes, the iterators created in the two SecondaryIndexSearcher implementations now validate indexed values against those stored in the primary cf. If these don't match, the indexed value is removed (not updated). We also now check for indexed values when Memtable replaces a column and delete obsoleted index values. There's nothing in this patch to handle bulk repairs during compaction.
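
          As a rough sketch of the staleness check those iterators perform (the helper name and the plain equals() comparison are assumptions, not the patch's code; see Philip's note above about using the value validator's compare method instead):

          import java.nio.ByteBuffer;
          import org.apache.cassandra.db.IColumn;

          // Sketch only: 'indexedValue' is the value recorded in the index row, 'liveColumn' is what
          // the primary CF currently holds for the indexed column (null if it no longer exists).
          final class IndexStaleness
          {
              static boolean isStale(ByteBuffer indexedValue, IColumn liveColumn)
              {
                  return liveColumn == null
                      || liveColumn.isMarkedForDelete()
                      || !liveColumn.value().equals(indexedValue);
              }
          }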

          Edward Capriolo added a comment -

          So I hit this a bit in my Casbase project. In the end, it's the user's choice what they want:

          https://github.com/edwardcapriolo/casbase

           Table.IndexRepair
           -- REPAIR_ON_READ: correct tables on each read
           -- REPAIR_ON_WRITE: read before write on insert, invalidates indexes
           -- REPAIR_NONE: take no action; assumes no deletes or overwrites, or that the user will handle it / not care
          

          Repair on read is the most interesting case to this discussion.

          Someone may issue a query like this, which seemingly would never repair:

          select * from cf1 where state='TX'
          

          That is not exactly true, because to construct the * result set one has to go to cf1 and pull back the matching rows.

          I.e., you have an index on state='TX' that only gives you a result set of:

           pk                   | state
           row 1 in other table | 'TX'
           row 2 in other table | 'TX'
          

          But the user wants all the columns of row 1, so you are going to be reading those to build the final result.
          If you read row 1 and find that it no longer exists, you can fix the index in that case.

          Now if the user just asked:

          select pk,state from cf1 where state='TX'
          

          In this case you could answer this entire question from the index and might get stale results, because you do not yet know if PK was deleted.

          I think exposing knobs like REPAIR_ON_READ, REPAIR_ON_WRITE, REPAIR_NONE is the way to go. Many use cases may never modify or overwrite a row so the entire 'repair' is not needed.

          Then again, I am a power user who does not expect Cassandra to work like a relational database; maybe most people using secondary indexes do.
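
          For illustration only, the knob described above could look roughly like this (names taken from the Casbase example; this is not an API Cassandra exposes):

          // Sketch of the per-table repair knob from the Casbase example above; purely illustrative.
          public enum IndexRepair
          {
              REPAIR_ON_READ,   // validate and fix stale index entries as query results are read
              REPAIR_ON_WRITE,  // classic read-before-write: invalidate old entries at insert time
              REPAIR_NONE       // take no action; assumes no deletes/overwrites, or the user handles it
          }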

          Jeremy Hanna added a comment -

          So you're saying that it won't persist forever (as per your 18 May comment) because of the index delete that you would add to the memtable update. That sounds fair. We were just talking about the 18 May comment and thinking that would be fine if there was either some automated regular cleanup or at least a nodetool-type command to clean it up. Otherwise, to clean it up you'd have to delete and re-create the index or, less intrusively, do a full scan of the index.

          Jonathan Ellis added a comment - edited

          this doesn't work for us, since (unlike Bigtable) we don't make an effort to preserve all older versions of a column on disk

          We can fix this without having to go full-on Bigtable with value retention. "All" we need to do is have the memtable update code special case replacements in the CF map to issue an index delete against the replaced value. Messy, but not as messy as having to maintain two KEYS index implementations.

          So, we can add that as step 2.5 to my list above and we should be good.
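
          A minimal sketch of that step 2.5 (indexInsert/indexDelete are hypothetical stand-ins, not the actual memtable or index-manager API); the only point is that replacing a column in the CF map also tombstones the replaced value's index entry:

          import org.apache.cassandra.db.IColumn;

          abstract class MemtableIndexHook
          {
              abstract void indexInsert(IColumn column);   // hypothetical: add an index entry
              abstract void indexDelete(IColumn column);   // hypothetical: tombstone an index entry

              // called when a write replaces an existing column in the memtable's CF map
              void onColumnReplaced(IColumn oldColumn, IColumn newColumn)
              {
                  if (oldColumn != null && !oldColumn.value().equals(newColumn.value()))
                      indexDelete(oldColumn);   // the replaced value's index entry is now stale
                  indexInsert(newColumn);       // index the newly written value as before
              }
          }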

          Jonathan Ellis added a comment -

          it generates both new, correct index entries, AND tombstones for old, invalid ones (the column entries in the parent CF that get discarded during compaction)

          Actually, this doesn't work for us, since (unlike Bigtable) we don't make an effort to preserve all older versions of a column on disk; we'll happily throw away old values that got overwritten in the memtable without a trace.

          It may be that the best we can do is an alternate index implementation that does the "purge obsolete entries at read time" but makes no attempt to purge otherwise (which would make it only suitable for insert-mostly datasets or read-mostly datasets).

          Jeremy Hanna added a comment -

          That said - our experience may be unique with the 3 secondary indexes. It makes sense it would affect the write path but it was surprising that 3 would have that much of an effect. Also the proposed implementation seems to be closer to the philosophy of Cassandra generally - crazy fast write path (or simply don't slow the write path down), then sort out some of it on the read path.

          Jeremy Hanna added a comment -

          fwiw - we found the read-before-write to be pretty much debilitating for write throughput. We have a total of 5 secondary indexes, 3 on one column family. We looked at iotop while doing heavy writes on the column family, and the main IO that was happening was on Cassandra reads. We're auditing which secondary indexes we really need because it affects write throughput that much.

          In other words, resolving this ticket would make it so secondary indexes were no longer a compromise wrt write performance.

          Jonathan Ellis added a comment -

          To summarize, I think an implementation of this would look something like this:

          1. Get rid of the synchronized oldIndexColumns code in Table.apply; on write, all we need to do is add an index entry for the newly-written value
          2. Index read code (ColumnFamilyStore.getIndexedIterator) will need to double-check the rows returned by the index to make sure the column value still matches the indexed one; if it does not, delete the index entry so we don't keep making the same mistake
           3. (The hard part, as described in the immediately previous comment) Change AbstractCompactionRow write implementations to delete old index entries as well; that is, we create index tombstones for each column value that is NOT the one retained after the compaction merge. Specifically, PrecompactedRow::merge and LazilyCompactedRow::Reducer (see the sketch after this list).
           4. Existing index tests (in ColumnFamilyStoreTest::testIndexDeletions and ::testIndexUpdate) are fine for parts 1-2, but we should add a new test for 3 to make sure that index-update-on-compaction works as advertised.
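
           As a hypothetical sketch of step 3 above (the Indexer callback and method names are illustrative, not the eventual patch): during the compaction merge, every version of an indexed column that is not the retained one becomes an index tombstone.

           import java.util.List;
           import org.apache.cassandra.db.IColumn;

           final class CompactionIndexCleanup
           {
               interface Indexer { void remove(IColumn column); }

               static IColumn mergeIndexed(List<IColumn> versions, Indexer indexer)
               {
                   IColumn retained = versions.get(0);
                   for (IColumn c : versions)
                       if (c.timestamp() > retained.timestamp())
                           retained = c;             // standard reconcile: newest timestamp wins
                   for (IColumn c : versions)
                       if (c != retained)
                           indexer.remove(c);        // each losing version gets an index tombstone
                   return retained;
               }
           }
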
          Jonathan Ellis added a comment -

          Doug added over on the Hypertable post,

          In Hypertable, the way deletes are handled is by inserting delete records (tombstones), so during compaction the secondary index is purged of stale entries by bulk inserting a bunch of delete records. Since Hypertable is essentially a LSM tree, bulk inserts are very efficient and require no random i/o.

          I think I understand: it generates both new, correct index entries, AND tombstones for old, invalid ones (the column entries in the parent CF that get discarded during compaction).

          That's a fair bit of work for us, to change compaction to expose more than just the surviving value, but doable.

          I like this idea; it should lower the overhead of indexes a lot, even for SSD deployments. (The read-before-write that the current implementation requires involves extra locking, as well as the read itself.)

          Jonathan Ellis added a comment -

          We probably shouldn't rely only on reads to repair; we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low-impact, though

          This is what I'm stuck on, too.

          FWIW Hypertable takes this approach (http://www.hypertable.com/blog/secondary_indices_have_arrived/) and doesn't have a magic wand; they just do brute force random i/o during compaction to check for stale index data.

          I'm inclined to think that we could get by with the update-on-read approach, plus a manual index rebuild (which we already have) as a Big Hammer if it gets too far behind.

          If we can estimate a staleness factor, we could potentially even kick that off automatically.
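
          A back-of-the-envelope sketch of what estimating a staleness factor could look like (names and the threshold are purely illustrative): count how often the read path has to repair an entry, and trigger the existing rebuild once the ratio crosses some threshold.

          import java.util.concurrent.atomic.AtomicLong;

          final class IndexStalenessTracker
          {
              private final AtomicLong lookups = new AtomicLong();
              private final AtomicLong repairs = new AtomicLong();

              void onIndexLookup() { lookups.incrementAndGet(); }
              void onReadRepair()  { repairs.incrementAndGet(); }

              // checked periodically; a high repair ratio suggests the index has drifted enough
              // to justify kicking off the existing manual rebuild automatically
              boolean shouldRebuild()
              {
                  long n = lookups.get();
                  return n > 10_000 && repairs.get() > n / 5;   // more than ~20% of lookups were stale
              }
          }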

          Sylvain Lebresne added a comment -

          If the index isn't up to date and index clauses are the primary way of pulling rows, then this data will never be repaired.

          Well, right now I'm pretty sure we always query the actual rows (including at least the 'state' column). But while arguably this could be optimized out in our current scheme (if we know that the predicate is provably empty), it would not be optimizable in the one proposed here. It's clearly a trade-off. I do believe it would likely be a win overall (without having any certainty, though), but I'm biased in that I hate that synchronized read-before-write thing.

          T Jake Luciani added a comment -

          Sylvain, this seems problematic since clients often use indexes to identify rows to read, like:
          select * from cf1 where state='TX'

          If the index isn't up to date and index clauses are the primary way of pulling rows, then this data will never be repaired.

          Stu Hood added a comment -

          CASSANDRA-1472 implements an index without read-before-write, and has some code in patch 0011 to make secondary indexes pluggable.

          Sylvain Lebresne added a comment -

          I suppose a timestamp tie would be problematic, but we could easily not care about that. What is more problematic is the case where you read the row and get nothing back at all (no tombstone) for the indexed column. You don't have any timestamp to use then.

          Ryan King added a comment -

          Can't we deal with the races by properly using timestamps?


            People

              Assignee: Sam Tunnicliffe
              Reporter: Sylvain Lebresne
              Reviewer: Jonathan Ellis
              Votes: 3
              Watchers: 16
