CASSANDRA-14323

Same timestamp insert conflict resolution breaks row-level data consistency

Details

    • Bug
    • Status: Resolved
    • Low
    • Resolution: Won't Fix
    • None
    • Legacy/Core
    • None
    • Low

    Description

      When multiple inserts to the same primary key carry the same timestamp, the memtable update logic does not maintain row-level consistency for that key. For example:

      CREATE TABLE test.consistency (pk int PRIMARY KEY, nk1 text, nk2 text);
      BEGIN UNLOGGED BATCH USING TIMESTAMP 1521080773000
      INSERT INTO test.consistency (pk, nk1, nk2) VALUES (2, 'nk1', 'nk2');
      INSERT INTO test.consistency (pk, nk1, nk2) VALUES (2, 'nk2', 'nk1');
      APPLY BATCH;
      SELECT * FROM test.consistency;
      

      In this case, I would expect one insert to overwrite the other as a whole, so the read would return either

      2, nk1, nk2

      or

      2, nk2, nk1

      but the row retrieved is

      2, nk2, nk2

      which breaks the consistency of the writes, since it matches neither insert. This behavior comes from the following logic:

      https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Conflicts.java#L45

      There, it appears that the value of the cell itself is used to resolve the overwrite conflict when timestamps are equal. I don't think this is the correct way to handle the situation. Shouldn't the decision be the same, either overwrite or not overwrite, in all cases?
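
A minimal sketch of how the reported result can arise, assuming a simplified model in which each cell is reconciled independently and a timestamp tie is broken in favor of the greater value. The names `resolve_cell` and `merge_rows` are hypothetical and are not Cassandra APIs; tombstones, TTLs and other details of the real reconciliation are ignored.

```python
def resolve_cell(left, right):
    """Return the winning (timestamp, value) cell under the assumed rule."""
    left_ts, left_val = left
    right_ts, right_val = right
    if left_ts != right_ts:
        # The higher timestamp wins outright.
        return left if left_ts > right_ts else right
    # Timestamp tie (assumption): the greater value, compared as bytes, wins.
    return left if left_val.encode() >= right_val.encode() else right


def merge_rows(a, b):
    """Reconcile two versions of a row one column at a time."""
    return {col: resolve_cell(a[col], b[col]) for col in a}


TS = 1521080773000
first = {"nk1": (TS, "nk1"), "nk2": (TS, "nk2")}   # VALUES (2, 'nk1', 'nk2')
second = {"nk1": (TS, "nk2"), "nk2": (TS, "nk1")}  # VALUES (2, 'nk2', 'nk1')

merged = merge_rows(first, second)
result = {col: value for col, (_, value) in merged.items()}
print(result)  # {'nk1': 'nk2', 'nk2': 'nk2'}
```

Because the tie is resolved per column rather than per row, each column keeps its own greater value, and the merged row matches neither of the two inserts.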

          People

            Assignee: Unassigned
            Reporter: Rishi Kathera (rishikthr)