Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-18422

CEP-15 (Accord) Original and recover coordinators may hit a race condition with PreApply where reads and writes are interleaved, causing one of the coordinators to see the writes from the other

    XMLWordPrintableJSON

Details

    Description

      While verifying CASSANDRA-18364 I saw the following history violation in simulator

      [junit-timeout] Testcase: simulationTest(org.apache.cassandra.simulator.test.ShortAccordSimulationTest)-.jdk1.8:        Caused an ERROR
      [junit-timeout] Failed on seed 0xadaca81151490353
      [junit-timeout] org.apache.cassandra.simulator.SimulationException: Failed on seed 0xadaca81151490353
      [junit-timeout] Caused by: java.lang.AssertionError: History violations detected
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.PaxosSimulation.logAndThrow(PaxosSimulation.java:315)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.PaxosSimulation.isDone(PaxosSimulation.java:278)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.PaxosSimulation$2.hasNext(PaxosSimulation.java:249)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.PaxosSimulation.run(PaxosSimulation.java:224)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.AbstractPairOfSequencesPaxosSimulation.run(AbstractPairOfSequencesPaxosSimulation.java:297)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.PairOfSequencesAccordSimulation.run(PairOfSequencesAccordSimulation.java:62)
      [junit-timeout]         at org.apache.cassandra.simulator.SimulationRunner$Run.run(SimulationRunner.java:374)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.AccordSimulationRunner$Run.run(AccordSimulationRunner.java:39)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.AccordSimulationRunner$Run.run(AccordSimulationRunner.java:30)
      [junit-timeout]         at org.apache.cassandra.simulator.SimulationRunner$BasicCommand.run(SimulationRunner.java:355)
      [junit-timeout]         at org.apache.cassandra.simulator.paxos.AccordSimulationRunner.main(AccordSimulationRunner.java:76)
      [junit-timeout]         at org.apache.cassandra.simulator.test.ShortAccordSimulationTest.simulationTest(ShortAccordSimulationTest.java:32)
      [junit-timeout]         Suppressed: org.apache.cassandra.simulator.paxos.HistoryViolation: Inconsistent sequences on 1: [2, 0, 1, 6, 8, 9, 13, 14, 16, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 40, 43, 47, 48, 49, 54, 56, 57, 58, 60, 64, 68, 70, 71, 74, 76, 79, 80, 83, 85, 87, 87] vs [2, 0, 1, 6, 8, 9, 13, 14, 16, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 40, 43, 47, 48, 49, 54, 56, 57, 58, 60, 64, 68, 70, 71, 74, 76, 79, 80, 83, 85, 87]+90
      [junit-timeout]                 at accord.verify.StrictSerializabilityVerifier$Register.updateSequence(StrictSerializabilityVerifier.java:607)
      [junit-timeout]                 at accord.verify.StrictSerializabilityVerifier$Register.access$100(StrictSerializabilityVerifier.java:576)
      [junit-timeout]                 at accord.verify.StrictSerializabilityVerifier.apply(StrictSerializabilityVerifier.java:825)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.StrictSerializabilityValidator$1.lambda$close$0(StrictSerializabilityValidator.java:66)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.StrictSerializabilityValidator.convertHistoryViolation(StrictSerializabilityValidator.java:89)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.StrictSerializabilityValidator.access$200(StrictSerializabilityValidator.java:27)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.StrictSerializabilityValidator$1.close(StrictSerializabilityValidator.java:66)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.LoggingHistoryValidator$1.close(LoggingHistoryValidator.java:63)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.PairOfSequencesAccordSimulation$ReadWriteOperation.verify(PairOfSequencesAccordSimulation.java:218)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.PaxosSimulation$Operation.accept(PaxosSimulation.java:135)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.PairOfSequencesAccordSimulation$ReadWriteOperation.accept(PairOfSequencesAccordSimulation.java:171)
      [junit-timeout]                 at org.apache.cassandra.simulator.paxos.PaxosSimulation$Operation.accept(PaxosSimulation.java:83)
      [junit-timeout]                 at org.apache.cassandra.simulator.systems.SimulatedActionCallable$1.run(SimulatedActionCallable.java:47)
      [junit-timeout]                 at org.apache.cassandra.simulator.systems.InterceptingExecutor$InterceptingPooledExecutor$WaitingThread.lambda$new$1(InterceptingExecutor.java:317)
      [junit-timeout]                 at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
      [junit-timeout]                 at java.lang.Thread.run(Thread.java:750)
      

      Adding logging to track message passing, reads, and writes, I have the following ordering

      [isolatedExecutor:3]  node3 2023-04-03 12:54:30,200 send(/127.0.0.1:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ))
      [isolatedExecutor:3]  node3 2023-04-03 12:54:30,200 send(/127.0.0.2:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ))
      [isolatedExecutor:3]  node3 2023-04-03 12:54:30,200 send(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ))
      [CommandStore[2]:1]   node1 2023-04-03 12:54:30,208 CS:[2] OP:0xea64a268 reply(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ), (from:/127.0.0.1:7012, type:REQUEST_RESPONSE verb:ACCORD_PREACCEPT_RSP))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,209 CS:[2] OP:0x9761fb36 reply(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_PREACCEPT_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,210 CS:[2] OP:0xe62230f2 reply(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_PREACCEPT_REQ), (from:/127.0.0.2:7012, type:REQUEST_RESPONSE verb:ACCORD_PREACCEPT_RSP))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,211 CS:[2] OP:0xc5563e5d send(/127.0.0.1:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,212 CS:[2] OP:0xc5563e5d send(/127.0.0.2:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,212 CS:[2] OP:0xc5563e5d send(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,329 CS:[2] OP:0xa3e62850 send(/127.0.0.1:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,329 CS:[2] OP:0xa3e62850 send(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,329 CS:[2] OP:0xa3e62850 send(/127.0.0.3:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,334 CS:[2] OP:0xf8562cfb reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_RECOVER_RSP))
      [CommandStore[2]:1]   node1 2023-04-03 12:54:30,338 CS:[2] OP:0xcfd2540f reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ), (from:/127.0.0.1:7012, type:REQUEST_RESPONSE verb:ACCORD_RECOVER_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,338 CS:[2] OP:0xc4cf5af8 reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_RECOVER_REQ), (from:/127.0.0.2:7012, type:REQUEST_RESPONSE verb:ACCORD_RECOVER_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,340 CS:[2] OP:0x9e4f00f0 send(/127.0.0.1:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,340 CS:[2] OP:0x9e4f00f0 send(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,340 CS:[2] OP:0x9e4f00f0 send(/127.0.0.3:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,343 CS:[2] OP:0xb60153ab reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_GET_DEPS_RSP))
      [CommandStore[2]:1]   node1 2023-04-03 12:54:30,344 CS:[2] OP:0xac20a1d6 reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ), (from:/127.0.0.1:7012, type:REQUEST_RESPONSE verb:ACCORD_GET_DEPS_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,345 CS:[2] OP:0xa73f7484 reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_GET_DEPS_REQ), (from:/127.0.0.2:7012, type:REQUEST_RESPONSE verb:ACCORD_GET_DEPS_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,347 CS:[2] OP:0xfc37fb1a send(/127.0.0.1:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,347 CS:[2] OP:0xfc37fb1a send(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,347 CS:[2] OP:0xfc37fb1a send(/127.0.0.3:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,349 CS:[2] OP:0xff574276 Performing read
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,349 CS:[2] OP:0xff574276 Performing read
      [ReadStage:1]         node3 2023-04-03 12:54:30,351 Performing read; post
      [ReadStage:1]         node3 2023-04-03 12:54:30,351 Performing read; post
      [ReadStage:1]         node3 2023-04-03 12:54:30,351 Performing read; pre
      [ReadStage:1]         node3 2023-04-03 12:54:30,351 Performing read; pre
      [ReadStage:1]         node3 2023-04-03 12:54:30,351 reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_READ_RSP))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,359 Performing coordinated write
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,359 send(/127.0.0.1:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,359 send(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,360 send(/127.0.0.3:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,363 CS:[2] OP:0x8bdb6795 Performing read
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,363 CS:[2] OP:0x8bdb6795 Performing read
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,364 CS:[2] OP:0x92e94460 Performing write
      [MutationStage:4]     node3 2023-04-03 12:54:30,364 Performing write: pre
      [MutationStage:4]     node3 2023-04-03 12:54:30,365 Performing write: post
      [ReadStage:1]         node3 2023-04-03 12:54:30,365 Performing read; post
      [ReadStage:1]         node3 2023-04-03 12:54:30,365 Performing read; pre
      [ReadStage:1]         node3 2023-04-03 12:54:30,369 Performing read; post
      [ReadStage:1]         node3 2023-04-03 12:54:30,369 Performing read; pre
      [ReadStage:1]         node3 2023-04-03 12:54:30,369 reply(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_COMMIT_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_READ_RSP))
      [CommandStore[2]:1]   node1 2023-04-03 12:54:30,370 CS:[2] OP:0xa59dc286 Performing write
      [CommandStore[2]:1]   node2 2023-04-03 12:54:30,374 CS:[2] OP:0xab0f3ca4 Performing write
      [MutationStage:1]     node2 2023-04-03 12:54:30,374 Performing write: pre
      [MutationStage:1]     node2 2023-04-03 12:54:30,375 Performing write: post
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,376 Performing coordinated write
      [MutationStage:3]     node1 2023-04-03 12:54:30,376 Performing write: pre
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,377 send(/127.0.0.1:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,377 send(/127.0.0.2:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,377 send(/127.0.0.3:7012, (from:/127.0.0.3:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ))
      [CommandStore[2]:1]   node3 2023-04-03 12:54:30,382 reply(/127.0.0.2:7012, (from:/127.0.0.2:7012, type:IMMEDIATE verb:ACCORD_APPLY_REQ), (from:/127.0.0.3:7012, type:REQUEST_RESPONSE verb:ACCORD_APPLY_RSP))
      [MutationStage:3]     node1 2023-04-03 12:54:30,382 Performing write: post
      

      (The transaction has a returning select and an auto-read, which is why there are double logs for reads)

      Here we see the following timing

      T00 node3 starts txn
      T01 node3 sends COMMIT
      T02 node2 starts recover
      T03 all nodes ack to the pending recover
      T04 node2 sends COMMIT
      T05 node3 performs reads needed for txn
      T06 node3 sends read results to node2
      T07 node2 performs write locally and send APPLY
      T08 node3 performs write
      T09 node3 performs reads needed for txn
      T10 node3 send reads to node3
      T11 node3 performs write and sends APPLY
      T12 node3 ACKs APPLY to node2
      

      Given the fact the simulator got a response back and didn’t get a preempt, this implies that the original coordinator was able to complete the full transaction without issues and reply back, but the reads/writes were interleaved between node3 and node2 causing the second write to observe the first write

      Attachments

        Activity

          People

            dcapwell David Capwell
            dcapwell David Capwell
            David Capwell
            Ariel Weisberg
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1.5h
                1.5h