  1. Cassandra
  2. CASSANDRA-15667

StreamResultFuture check for completeness is inconsistent, leading to races


    Details

    • Bug Category:
      Correctness
    • Severity:
      Normal
    • Complexity:
      Normal
    • Discovered By:
      Adhoc Test
    • Platform:
      All
    • Impacts:
      None
    • Since Version:
    • Test and Documentation Plan:

      https://app.circleci.com/pipelines/github/maxtomassi/cassandra?branch=15667-4.0

      The JVM dtests do not seem to run properly. The logs contain many errors like this:

      [junit-timeout] Testcase: prepareRPCTimeout[PARALLEL/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):	Caused an ERROR
      [junit-timeout] org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
      [junit-timeout] java.lang.NoSuchMethodError: org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
      [junit-timeout] 	at org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
      [junit-timeout] 	at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
      [junit-timeout] 	at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
      [junit-timeout] 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      [junit-timeout] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      [junit-timeout] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      [junit-timeout] 	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
      [junit-timeout] 	at java.lang.Thread.run(Thread.java:748)
      

      Description

      StreamResultFuture#maybeComplete() uses StreamCoordinator#hasActiveSessions() to determine whether all sessions have completed, but then reads each session's state via StreamCoordinator#getAllSessionInfo(). This is inconsistent: the former relies on the actual StreamSession state, while the latter relies on the SessionInfo state, and the two are updated concurrently with no coordination whatsoever.

      This leads to races, which show up as spurious dtest failures such as TestBootstrap.resumable_bootstrap_test in CASSANDRA-15614 (cc Ekaterina Dimitrova).
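
      The kind of inconsistency described above can be illustrated with a minimal, self-contained sketch. It does not use Cassandra's classes and is not the actual fix: Coordinator, State, liveState, infoState, inconsistentCheck and consistentCheck are hypothetical names, where liveState stands in for the StreamSession state and infoState for the SessionInfo state. It simulates one possible interleaving in which the two sources disagree, and shows that deciding completion and outcome from a single snapshot avoids the mismatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: hypothetical stand-ins, not Cassandra's streaming classes.
public class ConsistentCompletionSketch {
    enum State { STREAMING, COMPLETE, FAILED }

    static final class Coordinator {
        // Source A: live per-session state (stand-in for the StreamSession state).
        final Map<Integer, State> liveState = new ConcurrentHashMap<>();
        // Source B: separately updated summary (stand-in for the SessionInfo state).
        final Map<Integer, State> infoState = new ConcurrentHashMap<>();

        boolean hasActiveSessions() {
            return liveState.containsValue(State.STREAMING);
        }

        List<State> snapshotInfo() {
            return new ArrayList<>(infoState.values());
        }
    }

    // Inconsistent: "is everything done?" is answered from source A,
    // but the outcome is then read from source B.
    static String inconsistentCheck(Coordinator c) {
        if (c.hasActiveSessions()) return "PENDING";
        return c.snapshotInfo().contains(State.FAILED) ? "FAILED" : "SUCCESS";
    }

    // Consistent: both questions are answered from one snapshot of one source.
    static String consistentCheck(Coordinator c) {
        List<State> snapshot = c.snapshotInfo();
        if (snapshot.contains(State.STREAMING)) return "PENDING";
        return snapshot.contains(State.FAILED) ? "FAILED" : "SUCCESS";
    }

    public static void main(String[] args) {
        Coordinator c = new Coordinator();
        // Simulated race window: the live state already says FAILED,
        // while the summary has not been updated yet.
        c.liveState.put(1, State.FAILED);
        c.infoState.put(1, State.STREAMING);

        System.out.println("inconsistent: " + inconsistentCheck(c));
        System.out.println("consistent:   " + consistentCheck(c));
    }
}
```

      In this simulated window the inconsistent check reports SUCCESS for a session that has failed, while the single-snapshot check still reports PENDING and would only resolve once the summary catches up.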

        Attachments

        1. log_bootstrap_resumable
          63 kB
          Ekaterina Dimitrova

              People

              • Assignee:
                maxtomassi Massimiliano Tomassi
                Reporter:
                sbtourist Sergio Bossa
                Authors:
                Massimiliano Tomassi
                Reviewers:
                Sergio Bossa, Zhao Yang
              • Votes:
                0
                Watchers:
                5

                Dates

                • Created:
                  Updated:
                  Resolved: