Cassandra / CASSANDRA-14554

Streaming needs to synchronise access to LifecycleTransaction

    Details

    • Bug Category:
      Correctness - API / Semantic Implementation
    • Severity:
      Normal
    • Complexity:
      Challenging
    • Discovered By:
      Adhoc Test

      Description

      When LifecycleTransaction is used in a multi-threaded context, we encounter this exception:

      java.util.ConcurrentModificationException: null
      at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
      at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
      at java.lang.Iterable.forEach(Iterable.java:74)
      at org.apache.cassandra.db.lifecycle.LogReplicaSet.maybeCreateReplica(LogReplicaSet.java:78)
      at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(LogFile.java:320)
      at org.apache.cassandra.db.lifecycle.LogFile.add(LogFile.java:285)
      at org.apache.cassandra.db.lifecycle.LogTransaction.trackNew(LogTransaction.java:136)
      at org.apache.cassandra.db.lifecycle.LifecycleTransaction.trackNew(LifecycleTransaction.java:529)

      During streaming we create a reference to a LifecycleTransaction and share it between threads:

      https://github.com/apache/cassandra/blob/5cc68a87359dd02412bdb70a52dfcd718d44a5ba/src/java/org/apache/cassandra/db/streaming/CassandraStreamReader.java#L156

      This reference is used in a multi-threaded context inside CassandraIncomingFile, which is an IncomingStreamMessage. These messages are deserialized in parallel.

      LifecycleTransaction is not meant to be used in a multi-threaded context, and sharing one instance leads to streaming failures. On trunk, this object is shared across all threads that transfer sstables in parallel for a given TableId in a StreamSession. There are three options to solve this: make LifecycleTransaction and its associated objects thread-safe; scope the transaction to a single CassandraIncomingFile; or synchronize access in the streaming infrastructure. The consequence of the second option is that a streaming failure may leave redundant SSTables on disk. This is acceptable, as compaction should clean them up.
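
      To illustrate the third option, here is a minimal sketch of serialising access to a shared, non-thread-safe object. It is not Cassandra code: FakeTransaction, SynchronizedTransaction and runConcurrently are hypothetical names, and FakeTransaction only mimics the iterate-then-mutate pattern over a LinkedHashMap that appears in the stack trace above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronizedTxnSketch {

    // Hypothetical stand-in for a transaction that is NOT thread safe:
    // it iterates a LinkedHashMap and then mutates it.
    static final class FakeTransaction {
        private final Map<String, Boolean> tracked = new LinkedHashMap<>();

        void trackNew(String sstable) {
            for (String existing : tracked.keySet()) {
                if (existing.equals(sstable)) {
                    return; // already tracked
                }
            }
            tracked.put(sstable, Boolean.TRUE);
        }

        int size() {
            return tracked.size();
        }
    }

    // Hypothetical wrapper: every call goes through the wrapper's monitor,
    // so only one thread touches the delegate at a time.
    static final class SynchronizedTransaction {
        private final FakeTransaction delegate = new FakeTransaction();

        synchronized void trackNew(String sstable) {
            delegate.trackNew(sstable);
        }

        synchronized int size() {
            return delegate.size();
        }
    }

    // Simulates several streaming threads registering new sstables
    // against one shared transaction; returns the number tracked.
    static int runConcurrently(int threads, int filesPerThread) {
        SynchronizedTransaction txn = new SynchronizedTransaction();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                for (int i = 0; i < filesPerThread; i++) {
                    txn.trackNew("sstable-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return txn.size();
    }

    public static void main(String[] args) {
        // 4 threads x 500 unique names each
        System.out.println(runConcurrently(4, 500)); // prints 2000
    }
}
```

      If the workers called FakeTransaction directly, the same run could fail intermittently with ConcurrentModificationException or lose entries. The trade-off of this approach is that tracking calls are serialised, so threads contend on a single lock.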

              People

              • Assignee:
                stefania Stefania Alborghetti
                Reporter:
                djoshi Dinesh Joshi
                Authors:
                Stefania Alborghetti
                Reviewers:
                Benedict Elliott Smith, Robert Stupp
              • Votes:
                0
                Watchers:
                8
