Details
- Bug
- Status: Resolved
- Normal
- Resolution: Fixed
- None
- Correctness - API / Semantic Implementation
- Normal
- Challenging
- Adhoc Test
Description
When LifecycleTransaction is used in a multi-threaded context, we encounter this exception:
java.util.ConcurrentModificationException: null
at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
at java.lang.Iterable.forEach(Iterable.java:74)
at org.apache.cassandra.db.lifecycle.LogReplicaSet.maybeCreateReplica(LogReplicaSet.java:78)
at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(LogFile.java:320)
at org.apache.cassandra.db.lifecycle.LogFile.add(LogFile.java:285)
at org.apache.cassandra.db.lifecycle.LogTransaction.trackNew(LogTransaction.java:136)
at org.apache.cassandra.db.lifecycle.LifecycleTransaction.trackNew(LifecycleTransaction.java:529)
During streaming we create a reference to a LifecycleTransaction and share it between threads.
It is used in a multi-threaded context inside CassandraIncomingFile, which is an IncomingStreamMessage, and these messages are deserialized in parallel.
LifecycleTransaction is not meant to be used in a multi-threaded context, and sharing the object this way leads to streaming failures. On trunk, it is shared across all threads that transfer sstables in parallel for the given TableId in a StreamSession. There are three options to solve this. The first is to make LifecycleTransaction and its associated objects thread safe. The second is to scope the transaction to a single CassandraIncomingFile; the consequence is that if streaming fails we may leave redundant SSTables on disk, which is acceptable because compaction should clean them up. The third is to synchronize access in the streaming infrastructure.
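To illustrate the option of synchronizing access, here is a minimal sketch, not the actual patch. SketchTransaction, SynchronizedTransaction and StreamingSketch are hypothetical names invented for this example; SketchTransaction only stands in for a non-thread-safe object that iterates and then mutates a LinkedHashMap, which is the pattern visible in the stack trace above. It does not use any Cassandra classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a non-thread-safe transaction object.
// This is NOT Cassandra's LifecycleTransaction.
class SketchTransaction {
    private final Map<String, Boolean> replicas = new LinkedHashMap<>();

    // Iterates over the map and then mutates it. Calling this from several
    // threads without a lock can fail with ConcurrentModificationException.
    void trackNew(String sstable) {
        replicas.keySet().forEach(k -> { /* inspect existing entries */ });
        replicas.put(sstable, Boolean.TRUE);
    }

    int tracked() {
        return replicas.size();
    }
}

// Assumed shape of the "synchronize access" option: every call goes
// through one monitor, so only one thread touches the delegate at a time.
class SynchronizedTransaction {
    private final SketchTransaction delegate = new SketchTransaction();

    synchronized void trackNew(String sstable) {
        delegate.trackNew(sstable);
    }

    synchronized int tracked() {
        return delegate.tracked();
    }
}

class StreamingSketch {
    // Simulates several threads receiving files for one shared transaction
    // and returns how many sstables were tracked in total.
    static int receiveInParallel(int threads, int filesPerThread) {
        SynchronizedTransaction txn = new SynchronizedTransaction();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int f = 0; f < filesPerThread; f++) {
                    txn.trackNew("sstable-" + id + "-" + f);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return txn.tracked();
    }

    public static void main(String[] args) {
        System.out.println(receiveInParallel(4, 1000));
    }
}
```

With the lock in place, every one of the 4 x 1000 distinct names is tracked exactly once. The trade-off is that all receiving threads contend on a single monitor, whereas scoping the transaction to a single CassandraIncomingFile avoids sharing altogether.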
Issue Links
- relates to
- CASSANDRA-16225 Followup CASSANDRA-14554
- Resolved