CASSANDRA-11891

WriteTimeout during commit log replay due to MV lock

Details

    • Critical

    Description

      During commit log replay, if there are materialized views, contention on the MV lock can cause a WriteTimeoutException. This makes commit log replay fail, which prevents the node from starting up. In practice, the operator then has to move the commit log segments out of the way to avoid replay.

      Here's a stacktrace of this happening on 3.0.5:

      ERROR [main] 2016-05-25 15:10:31,120 CassandraDaemon.java:692 - Exception encountered during startup
      java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      	at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:50) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:372) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:624) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:511) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:406) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:153) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) [apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) [apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) [apache-cassandra-3.0.5.jar:3.0.5]
      Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.get(AbstractLocalAwareExecutorService.java:200) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:365) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	... 9 common frames omitted
      	Suppressed: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      		... 11 common frames omitted
      	Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      		at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:431)
      		at org.apache.cassandra.db.Keyspace.lambda$apply$62(Keyspace.java:443)
      		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      		at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
      		at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
      		at java.lang.Thread.run(Thread.java:745)
      

      We should ignore the write_rpc_timeout setting while acquiring MV locks if we're on the commitlog replay path.
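
To illustrate the proposal, here is a minimal, hypothetical sketch of a lock acquisition that enforces a deadline on the normal write path but keeps waiting during commit log replay. It is not the actual Cassandra code or the eventual patch: the class `MvLockSketch`, the method `acquire`, the `isCommitLogReplay` flag, and the 10 ms polling interval are all assumptions made for illustration, and `timeoutMillis` merely stands in for the write timeout setting.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;

/** Hypothetical sketch only; names and structure are not taken from Cassandra. */
final class MvLockSketch {
    private MvLockSketch() {}

    /**
     * Acquires the given lock, polling in short intervals.
     *
     * On the normal write path (isCommitLogReplay == false) this gives up with a
     * TimeoutException once timeoutMillis has elapsed. On the replay path the
     * timeout is ignored and the caller waits until the lock becomes available,
     * so lock contention alone can no longer abort startup.
     */
    static void acquire(Lock lock, long timeoutMillis, boolean isCommitLogReplay)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        // Assumed polling interval of 10 ms between acquisition attempts.
        while (!lock.tryLock(10, TimeUnit.MILLISECONDS)) {
            boolean expired = System.nanoTime() - deadline >= 0;
            if (!isCommitLogReplay && expired) {
                throw new TimeoutException(
                        "could not acquire MV lock within " + timeoutMillis + " ms");
            }
            // During replay: fall through and keep retrying regardless of the deadline.
        }
    }
}
```

With this shape, a replaying node would block on a contended lock rather than fail, while regular writes would still time out as before. The caller remains responsible for releasing the lock once the mutation has been applied.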

          People

            thobbs Tom Hobbs
            T Jake Luciani