Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-11891

WriteTimeout during commit log replay due to MV lock

    XMLWordPrintableJSON

    Details

    • Severity:
      Critical

      Description

      During commit log replay, if there are materialized views, it's possible for contention on the MV lock to cause a WriteTimeoutException. This makes commit log replay fail, which of course prevents the node from starting up. This generally means that the operator has to move the commitlog segments to avoid replay.

      Here's a stacktrace of this happening on 3.0.5:

      ERROR [main] 2016-05-25 15:10:31,120 CassandraDaemon.java:692 - Exception encountered during startup
      java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      	at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:50) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:372) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:624) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:511) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:406) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:153) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) [apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) [apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) [apache-cassandra-3.0.5.jar:3.0.5]
      Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.get(AbstractLocalAwareExecutorService.java:200) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:365) ~[apache-cassandra-3.0.5.jar:3.0.5]
      	... 9 common frames omitted
      	Suppressed: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      		... 11 common frames omitted
      	Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
      		at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:431)
      		at org.apache.cassandra.db.Keyspace.lambda$apply$62(Keyspace.java:443)
      		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      		at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
      		at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
      		at java.lang.Thread.run(Thread.java:745)
      

      We should ignore the write_rpc_timeout setting while acquiring MV locks if we're on the commitlog replay path.

        Attachments

          Activity

            People

            • Assignee:
              thobbs Tom Hobbs
              Reporter:
              thobbs Tom Hobbs
              Authors:
              Tom Hobbs
              Reviewers:
              T Jake Luciani
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: