CASSANDRA-6948

After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and cannot be restarted, causing callbacks to collect indefinitely

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.0.7, 2.1 beta2
    • None
    • None
    • Normal

    Description

      Since ExpiringMap.shutdown() shuts down the static executor service, the reaper cannot be restarted (and in fact reset() makes no attempt to do so). As a result, callbacks that receive no response are never removed from the map, and eventually the server will either run out of memory or loop around the integer space and start reusing message IDs that have not been expired, causing assertions to be thrown and messages to fail to be sent. It appears that this situation only arises on bootstrap or node replacement, as MessagingService is shut down before being attached to the listen address.
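
      To illustrate why a shut-down static executor cannot simply be reused, here is a minimal sketch using plain java.util.concurrent. The class name ReaperShutdownSketch and the REAPER field are hypothetical stand-ins, not Cassandra's actual code, and Cassandra's own executor may handle rejected tasks differently (for example by dropping them silently rather than throwing).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ReaperShutdownSketch {
    // Hypothetical stand-in for a static reaper executor shared by all maps.
    private static final ScheduledExecutorService REAPER =
            Executors.newSingleThreadScheduledExecutor();

    /** Returns true if scheduling a task after shutdown() is rejected. */
    static boolean rejectedAfterShutdown() {
        REAPER.shutdown(); // a terminal state: the executor has no restart method
        try {
            // Any later attempt to schedule the periodic expiry task fails.
            REAPER.scheduleWithFixedDelay(() -> { }, 1, 1, TimeUnit.SECONDS);
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectedAfterShutdown());
    }
}
```

      Because the executor is static, the only way to get a working reaper back in this sketch would be to create a new executor instance, which is the step the description says reset() does not attempt.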

      This can cause the following errors to appear in the log:

      ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 ErrorMessage.java (line 222) Unexpected exception during request
      java.lang.AssertionError: Callback already exists for id -1665979622! (CallbackInfo(target=/10.106.160.84, callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
      at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
      at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
      at org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
      at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
      at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
      at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
      at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
      at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
      at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
      at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
      at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
      at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
      at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
      ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 CassandraDaemon.java (line 196) Exception in thread Thread[ReplicateOnWriteStage:102766,5,main]
      java.lang.AssertionError: Callback already exists for id -1665979620! (CallbackInfo(target=/10.106.160.84, callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
      at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
      at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
      at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
      at org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
      at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
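
      The "Callback already exists for id" assertions above are consistent with message IDs wrapping around the integer space while old callbacks are still present. The following sketch shows that collision in isolation. MessageIdWrapSketch, its String callbacks, and the seeded "stale" entry are all hypothetical simplifications, not the MessagingService implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class MessageIdWrapSketch {
    private final AtomicInteger idGen;
    // Stands in for the callback map; nothing ever expires entries here,
    // mimicking a map whose reaper is no longer running.
    private final ConcurrentMap<Integer, String> callbacks = new ConcurrentHashMap<>();

    MessageIdWrapSketch(int lastIssuedId) {
        this.idGen = new AtomicInteger(lastIssuedId);
    }

    /** Registers a callback under the next id; returns false if that id is still occupied. */
    boolean register(String callback) {
        int id = idGen.incrementAndGet(); // overflows from MAX_VALUE to MIN_VALUE
        return callbacks.putIfAbsent(id, callback) == null;
    }

    /** Returns true if a new registration collides with a callback that was never expired. */
    static boolean collidesAfterWrap() {
        MessageIdWrapSketch m = new MessageIdWrapSketch(Integer.MAX_VALUE);
        // Pretend this callback was registered long ago and never reaped.
        m.callbacks.put(Integer.MIN_VALUE, "stale callback");
        // The next id wraps to Integer.MIN_VALUE and hits the stale entry.
        return !m.register("new callback");
    }

    public static void main(String[] args) {
        System.out.println("collision after wrap: " + collidesAfterWrap());
    }
}
```

      With a working reaper, the stale entry would have been removed once its timeout elapsed, so the wrapped ID would be free to reuse.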

      Attachments

        1. system.log.gz (152 kB, Keith Wright)
        2. system.log.1.gz (363 kB, Keith Wright)
        3. Screen Shot 2014-03-28 at 11.29.24 AM.png (32 kB, Keith Wright)
        4. Screen Shot 2014-03-28 at 11.27.56 AM.png (37 kB, Keith Wright)
        5. logs.tar.gz (7.47 MB, Keith Wright)
        6. logs.old.tar.gz (7.86 MB, Keith Wright)
        7. cassandra.yaml (32 kB, Keith Wright)
        8. cassandra.log.min (4.47 MB, Keith Wright)
        9. 6948-v2.txt (4 kB, Brandon Williams)
        10. 6948.txt (7 kB, Benedict Elliott Smith)
        11. 6948.debug.txt (4 kB, Benedict Elliott Smith)

          People

            brandon.williams Brandon Williams
            keithwrightbos Keith Wright
            Brandon Williams
            Benedict Elliott Smith