Cassandra / CASSANDRA-10477

java.lang.AssertionError in StorageProxy.submitHint


Details

    • Normal

    Description

      A few days after updating from 2.0.15 to 2.1.9 we have the following log entry on 2 of 5 machines:

      ERROR [EXPIRING-MAP-REAPER:1] 2015-10-07 17:01:08,041 CassandraDaemon.java:223 - Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main]
      java.lang.AssertionError: /192.168.11.88
              at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:949) ~[apache-cassandra-2.1.9.jar:2.1.9]
              at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:383) ~[apache-cassandra-2.1.9.jar:2.1.9]
              at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:363) ~[apache-cassandra-2.1.9.jar:2.1.9]
              at org.apache.cassandra.utils.ExpiringMap$1.run(ExpiringMap.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
              at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-2.1.9.jar:2.1.9]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
              at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
      

      192.168.11.88 is the broadcast address of the local machine.
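
One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the check at StorageProxy.java:949 rejects a hint whose target is the node's own broadcast address, and the assertion message is simply that address printed via `InetAddress.toString()`, which would explain the leading slash in `/192.168.11.88`. The sketch below reproduces only that message format. `HintTargetCheck`, `checkHintTarget`, `ip`, and `describe` are hypothetical names for illustration and are not Cassandra APIs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HintTargetCheck {

    /** Parses an IP literal; no DNS lookup is performed for literals. */
    static InetAddress ip(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(literal, e);
        }
    }

    /**
     * Hypothetical stand-in for the guard assumed to sit in submitHint:
     * a hint must never be addressed to the local node itself.
     */
    static void checkHintTarget(InetAddress target, InetAddress localBroadcast) {
        if (target.equals(localBroadcast)) {
            // The address becomes the error's detail message.
            throw new AssertionError(target);
        }
    }

    /** Returns the text a logger would show for a failure, or "ok". */
    static String describe(String target, String localBroadcast) {
        try {
            checkHintTarget(ip(target), ip(localBroadcast));
            return "ok";
        } catch (AssertionError e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Target equals the local broadcast address, as in the report.
        System.out.println(describe("192.168.11.88", "192.168.11.88"));
        // prints "java.lang.AssertionError: /192.168.11.88"

        // Target is a different node (address chosen for illustration).
        System.out.println(describe("192.168.11.89", "192.168.11.88"));
        // prints "ok"
    }
}
```

If this reading holds, the open question is why an expired message callback on the EXPIRING-MAP-REAPER thread carries the local address as its target; the sketch does not answer that.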

      When this is logged, the read request latency of the whole cluster degrades from about 6 ms/op to more than 100 ms/op according to OpsCenter, and clients get a lot of timeouts. We need to restart the affected Cassandra node to get back to normal read latencies. Write latency does not seem to be affected.

      Disabling hinted handoff using nodetool disablehandoff only prevents the assertion from being logged; at some point the read latency still degrades again. Restarting the node where hinted handoff was disabled restores normal read latency.


          People

            aweisberg Ariel Weisberg
            leonhardt Severin Leonhardt
            Ariel Weisberg
            Sylvain Lebresne
            Votes: 3
            Watchers: 12
