Cassandra / CASSANDRA-14867

Histogram overflows potentially leading to writes failing

Details

    • Bug
    • Status: Open
    • Normal
    • Resolution: Unresolved
    • 5.x
    • None
    • Cassandra 3.11.1 on Ubuntu 16.04

    • Correctness
    • Normal
    • Normal

    Description

      I observed the following in the Cassandra logs on one host of a 6-node cluster:

      ERROR [ScheduledTasks:1] 2018-11-01 17:26:41,277 CassandraDaemon.java:228 - Exception in thread Thread[ScheduledTasks:1,5,main]
      java.lang.IllegalStateException: Unable to compute when histogram overflowed
       at org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot.getMean(DecayingEstimatedHistogramReservoir.java:472) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at org.apache.cassandra.net.MessagingService.getDroppedMessagesLogs(MessagingService.java:1263) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at org.apache.cassandra.net.MessagingService.logDroppedMessages(MessagingService.java:1236) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at org.apache.cassandra.net.MessagingService.access$200(MessagingService.java:87) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at org.apache.cassandra.net.MessagingService$4.run(MessagingService.java:507) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-3.11.1.jar:3.11.1]
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_172]
       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_172]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_172]
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_172]
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_172]
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_172]
       at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.1.jar:3.11.1]
       at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_172]
      

      At the same time, this node was failing all writes issued to it. Restarting Cassandra on the node returned the cluster to a healthy state, and we stopped seeing the histogram overflow errors.

      Has this issue been observed before? Could the histogram overflows cause writes to fail?
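
      To illustrate the failure mode in the stack trace, here is a minimal sketch of a bucketed histogram whose last slot is an overflow bucket and whose getMean() refuses to answer once that slot is non-empty. This is not the Cassandra implementation: the class name OverflowingHistogramSketch, the bucket offsets, and the bucket-upper-bound mean estimate are all assumptions made for the example. Only the exception type and message are taken from the log above.

```java
import java.util.Arrays;

// Hypothetical, simplified histogram; NOT DecayingEstimatedHistogramReservoir.
public class OverflowingHistogramSketch {
    // Sorted, inclusive upper bounds of the regular buckets (assumed values).
    private final long[] bucketOffsets;
    // One counter per regular bucket plus a final overflow counter.
    private final long[] counts;

    public OverflowingHistogramSketch(long... sortedBucketOffsets) {
        this.bucketOffsets = sortedBucketOffsets.clone();
        this.counts = new long[sortedBucketOffsets.length + 1];
    }

    // Record a value in the first bucket whose upper bound is >= value.
    // Values larger than every bound land in the overflow slot.
    public void update(long value) {
        int index = Arrays.binarySearch(bucketOffsets, value);
        if (index < 0) {
            index = -index - 1; // insertion point
        }
        counts[index]++;
    }

    public boolean isOverflowed() {
        return counts[counts.length - 1] > 0;
    }

    // Estimate the mean from bucket upper bounds. Once any value has
    // overflowed, no bound is known for it, so the estimate is refused.
    public double getMean() {
        if (isOverflowed()) {
            throw new IllegalStateException("Unable to compute when histogram overflowed");
        }
        long elements = 0;
        long sum = 0;
        for (int i = 0; i < bucketOffsets.length; i++) {
            elements += counts[i];
            sum += counts[i] * bucketOffsets[i];
        }
        return elements == 0 ? 0.0 : (double) sum / elements;
    }

    public static void main(String[] args) {
        OverflowingHistogramSketch h = new OverflowingHistogramSketch(10, 100, 1000);
        h.update(5);
        h.update(50);
        System.out.println("mean before overflow: " + h.getMean());

        h.update(5000); // larger than the last bucket bound
        try {
            h.getMean();
        } catch (IllegalStateException e) {
            System.out.println("after overflow: " + e.getMessage());
        }
    }
}
```

      In this sketch a single out-of-range sample is enough to make every later getMean() call throw until the counters are reset, which would be consistent with the errors disappearing after a restart. Whether the same overflow is related to the failed writes is the open question of this report.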

            People

              Assignee: Unassigned
              Reporter: Zbarsky David