[CASSANDRA-10477] java.lang.AssertionError in StorageProxy.submitHint - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 2.1.13, 2.2.5, 3.0.3, 3.3
Component/s: Legacy/Local Write-Read Paths
Labels: None
Environment: CentOS 6, Oracle JVM 1.8.0_45
Severity: Normal

Description

A few days after upgrading from 2.0.15 to 2.1.9, the following error appeared in the logs on 2 of our 5 machines:

ERROR [EXPIRING-MAP-REAPER:1] 2015-10-07 17:01:08,041 CassandraDaemon.java:223 - Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main]
java.lang.AssertionError: /192.168.11.88
        at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:949) ~[apache-cassandra-2.1.9.jar:2.1.9]
        at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:383) ~[apache-cassandra-2.1.9.jar:2.1.9]
        at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:363) ~[apache-cassandra-2.1.9.jar:2.1.9]
        at org.apache.cassandra.utils.ExpiringMap$1.run(ExpiringMap.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
        at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-2.1.9.jar:2.1.9]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]

192.168.11.88 is the broadcast address of the local machine.
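The text after `java.lang.AssertionError:` in the trace is the value attached to the failed assertion, which here is the hint target address. The following is a minimal sketch of how a guard of that kind produces exactly this message. It assumes (this report does not confirm it) that `submitHint` refuses the node's own broadcast address as a hint target. `HintGuardSketch`, `localBroadcast` and `submitHintSketch` are hypothetical names for illustration and are not Cassandra code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HintGuardSketch {
    // Hypothetical stand-in for the node's own broadcast address.
    static InetAddress localBroadcast() throws UnknownHostException {
        // A literal IP is parsed directly; no DNS lookup is performed.
        return InetAddress.getByName("192.168.11.88");
    }

    // Assumed guard: a hint must never be addressed to the local node.
    // An explicit throw is used so the sketch behaves the same with or without -ea.
    static void submitHintSketch(InetAddress target) throws UnknownHostException {
        if (target.equals(localBroadcast())) {
            // AssertionError(Object) uses String.valueOf(target) as its message.
            throw new AssertionError(target);
        }
        // A real implementation would queue the hint for the remote target here.
    }

    public static void main(String[] args) throws Exception {
        // Remote target: accepted silently.
        submitHintSketch(InetAddress.getByName("192.168.11.89"));

        // Local target: the guard trips.
        try {
            submitHintSketch(InetAddress.getByName("192.168.11.88"));
        } catch (AssertionError e) {
            // prints "java.lang.AssertionError: /192.168.11.88"
            System.out.println("java.lang.AssertionError: " + e.getMessage());
        }
    }
}
```

The leading slash comes from `InetAddress.toString()`, which prints the host name (empty here) followed by `/` and the literal IP. If the assumption holds, the error means a write addressed to the local node timed out in the messaging layer and was then handed to the hint path.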

When this is logged, read request latency across the whole cluster degrades from 6 ms/op to more than 100 ms/op according to OpsCenter, and clients see many timeouts. We have to restart the affected Cassandra node to restore normal read latency. Write latency does not appear to be affected.

Disabling hinted handoff with nodetool disablehandoff only stops the assertion from being logged; after some time the read latency degrades again. Restarting the node on which hinted handoff was disabled restores normal read latency.

Issue Links

is broken by: CASSANDRA-7342 CAS writes do not have hint functionality. (Resolved)

People

Assignee: Ariel Weisberg

Reporter: Severin Leonhardt

Authors: Ariel Weisberg

Reviewers: Sylvain Lebresne

Votes: 3

Watchers: 10

Dates

Created: 08/Oct/15 07:57

Updated: 16/Apr/19 09:30

Resolved: 13/Jan/16 10:38