[CASSANDRA-9793] Log when messages are dropped due to cross_node_timeout - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 2.0.17, 2.1.9, 2.2.1
Component/s: Legacy/Streaming and Messaging
Labels: None

Description

When a node has clock skew and cross node timeouts are enabled, there's no indication that messages were dropped due to the cross node timeout, just that messages were dropped. This can mistakenly lead you down the path of troubleshooting a load shedding situation when really you just have clock drift on one node. It is also not simple to troubleshoot, since you have to determine that this node will answer requests, but other nodes won't answer requests from it. If the problem goes away on a reboot (and the machine does a one-shot time sync, not a continuous one), it becomes even harder to detect because you're left with a weird piece of evidence such as "it's fine after a reboot, but comes back in about X days every time."

It would help tremendously if there were a log message indicating how many messages (they don't need to be broken down by type) were eagerly dropped due to the cross node timeout.
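To illustrate the request, here is a minimal sketch in Python (not Cassandra's actual Java code) of how drops could be counted separately depending on which clock triggered them. The names `DropTracker`, `should_drop`, and `summary`, the timestamp parameters, and the wording of the log line are all hypothetical and chosen only for illustration; the sketch assumes that with cross node timeouts enabled a message's age is measured from the sender's timestamp, so clock skew between nodes counts as age.

```python
from dataclasses import dataclass


@dataclass
class DropTracker:
    """Hypothetical counter that separates local-timeout drops from cross-node drops."""

    timeout_ms: int
    cross_node_timeout: bool = True  # assumed to mirror the cross node timeout setting
    dropped_internal: int = 0
    dropped_cross_node: int = 0

    def should_drop(self, sender_sent_ms: int, local_received_ms: int, now_ms: int) -> bool:
        """Return True if the message has expired, recording which clock caused the drop."""
        # Age measured purely on this node's clock: immune to skew between nodes.
        if now_ms - local_received_ms > self.timeout_ms:
            self.dropped_internal += 1
            return True
        # Age measured from the sender's timestamp: a skewed sender clock inflates it.
        if self.cross_node_timeout and now_ms - sender_sent_ms > self.timeout_ms:
            self.dropped_cross_node += 1
            return True
        return False

    def summary(self) -> str:
        """Build a hypothetical log line with the totals, not broken down by message type."""
        total = self.dropped_internal + self.dropped_cross_node
        return (
            f"{total} messages dropped: {self.dropped_internal} for internal timeout, "
            f"{self.dropped_cross_node} for cross node timeout"
        )


if __name__ == "__main__":
    tracker = DropTracker(timeout_ms=2000)
    # The sender's clock is about 5 s behind this node, so a fresh message already looks expired.
    tracker.should_drop(sender_sent_ms=0, local_received_ms=5000, now_ms=5010)
    # This message really did wait too long on this node.
    tracker.should_drop(sender_sent_ms=5000, local_received_ms=5000, now_ms=8000)
    print(tracker.summary())
```

A non-zero cross node count alongside a low internal count would point toward clock drift rather than load shedding, which is the distinction the description asks the log to make visible.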

Issue Links

links to

2.0 patch

2.1 patch

2.2 patch

People

Assignee: Stefania Alborghetti

Reporter: Brandon Williams

Authors: Stefania Alborghetti

Reviewers: Brandon Williams

Votes: 0

Watchers: 3

Dates

Created: 13/Jul/15 22:58

Updated: 16/Apr/19 09:31

Resolved: 24/Jul/15 20:55