Apache Cassandra
CASSANDRA-17506

Excessive logging with nodes in Gossip state "shutdown"


Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: 3.11.x
    • Component/s: Cluster/Gossip
    • Labels: None
    • Bug Category: Correctness
    • Complexity: Normal
    • Severity: Normal
    • Discovered By: User Report
    • Platform: All
    • Impacts: None

    Description

      With Cassandra running as a StatefulSet in Kubernetes, the nodes change IP address after a rolling reboot of the cluster. While I haven't noticed any operational issues in the cluster itself from this, it results in excessive logging of lines like the one below:

      INFO Nodes /10.42.7.52 and /10.42.7.55 have the same token 7421423452625866771. Ignoring /10.42.7.52
      

      And, less frequently but still regularly:

      INFO FatClient /10.42.7.52 has been silent for 30000ms, removing from gossip
      

      These messages now make up the majority of the log output from the system in question, putting unnecessary pressure on the logging infrastructure, etc.

      This is an example of what the gossip state looks like:

      nodetool gossipinfo
      /10.42.6.66
        generation:1648711798
        heartbeat:1819
        STATUS:18:NORMAL,-1041160339925870253
        LOAD:1783:2.349975364E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:tc
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.6.66
        RPC_ADDRESS:4:10.42.6.66
        NET_VERSION:2:11
        HOST_ID:3:d4ff4ea6-2b37-4878-8253-7e03440c3216
        RPC_READY:38:true
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      /10.42.8.54
        generation:1648711838
        heartbeat:1778
        STATUS:18:NORMAL,-100824550225698369
        LOAD:1717:2.514461697E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:mh
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.8.54
        RPC_ADDRESS:4:10.42.8.54
        NET_VERSION:2:11
        HOST_ID:3:09413185-be66-4ffe-be17-3fa865037e47
        RPC_READY:38:true
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      /10.42.7.54
        generation:1648711286
        heartbeat:2147483647
        STATUS:1757:shutdown,true
        LOAD:477:1.571279055E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:ix
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.7.54
        RPC_ADDRESS:4:10.42.7.54
        NET_VERSION:2:11
        HOST_ID:3:5a5d3810-874a-4168-9e99-6eab3f6f3cfa
        RPC_READY:1758:false
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      /10.42.7.55
        generation:1648711764
        heartbeat:1856
        STATUS:18:NORMAL,-1095602411864500569
        LOAD:1847:1.571445304E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:ix
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.7.55
        RPC_ADDRESS:4:10.42.7.55
        NET_VERSION:2:11
        HOST_ID:3:5a5d3810-874a-4168-9e99-6eab3f6f3cfa
        RPC_READY:38:true
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      /10.42.7.52
        generation:1648120338
        heartbeat:2147483647
        STATUS:1759:shutdown,true
        LOAD:606284:1.570018542E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:ix
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.7.52
        RPC_ADDRESS:4:10.42.7.52
        NET_VERSION:2:11
        HOST_ID:3:5a5d3810-874a-4168-9e99-6eab3f6f3cfa
        RPC_READY:1760:false
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      /10.42.7.53
        generation:1648707234
        heartbeat:2147483647
        STATUS:1766:shutdown,true
        LOAD:4184:1.571429353E9
        SCHEMA:14:06874cde-85cb-3905-b939-9fa68972f835
        DC:10:ix
        RACK:12:rack1
        RELEASE_VERSION:5:3.11.12
        INTERNAL_IP:8:10.42.7.53
        RPC_ADDRESS:4:10.42.7.53
        NET_VERSION:2:11
        HOST_ID:3:5a5d3810-874a-4168-9e99-6eab3f6f3cfa
        RPC_READY:1767:false
        SSTABLE_VERSIONS:6:big-me,big-md
        TOKENS:17:<hidden>
      

       
      While there are likely ways to clean up the gossip state to get rid of this, I'd rather not go down that path, since the problem will reappear once the nodes in the cluster are restarted again.
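
      The kind of cleanup I mean would be something along the lines of forcibly dropping the stale endpoints from gossip one by one (the address below is one of the stale "shutdown" entries from the output above):

      # Last-resort removal of a stale endpoint from gossip state; this would
      # have to be repeated for every stale entry after every rolling restart.
      nodetool assassinate 10.42.7.52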
       
      I've tried setting `cassandra.load_ring_state=false`, but it does not help; I guess the state is replicated back from the existing nodes during a rolling reboot and would only be cleared by a full cluster shutdown/startup, which is not an option in a production system.
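
      For completeness, this is roughly how the property is passed (assuming the stock cassandra-env.sh; in our case the option is injected through the StatefulSet spec):

      # Passed as a JVM system property when starting the node:
      JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"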
       
      Is there any other way I can avoid this?
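
      The only other workaround I can think of is muting the messages by raising the log level at runtime for what I assume are the emitting classes (the Gossiper and StorageService loggers), e.g.:

      # Assumed emitters; raising them to WARN would hide the INFO-level gossip
      # noise, but also other useful INFO messages, and does not survive a restart.
      nodetool setlogginglevel org.apache.cassandra.gms.Gossiper WARN
      nodetool setlogginglevel org.apache.cassandra.service.StorageService WARN

      That would only treat the symptom rather than the cause, though.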
       
      The Cassandra version used is 3.11.12.


People

    Assignee: Brandon Williams
    Reporter: Tobias Gustafsson
