  1. Apache Cassandra
  2. CASSANDRA-14930

decommission may cause timeout because messaging backlog is cleared

Details

    • Normal
    • Normal

    Description

      On a 3-node cluster with RF=2, decommissioning a node may cause QUORUM write timeouts, because the messaging backlog to the decommissioned node is cleared via Gossiper#removeEndpoint() -> OutboundTcpConnection#closeSocket().
      (A timeout is less likely with RF=3: a quorum of 3 replicas is 2, so the write can afford one missing response, whereas a quorum of 2 replicas is 2 and every replica must respond.)
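
      The quorum arithmetic behind that remark can be sketched as follows. QuorumMath and its method names are hypothetical helpers, not Cassandra classes, and the sketch counts natural replicas only; it ignores any extra acknowledgements a coordinator may require while ranges are moving.

```java
// Hypothetical helper, not part of Cassandra: illustrates quorum arithmetic only.
final class QuorumMath {
    /** Replicas that must acknowledge a QUORUM write: floor(RF / 2) + 1. */
    static int quorumFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** Replica responses that can go missing before a QUORUM write fails. */
    static int tolerableMissingResponses(int replicationFactor) {
        return replicationFactor - quorumFor(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf = 2; rf <= 3; rf++) {
            System.out.printf("RF=%d quorum=%d tolerable missing responses=%d%n",
                              rf, quorumFor(rf), tolerableMissingResponses(rf));
        }
    }
}
```

      With RF=2 no response may be lost, so a single dropped message to the leaving node is enough to fail the write.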

      What happened:
      1. [WriteStage] Before the leaving node is removed from TokenMetadata, the write endpoints are computed (the leaving endpoint is included).
      2. [GossipStage] The leaving node is removed from TokenMetadata; write handlers created from now on no longer include the leaving endpoint.
      3. [WriteStage] The write handler enqueues its messages in the MessagingService backlog.
      4. [GossipStage] The MessagingService backlog is cleared and the connection is closed, so the queued messages are never sent.
      5. [WriteStage] The write times out.
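
      The interleaving above can be replayed deterministically with a toy model. This is a single-threaded sketch under simplifying assumptions, not Cassandra code: BacklogRaceSketch, ToyConnection and the endpoint names are hypothetical, every delivered message is assumed to be acknowledged, and the write is assumed to need 2 acknowledgements (QUORUM at RF=2).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model of the race; none of these types exist in Cassandra.
final class BacklogRaceSketch {
    // Assumption: QUORUM at RF=2 needs both replicas to acknowledge.
    static final int REQUIRED_ACKS = 2;

    /** Stand-in for a per-endpoint outbound connection with a backlog. */
    static final class ToyConnection {
        private final Queue<String> backlog = new ArrayDeque<>();

        void enqueue(String message) { backlog.add(message); }

        /** Sends everything still queued and returns how many messages went out. */
        int flush() {
            int sent = 0;
            while (backlog.poll() != null) sent++;
            return sent;
        }

        /** Models closing the socket: queued messages are dropped, not sent. */
        void closeAndClearBacklog() { backlog.clear(); }
    }

    /** Replays steps 1-4 and returns the acknowledgements the write receives. */
    static int acksReceived(boolean clearBacklogBeforeSend) {
        Map<String, ToyConnection> connections = new LinkedHashMap<>();
        connections.put("replica", new ToyConnection());
        connections.put("leaving-node", new ToyConnection());

        // Step 1: write endpoints are computed while the leaving node is still known.
        List<String> writeEndpoints = new ArrayList<>(connections.keySet());

        // Step 2: the ring changes, but this in-flight write keeps its endpoint list.

        // Step 3: the write handler enqueues one message per endpoint.
        for (String endpoint : writeEndpoints) {
            connections.get(endpoint).enqueue("mutation");
        }

        // Step 4: gossip closes the connection to the leaving node and drops its backlog.
        if (clearBacklogBeforeSend) {
            connections.get("leaving-node").closeAndClearBacklog();
        }

        // Every message that is actually sent is assumed to be acknowledged.
        int acks = 0;
        for (ToyConnection connection : connections.values()) {
            acks += connection.flush();
        }
        return acks;
    }

    /** Step 5: the write times out when too few acknowledgements arrive. */
    static boolean timesOut(boolean clearBacklogBeforeSend) {
        return acksReceived(clearBacklogBeforeSend) < REQUIRED_ACKS;
    }
}
```

      In this model timesOut(true) reproduces the reported behaviour, while timesOut(false) shows the same write succeeding when the backlog is allowed to drain.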
       
      Patches: 3.0, 3.11

      We can avoid this by delaying the destruction of the messaging connection, so that queued messages are still sent and responded to. The patch also avoids reopening an already closed connection in MessagingService#convict().
      The messaging framework rewrite in trunk avoids the issue by not clearing the messaging backlog.
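
      The idea of delaying the close can be illustrated with a small sketch. It is not the actual patch: DelayedCloseSketch, ToyConnection and closeLater are hypothetical names, the delay value is arbitrary, and a plain ScheduledExecutorService stands in for whatever scheduling the real change uses.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only; these types are not Cassandra classes.
final class DelayedCloseSketch {
    /** Stand-in for an outbound connection whose close drops the backlog. */
    static final class ToyConnection {
        private final Queue<String> backlog = new ConcurrentLinkedQueue<>();
        private final AtomicInteger delivered = new AtomicInteger();
        private final AtomicBoolean closed = new AtomicBoolean();

        void enqueue(String message) {
            if (!closed.get()) backlog.add(message);
        }

        /** Sends queued messages until the backlog is empty or the connection closes. */
        void flush() {
            while (!closed.get() && backlog.poll() != null) {
                delivered.incrementAndGet();
            }
        }

        void close() {
            closed.set(true);
            backlog.clear();
        }

        int delivered() { return delivered.get(); }
        boolean isClosed() { return closed.get(); }
    }

    /** Schedules the close instead of running it immediately. */
    static ScheduledFuture<?> closeLater(ToyConnection connection,
                                         ScheduledExecutorService scheduler,
                                         long delayMillis) {
        return scheduler.schedule(connection::close, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Enqueues two messages, defers the close, and reports how many were sent. */
    static int deliveredWithDelayedClose(long delayMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            ToyConnection connection = new ToyConnection();
            connection.enqueue("mutation-1");
            connection.enqueue("mutation-2");

            // The close is requested now but only takes effect after the delay.
            ScheduledFuture<?> pendingClose = closeLater(connection, scheduler, delayMillis);

            // Meanwhile the backlog drains as usual.
            connection.flush();

            pendingClose.get(); // wait until the deferred close has run
            if (!connection.isClosed()) {
                throw new IllegalStateException("connection should be closed by now");
            }
            return connection.delivered();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

      The trade-off in such a design is that the connection to a departed node lingers for the length of the delay, in exchange for letting in-flight writes complete.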

          People

            Unassigned Unassigned
            jasonstack Zhao Yang
            Aleksei Zotov, Brandon Williams, Stefan Miklosovic