Cassandra / CASSANDRA-16998

replace_address does not work in 3.11.10

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Duplicate

    Description

      We have a 30-node cluster spread across four DCs. In one DC a node (cass04) failed. We built a replacement node running the same version of Cassandra, with the same rackdc settings and the same IP as the failed node, and added replace_address=<ip of cass04>.
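For context, a minimal sketch of how the replacement option is commonly passed to a 3.11 node. It assumes the report's replace_address refers to the cassandra.replace_address JVM system property and that it is set in conf/cassandra-env.sh on the new node. The address is left as the placeholder used in the report.

```shell
# conf/cassandra-env.sh on the replacement node (file location is an assumption).
# Append the replace option to the JVM options before the node's first start.
# "<ip of cass04>" is the report's placeholder, not a real value.
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<ip of cass04>"
```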

      The node reached the joining state, then exited with an error about not being able to contact any seeds. All of the seed nodes had the following in their logs:

      WARN [MigrationStage:1] 2021-09-27 09:46:34,806 MigrationCoordinator.java:426 - Can't send schema pull request: node /10.10.4.124 is down.

      I watched the failuredetector on the seed nodes and its value went to zero when the new cass04 started coming up, so the seeds knew it was up. My guess is that they refused to send the schema pull request because gossip still said cass04 was down.
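One way to check that guess is to put the failure detector's view and the gossip state for the same endpoint side by side on a seed. This is a minimal sketch, not the reporter's procedure. It assumes "failuredetector" means `nodetool failuredetector`, that nodetool is on the PATH, and that 10.10.4.124 (the address in the log line above) is the replacement node. The helper name `show_node_views` and the `NODETOOL` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print what the failure detector and gossip each
# report for one endpoint, as seen from the node where it is run.
NODETOOL="${NODETOOL:-nodetool}"   # override to point at a specific install

show_node_views() {
  local ip="$1"
  echo "== failure detector =="
  "$NODETOOL" failuredetector | grep -F -- "$ip" || echo "no entry for $ip"
  echo "== gossip state =="
  # gossipinfo lists state lines under each endpoint address; the
  # 12 lines of trailing context are an arbitrary choice for this sketch.
  "$NODETOOL" gossipinfo | grep -F -A 12 -- "/$ip" || echo "no entry for $ip"
}

# Usage, with the address from the seed log line:
# show_node_views 10.10.4.124
```

If the two views disagree for the replacement node, that would be consistent with the guess that the schema pull was blocked by the gossip state and not by the failure detector.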

      I then tried giving the replacement node a different IP and setting replace_address to the IP of the failed node, but the replacement node kept complaining that it could not get the schema from the failed node. It seems this has been fixed in 3.11.11.

      So in this situation, what's the best way to replace a failed node in 3.11.10? nodetool removenode of the dead node?

       


            People

              Assignee: Unassigned
              Reporter: Sean Fulton (seantfulton)