  1. Cassandra
  2. CASSANDRA-7356

Add a more ops friendly replace_address flag

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.2.17, 2.0.9, 2.1 rc2
    • Component/s: None
    • Labels:

      Description

      Doing a host replacement with cassandra.replace_address works well, but it is operationally difficult because the flag must be cleared once the replacement succeeds. Most people launch nodes through scripts, so remembering to clear the flag is a pain, and forgetting means the node won't come up on a restart.

      We should have a flag like cassandra.replace_address_first_boot that works the same way as auto_bootstrap/initial_token: it is ignored entirely if the node has already bootstrapped successfully, but when starting from a clean disk it behaves like the existing cassandra.replace_address.
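
      The requested behavior can be sketched as a small decision function. This is only an illustration under stated assumptions, not Cassandra's implementation: the class name, method name, and parameters are hypothetical, the bootstrapped state is passed in as a plain boolean, and giving cassandra.replace_address precedence when both flags are set is an assumption.

```java
import java.util.Optional;

/** Hypothetical illustration of the proposed flag semantics; not Cassandra code. */
public final class ReplaceFlagSketch {

    /**
     * Returns the address this node should replace, or empty if it should start normally.
     *
     * @param replaceAddress          value of cassandra.replace_address, or null if unset
     * @param replaceAddressFirstBoot value of cassandra.replace_address_first_boot, or null if unset
     * @param alreadyBootstrapped     whether this node has already bootstrapped successfully
     */
    public static Optional<String> addressToReplace(String replaceAddress,
                                                    String replaceAddressFirstBoot,
                                                    boolean alreadyBootstrapped) {
        // Existing flag: honored whenever it is present (precedence is an assumption).
        if (replaceAddress != null) {
            return Optional.of(replaceAddress);
        }
        // Proposed flag: only takes effect on a node that has not bootstrapped yet.
        if (replaceAddressFirstBoot != null && !alreadyBootstrapped) {
            return Optional.of(replaceAddressFirstBoot);
        }
        // No flag set, or first-boot flag ignored on an already bootstrapped node.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Clean disk: the first-boot flag behaves like replace_address.
        System.out.println(addressToReplace(null, "10.0.0.5", false));
        // Restart after a successful replace: the flag is ignored.
        System.out.println(addressToReplace(null, "10.0.0.5", true));
    }
}
```

      With this shape, a launch script could pass the first-boot flag on every start without it affecting restarts of a node that has already joined.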

      1. 7356_fix_v2.txt
        2 kB
        Tyler Hobbs
      2. 7356_fix.patch
        0.7 kB
        Marcus Eriksson
      3. 7356.txt
        1 kB
        Brandon Williams

        Activity

        thobbs Tyler Hobbs added a comment -

        Thanks, v2 patch committed.

        brandon.williams Brandon Williams added a comment -

        +1

        thobbs Tyler Hobbs added a comment -

        7356_fix_v2 moves the check for replacing after having already bootstrapped instead of removing it entirely. I put this through a few bootstrap, replace_address, and replace_address_first_boot tests with ccm locally.

        brandon.williams Brandon Williams added a comment -

        It's not clear to me why getReplaceAddress wouldn't return null here.

        krummas Marcus Eriksson added a comment -

        This broke replacing a node with the 'old' syntax - we call DD.isReplacing() in a couple of places after we have successfully bootstrapped (SS.handleStateNormal() for example)

        java.lang.RuntimeException: Cannot replace address with a node that is already bootstrapped
        	at org.apache.cassandra.config.DatabaseDescriptor.isReplacing(DatabaseDescriptor.java:727)
        	at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1564)
        	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1382)
        	at org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1049)
        	at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1272)
        	at org.apache.cassandra.service.StorageService.setTokens(StorageService.java:211)
        	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:883)
        	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:614)
        	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:504)
        	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
        	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
        	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
        

        Attaching a patch which removes the check that throws if we have already bootstrapped.
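
        The failure mode in this comment can be sketched in isolation: a check embedded in a query method throws as soon as that method is called after bootstrap completes, while the same check run once at startup does not. This is a hypothetical illustration only. The class and method names are invented, and the assumption that the v2 patch moves the check to a one-time startup validation is mine; the thread only says the check was moved.

```java
/** Hypothetical illustration of check placement; not Cassandra's actual code. */
public final class ReplaceCheckPlacementSketch {
    private final String replaceAddress; // stands in for cassandra.replace_address
    private boolean bootstrapped;        // stands in for the system table's bootstrapped flag

    public ReplaceCheckPlacementSketch(String replaceAddress, boolean bootstrapped) {
        this.replaceAddress = replaceAddress;
        this.bootstrapped = bootstrapped;
    }

    /** Check embedded in the query: throws on any call made after bootstrap completes. */
    public boolean isReplacingWithEmbeddedCheck() {
        if (replaceAddress != null && bootstrapped) {
            throw new RuntimeException("Cannot replace address with a node that is already bootstrapped");
        }
        return replaceAddress != null;
    }

    /** Side-effect-free query, safe to call at any point in the node's lifecycle. */
    public boolean isReplacing() {
        return replaceAddress != null;
    }

    /** One-time validation, run at startup before this run bootstraps (assumed placement). */
    public void validateAtStartup() {
        if (isReplacing() && bootstrapped) {
            throw new RuntimeException("Cannot replace address with a node that is already bootstrapped");
        }
    }

    /** Marks the replace/bootstrap as finished for this run. */
    public void markBootstrapped() {
        bootstrapped = true;
    }

    public static void main(String[] args) {
        ReplaceCheckPlacementSketch node = new ReplaceCheckPlacementSketch("10.0.0.5", false);
        node.validateAtStartup();               // passes: nothing bootstrapped yet
        node.markBootstrapped();                // replace finishes
        System.out.println(node.isReplacing()); // later callers can still query safely
        try {
            node.isReplacingWithEmbeddedCheck(); // the embedded check now throws
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```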

        brandon.williams Brandon Williams added a comment -

        Committed.

        thobbs Tyler Hobbs added a comment -

        +1

        brandon.williams Brandon Williams added a comment - edited

        Patch to enable replace_address_first_boot, which you can always pass; it decides what to do based on the system table's bootstrapped flag. The patch also disallows using replace_address with a node that claims to already be bootstrapped, since that's a pretty bad idea.


          People

          • Assignee:
            brandon.williams Brandon Williams
            Reporter:
            rlow Richard Low
            Reviewer:
            Tyler Hobbs
            Tester:
            Shawn Kumar
          • Votes:
            0
            Watchers:
            6
