    Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: 3.0.x
    • Component/s: None
    • Labels:
    • Severity: Normal

      Description

      We have a test that triggers an NPE in the gossip code.
      The test basically toggles gossip with nodetool disablegossip / enablegossip.

      From the debug log:

      WARN [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,423 StorageService.java:395 - Starting gossip by operator request
      DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 StorageService.java:1996 - Node /172.31.24.76 state NORMAL, token [-9223372036854775808]
      INFO [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 StorageService.java:1999 - Node /172.31.24.76 state jump to NORMAL
      DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 YamlConfigurationLoader.java:102 - Loading settings from file:/mnt/ephemeral/automaton/cassandra-src/conf/cassandra.yaml
      DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:44,425 PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces in 0ms
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,346 FailureDetector.java:456 - Ignoring interval time of 75869093776 for /172.31.31.1
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 FailureDetector.java:456 - Ignoring interval time of 75869214424 for /172.31.17.32
      INFO [GossipStage:1] 2016-05-17 18:58:45,347 Gossiper.java:1028 - Node /172.31.31.1 has restarted, now UP
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1996 - Node /172.31.31.1 state NORMAL, token [-3074457345618258603]
      INFO [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1999 - Node /172.31.31.1 state jump to NORMAL
      INFO [HANDSHAKE-/172.31.31.1] 2016-05-17 18:58:45,348 OutboundTcpConnection.java:514 - Handshaking version with /172.31.31.1
      ERROR [GossipStage:1] 2016-05-17 18:58:45,354 CassandraDaemon.java:195 - Exception in thread Thread[GossipStage:1,5,main]
      java.lang.NullPointerException: null
      at org.apache.cassandra.gms.Gossiper.getHostId(Gossiper.java:846) ~[main/:na]
      at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2008) ~[main/:na]
      at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1729) ~[main/:na]
      at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2446) ~[main/:na]
      at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1050) ~[main/:na]
      at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1133) ~[main/:na]
      at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
      at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[main/:na]
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_40]
      at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
      INFO [GossipStage:1] 2016-05-17 18:58:45,355 Gossiper.java:1028 - Node /172.31.17.32 has restarted, now UP
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,355 StorageService.java:1996 - Node /172.31.17.32 state NORMAL, token [3074457345618258602]
      INFO [GossipStage:1] 2016-05-17 18:58:45,356 StorageService.java:1999 - Node /172.31.17.32 state jump to NORMAL
      INFO [HANDSHAKE-/172.31.17.32] 2016-05-17 18:58:45,356 OutboundTcpConnection.java:514 - Handshaking version with /172.31.17.32
      DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:45,357 PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces in 0ms
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,357 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
      INFO [GossipStage:1] 2016-05-17 18:58:45,357 TokenMetadata.java:429 - Updating topology for /172.31.17.32
      INFO [GossipStage:1] 2016-05-17 18:58:45,358 TokenMetadata.java:429 - Updating topology for /172.31.17.32
      DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,358 Gossiper.java:993 - removing expire time for endpoint : /172.31.17.32
      INFO [SharedPool-Worker-1] 2016-05-17 18:58:45,358 Gossiper.java:994 - InetAddress /172.31.17.32 is now UP
      DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,358 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
      DEBUG [GossipStage:1] 2016-05-17 18:58:45,358 MigrationManager.java:94 - Not pulling schema because versions match or shouldPullSchemaFrom returned false
      DEBUG [SharedPool-Worker-2] 2016-05-17 18:58:45,360 Gossiper.java:993 - removing expire time for endpoint : /172.31.31.1
      DEBUG [SharedPool-Worker-1] 2016-05-17 18:58:45,360 Gossiper.java:993 - removing expire time for endpoint : /172.31.31.1
      INFO [SharedPool-Worker-2] 2016-05-17 18:58:45,360 Gossiper.java:994 - InetAddress /172.31.31.1 is now UP
      INFO [SharedPool-Worker-1] 2016-05-17 18:58:45,360 Gossiper.java:994 - InetAddress /172.31.31.1 is now UP
      WARN [GossipTasks:1] 2016-05-17 18:58:45,429 FailureDetector.java:287 - Not marking nodes down due to local pause of 75131216102 > 5000000000
      DEBUG [GossipTasks:1] 2016-05-17 18:58:45,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
      INFO [HANDSHAKE-/172.31.31.1] 2016-05-17 18:58:45,431 OutboundTcpConnection.java:514 - Handshaking version with /172.31.31.1
      DEBUG [GossipTasks:1] 2016-05-17 18:58:46,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
      DEBUG [GossipTasks:1] 2016-05-17 18:58:46,429 FailureDetector.java:293 - Still not marking nodes down due to local pause
      DEBUG [GossipTasks:1] 2016-05-17 18:58:47,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
      DEBUG [GossipTasks:1] 2016-05-17 18:58:47,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
      DEBUG [GossipTasks:1] 2016-05-17 18:58:48,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
      DEBUG [GossipTasks:1] 2016-05-17 18:58:48,430 FailureDetector.java:293 - Still not marking nodes down due to local pause
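
The trace places the NPE inside Gossiper.getHostId, reached from handleStateNormal while applying gossip state for a peer. The body of getHostId is not shown in this report, so the following is only a minimal sketch of the general failure mode, assuming a chained lookup over per-endpoint state in which one link can be absent. Every name in it (HostIdLookupSketch, unsafeHostId, safeHostId, the map layout, the addresses and the UUID) is hypothetical and stands in for Cassandra's real classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical illustration only: this is NOT Cassandra's Gossiper code.
public class HostIdLookupSketch {

    // Unguarded chained lookup over endpoint -> (state key -> value).
    // Throws NullPointerException if the endpoint is unknown, or if the
    // endpoint is known but has no HOST_ID entry yet.
    static UUID unsafeHostId(Map<String, Map<String, String>> states, String endpoint) {
        return UUID.fromString(states.get(endpoint).get("HOST_ID"));
    }

    // Guarded variant: each link in the chain may be absent, and absence
    // is reported as an empty Optional instead of an exception.
    static Optional<UUID> safeHostId(Map<String, Map<String, String>> states, String endpoint) {
        return Optional.ofNullable(states.get(endpoint))
                       .map(state -> state.get("HOST_ID"))
                       .map(UUID::fromString);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> states = new HashMap<>();

        // Made-up peer whose state already carries a host ID.
        Map<String, String> complete = new HashMap<>();
        complete.put("STATUS", "NORMAL");
        complete.put("HOST_ID", "00000000-0000-0000-0000-000000000001");
        states.put("10.0.0.1", complete);

        // Made-up peer whose state has a status but no host ID yet.
        Map<String, String> partial = new HashMap<>();
        partial.put("STATUS", "NORMAL");
        states.put("10.0.0.2", partial);

        System.out.println("guarded, complete state: " + safeHostId(states, "10.0.0.1"));
        System.out.println("guarded, partial state:  " + safeHostId(states, "10.0.0.2"));

        try {
            unsafeHostId(states, "10.0.0.2");
        } catch (NullPointerException e) {
            System.out.println("unguarded, partial state: NullPointerException");
        }
    }
}
```

Whether returning an empty result, deferring the state change, or something else is the right fix here depends on why the state is missing after gossip is re-enabled, which the log alone does not establish.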

            People

            • Assignee: Joel Knighton (jkni)
            • Reporter: T Jake Luciani (tjake)
            • Authors: Joel Knighton
            • Votes: 0
            • Watchers: 11