  Cassandra / CASSANDRA-11744

Trying to restart a 2.2.5 node, nodetool disablethrift fails

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Duplicate
    • None
    • None
    • None
    • Normal

    Description

      We have a 2.2.5 cluster running in an AWS VPC with EBS volumes. Earlier today 3 nodes seem to have gone into a bad state: clients were seeing high latencies when writing to these nodes, and the writes to the commitlog on each of these nodes seemed high, more than the relatively low number of IOPS that AWS allocated to these volumes. While trying to understand the situation we attempted to restart the 3 nodes. On each node we attempted to run nodetool disablebinary; nodetool disablethrift; nodetool flush, and then stop the process.
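
      For reference, the pre-stop sequence described above can be sketched as a small script. This is only an illustration: the three nodetool subcommands are the ones quoted in this report, while the helper name `run_pre_stop`, the injectable `runner` parameter, and the choice to halt on the first non-zero exit code are assumptions, not something the report states. Stopping the Cassandra process itself is left out because the report does not say how it was done.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands quoted in the report, in the order they were run.
PRE_STOP_STEPS: List[List[str]] = [
    ["nodetool", "disablebinary"],  # stop listening for CQL (native protocol) clients
    ["nodetool", "disablethrift"],  # stop listening for Thrift clients
    ["nodetool", "flush"],          # flush memtables to disk
]


def run_pre_stop(runner: Callable[[Sequence[str]], int] = subprocess.call) -> List[str]:
    """Run each step in order and return the steps that exited with code 0.

    Hypothetical helper: it halts at the first non-zero exit code so that a
    failing step (such as disablethrift here) is noticed before flushing.
    """
    completed: List[str] = []
    for cmd in PRE_STOP_STEPS:
        if runner(cmd) != 0:
            break
        completed.append(" ".join(cmd))
    return completed
```

      Passing a different `runner` makes it possible to dry-run the sequence on a machine without nodetool installed.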

      When running nodetool disablethrift, the following stack trace appeared in system.log:

      ```
      INFO [RMI TCP Connection(8)-172.26.32.248] 2016-05-10 15:26:58,599 Server.java:218 - Stop listening for CQL clients
      INFO [RMI TCP Connection(10)-172.26.32.248] 2016-05-10 15:27:01,975 ThriftServer.java:142 - Stop listening to thrift clients
      ERROR [RPC-Thread:34] 2016-05-10 15:27:03,794 Message.java:324 - Unexpected throwable while invoking!
      java.lang.NullPointerException: null
      at com.thinkaurelius.thrift.util.mem.Buffer.size(Buffer.java:83) ~[thrift-server-0.3.7.jar:na]
      at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.expand(FastMemoryOutputTransport.java:84) ~[thrift-server-0.3.7.jar:na]
      at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.write(FastMemoryOutputTransport.java:167) ~[thrift-server-0.3.7.jar:na]
      at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156) ~[libthrift-0.9.2.jar:0.9.2]
      at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:55) ~[libthrift-0.9.2.jar:0.9.2]
      at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.2.jar:0.9.2]
      at com.thinkaurelius.thrift.Message.invoke(Message.java:314) ~[thrift-server-0.3.7.jar:na]
      at com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) [thrift-server-0.3.7.jar:na]
      at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) [thrift-server-0.3.7.jar:na]
      at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) [thrift-server-0.3.7.jar:na]
      at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) [disruptor-3.0.1.jar:na]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
      ```

      The attached jstack was taken from a node after the above was noticed.

      Attachments

        1. failure.jstack.out
          278 kB
          Peter Norton

            People

              Assignee: Unassigned
              Reporter: Peter Norton (pcn)
              Votes: 0
              Watchers: 2
