CASSANDRA-11273

Unknown column failure during bootstrap

Details

    • Normal

    Description

      When bootstrapping a new node, streaming can fail because the new node does not recognize some columns in the incoming files (the cause is unknown). The error leaves the bootstrap hung, so it never finishes. Resuming the bootstrap hits the same error, so the bootstrap cannot be completed. The workaround I used is described at the end.

      from 192.168.10.8

      ERROR [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamSession.java:635 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Remote peer 192.168.10.10 failed stream session.
      INFO [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamResultFuture.java:182 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Session with /192.168.10.10 is complete
      WARN [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,858 StreamResultFuture.java:209 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Stream failed

      from 192.168.10.8 debug

      DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,414 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received Received (79256340-bbbb-11e5-9f70-7d76a8de8480, #0)
      DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,854 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 ConnectionHandler.java:334 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Sending File (Header (cfId: f3a137e0-024b-11e5-bb31-0d2316086bf7, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 4653, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db)
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 CompressedStreamWriter.java:63 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Start streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db to /192.168.10.10, repairedAt = 0, totalSize = 4653
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 CompressedStreamWriter.java:94 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Finished streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db to /192.168.10.10, bytesTransferred = 4653, totalSize = 4653
      DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,855 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received Retry (faa55490-024b-11e5-bb31-0d2316086bf7, #0)
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,855 ConnectionHandler.java:334 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Sending File (Header (cfId: faa55490-024b-11e5-bb31-0d2316086bf7, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 705, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db)
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 CompressedStreamWriter.java:63 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Start streaming file /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db to /192.168.10.10, repairedAt = 0, totalSize = 705
      DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 CompressedStreamWriter.java:94 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Finished streaming file /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db to /192.168.10.10, bytesTransferred = 705, totalSize = 705
      DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received Session Failed
      ERROR [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamSession.java:635 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Remote peer 192.168.10.10 failed stream session.
      DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 ConnectionHandler.java:110 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Closing stream connection handler on /192.168.10.10
      INFO [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamResultFuture.java:182 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Session with /192.168.10.10 is complete
      WARN [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,858 StreamResultFuture.java:209 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Stream failed

      from 192.168.10.10

      [2016-02-27 20:37:53,413] received file /home/cassandra/data/sensordb/listedAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db (progress: 365%)
      [2016-02-27 20:37:53,414] received file /home/cassandra/data/sensordb/liestedAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db (progress: 369%)
      [2016-02-27 20:37:53,865] session with /192.168.10.8 complete (progress: 369%)
      [2016-02-27 20:37:53,866] Stream failed

      from 192.168.10.10 debug

      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,201 CompressedStreamReader.java:80 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 166627, ks = 'sensordb', table = 'listAttributes'.
      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,412 CompressedStreamReader.java:110 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Finished receiving file #0 from /192.168.10.8 readBytes = 166627, totalSize = 166627
      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,412 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received File (Header (cfId: 79256340-bbbb-11e5-9f70-7d76a8de8480, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 166627, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/listAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db)
      DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,412 ConnectionHandler.java:334 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Sending Received (79256340-bbbb-11e5-9f70-7d76a8de8480, #0)
      DEBUG [CompactionExecutor:3] 2016-02-27 20:37:53,833 CompactionTask.java:217 - Compacted (e224bef0-ddbb-11e5-80c0-89f591237aca) 4 sstables to [/home/cassandra/data/system_distributed/parent_repair_history-deabd734b99d3b9c92e5fd92eb5abf14/ma-5-big,] to level=0. 2,743,164 bytes to 685,791 (~25% of original) in 1,096ms = 0.596735MB/s. 0 total partitions merged to 57. Partition merge counts were {4:57, }

      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,850 CompressedStreamReader.java:80 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 4653, ks = 'sensordb', table = 'sensor'.
      WARN [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,851 StreamSession.java:641 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Retrying for following error
      java.lang.RuntimeException: Unknown column lastEvaluation during deserialization
      at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331) ~[apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87) ~[apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) [apache-cassandra-3.0.3.jar:3.0.3]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
      DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,852 ConnectionHandler.java:334 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Sending Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,852 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received null
      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,853 CompressedStreamReader.java:80 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 705, ks = 'sensordb', table = 'sensorUnit'.
      WARN [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 StreamSession.java:641 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Retrying for following error
      java.lang.RuntimeException: Unknown column lastCheckTime during deserialization
      at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331) ~[apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87) ~[apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59) [apache-cassandra-3.0.3.jar:3.0.3]
      at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) [apache-cassandra-3.0.3.jar:3.0.3]
      at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
      DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 ConnectionHandler.java:262 - Stream #c9868f90-ddbb-11e5-80c0-89f591237aca Received null
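
      To see quickly which tables and columns are affected, the debug log on the receiving node can be scanned for the two message shapes quoted above. The following is a minimal sketch, not part of Cassandra: the function name find_unknown_columns is hypothetical, and the patterns assume the exact 3.0.3 message wording shown in this report. It pairs each "Unknown column" error with the most recent "Start receiving file" line.

```python
import re

# Patterns copied from the log lines quoted in this report (Cassandra 3.0.3).
# Other versions may word these messages differently (assumption).
START_RE = re.compile(
    r"Start receiving file #\d+ from /(?P<peer>[\d.]+),"
    r".*ks = '(?P<ks>[^']+)', table = '(?P<table>[^']+)'"
)
UNKNOWN_RE = re.compile(r"Unknown column (?P<column>\S+) during deserialization")


def find_unknown_columns(lines):
    """Return (peer, keyspace, table, column) for each 'Unknown column' error.

    Each error is attributed to the most recent 'Start receiving file' line
    seen before it; the fields are None if no such line was seen.
    """
    results = []
    current = (None, None, None)  # peer, keyspace, table of the last file start
    for line in lines:
        start = START_RE.search(line)
        if start:
            current = (start.group("peer"), start.group("ks"), start.group("table"))
            continue
        unknown = UNKNOWN_RE.search(line)
        if unknown:
            results.append(current + (unknown.group("column"),))
    return results
```

      Called with the lines of the debug log from 192.168.10.10, this would report sensor/lastEvaluation and sensorUnit/lastCheckTime, the two columns named in the stack traces above. Note that interleaved output from several streaming threads could cause an error to be attributed to the wrong file.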

      Possible Workaround

      To work around this, do the following:

      1) In cqlsh on the new node, run:
           use system;
           select host_id from local;
      2) Save that host_id UUID for later use.
      3) In cassandra.yaml, set auto_bootstrap to false.
      4) Stop Cassandra on the new node.
      5) Remove all contents of the data directory on the new node.
      6) Copy all files from the data directory of an existing replica node to the data directory on the new node.
      7) Start Cassandra on the new node in network isolation, or restart Cassandra on the other nodes in the cluster after starting the new node.
      8) In cqlsh on the new node, run:
           use system;
           update local set host_id=<host id saved previously>,tokens=null where key='local';
           update local set broadcast_address='<local IP>',listen_address='<local IP>',rpc_address='<local IP>' where key='local';
      9) On the new node, run the following to save the updated system data:
           nodetool flush system local
      10) Restart Cassandra on the new node.
      11) On the new node, run the following to generate the data tokens:
           nodetool repair
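
      Step 8 is the easiest place to make a mistake, since a mistyped host_id or address ends up in the system.local table. As a rough sketch only, the two statements can be generated and validated before pasting them into cqlsh. The helper name build_local_updates is hypothetical, and the statements use the qualified name system.local in place of "use system;", which is assumed to be equivalent here.

```python
import ipaddress
import uuid


def build_local_updates(host_id, local_ip):
    """Build the two UPDATE statements from step 8 of the workaround.

    Raises ValueError if host_id is not a UUID or local_ip is not an
    IP address, so a typo is caught before anything is run in cqlsh.
    """
    hid = uuid.UUID(str(host_id))        # the host_id saved in step 2
    ip = ipaddress.ip_address(local_ip)  # the new node's own address
    return [
        f"UPDATE system.local SET host_id={hid}, tokens=null "
        f"WHERE key='local';",
        f"UPDATE system.local SET broadcast_address='{ip}', "
        f"listen_address='{ip}', rpc_address='{ip}' WHERE key='local';",
    ]
```

      For example, printing each element of build_local_updates(saved_host_id, "192.168.10.10") gives the statements to run on the new node, where saved_host_id stands for the value recorded in step 2.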

            People

              Assignee: Unassigned
              Reporter: Jason Kania (longtimer)