Apache Cassandra / CASSANDRA-16349

SSTableLoader reports error when SSTable(s) do not have data for some nodes


Details

    Description

      Running SSTableLoader in verbose mode shows errors when some nodes do not own any data from the SSTable(s). This can happen in at least two cases:

      1. SSTableLoader is used to stream backups while keeping the same token ranges
      2. SSTable(s) are created with CQLSSTableWriter to match token ranges (this can improve performance by enabling zero-copy streaming)
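For case 2, the SSTables are typically produced offline with the CQLSSTableWriter API and then streamed in with SSTableLoader. A minimal sketch of the writer side, assuming the cassandra-all jar is on the classpath; the keyspace, table, and output path are illustrative:

```java
import java.io.File;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class WriteSSTables {
    public static void main(String[] args) throws Exception {
        // SSTableLoader expects a <keyspace>/<table> directory layout.
        File dir = new File("/tmp/sstables/ks/t");
        dir.mkdirs();

        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                .inDirectory(dir)
                .forTable("CREATE TABLE ks.t (k int PRIMARY KEY, v text)")
                .using("INSERT INTO ks.t (k, v) VALUES (?, ?)")
                .build();
        try {
            writer.addRow(1, "one");
            writer.addRow(2, "two");
        } finally {
            writer.close(); // flushes the SSTable(s) to dir
        }
    }
}
```

The resulting directory can then be passed to sstableloader; if the writer was used to pre-split data by token range, some target nodes may own nothing from a given SSTable, which triggers the error described here.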

      Partial output of the SSTableLoader:

      ERROR 02:47:47,842 Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47 Remote peer /127.0.0.4:7000 failed stream session.

      ERROR 02:47:47,842 Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47 Remote peer /127.0.0.3:7000 failed stream session.

      progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.611KiB/s)

      progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.611KiB/s)

      progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.515KiB/s)

      progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.427KiB/s)

       

      Stack trace:

      java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed

      at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)

      at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533)

      at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99)

      at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49)

      Caused by: org.apache.cassandra.streaming.StreamException: Stream failed

      at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88)

      at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)

      at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)

      at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)

      at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)

      at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)

      at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220)

      at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196)

      at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505)

      at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819)

      at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595)

      at org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189)

      at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

      at java.base/java.lang.Thread.run(Thread.java:844)

      To reproduce: create a cluster with ccm with more nodes than the replication factor, put some data into it, copy an SSTable, and stream it with SSTableLoader.
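The reproduction steps above can be sketched with ccm roughly as follows; the Cassandra version, cluster name, keyspace, and paths are illustrative, and the table directory name includes a per-run UUID, hence the glob:

```shell
# Assumes ccm and the Cassandra tools are installed; names/paths are illustrative.
ccm create repro -v 4.0.0 -n 4 -s     # 4 nodes, started
ccm node1 cqlsh -e "CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"
ccm node1 cqlsh -e "CREATE TABLE ks.t (k int PRIMARY KEY, v text);"
ccm node1 cqlsh -e "INSERT INTO ks.t (k, v) VALUES (1, 'one');"
ccm node1 nodetool flush              # write the memtable out as an SSTable

# Copy node1's SSTable(s) into a <keyspace>/<table> directory and stream them:
mkdir -p /tmp/load/ks/t
cp ~/.ccm/repro/node1/data0/ks/t-*/* /tmp/load/ks/t/
sstableloader -d 127.0.0.1 --verbose /tmp/load/ks/t
```

With RF=2 and 4 nodes, two of the nodes own none of the copied data, so their stream sessions carry zero files and fail as shown in the output above.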

       

      The error originates on the receiving nodes; the following stack trace is shown in their logs:

      java.lang.IllegalStateException: Stream hasn't been read yet

              at com.google.common.base.Preconditions.checkState(Preconditions.java:507)

              at org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96)

              at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789)

              at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587)

              at org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189)

              at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

              at java.base/java.lang.Thread.run(Thread.java:844)

       

      The error is thrown because the stream size is read before any data has been received. The fix would be to not open a stream to such nodes at all: SSTableLoader.java already inspects each SSTable to determine which parts of it map to each node's token ranges, so it can tell in advance when a node will receive no data.

People

  Assignee: Serban Teodorescu
  Reporter: Serban Teodorescu
  Authors: Alex Sorokoumov, Serban Teodorescu
  Reviewers: Alex Sorokoumov, Marcus Eriksson, Zhao Yang
  Votes: 0
  Watchers: 11


Time Tracking

  Time Spent: 0.5h