Apache Cassandra / CASSANDRA-17740

BulkLoader tool initializes schema unnecessarily via streaming


    Description

      Changes to the streaming setup code for CASSANDRA-17663 cause BulkLoader to initialize the schema and system keyspace, which a standalone tool should not do. The initialization is triggered by a call to SystemKeyspace.getPreferredIP when BulkLoader starts streaming SSTables to the Cassandra instance:

      getPreferredIP:1063, SystemKeyspace (org.apache.cassandra.db)
      sendMessage:213, StreamingMultiplexedChannel (org.apache.cassandra.streaming.async)
      sendControlMessage:191, StreamingMultiplexedChannel (org.apache.cassandra.streaming.async)
      sendControlMessage:1033, StreamSession (org.apache.cassandra.streaming)
      startStreamingFiles:1257, StreamSession (org.apache.cassandra.streaming)
      prepareSynAck:802, StreamSession (org.apache.cassandra.streaming)
      messageReceived:622, StreamSession (org.apache.cassandra.streaming)
      run:76, StreamDeserializingTask (org.apache.cassandra.streaming)
      run:30, FastThreadLocalRunnable (io.netty.util.concurrent)
      run:748, Thread (java.lang)
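
      One way to picture the problem in the trace above is a send path that always consults system tables for a peer's preferred address. The sketch below is a self-contained illustration, not Cassandra code and not necessarily the fix adopted for this ticket. The class `PreferredAddressSketch`, the `toolMode` flag, and both methods are hypothetical names; the static flag merely stands in for the side effect of initializing the system keyspace.

```java
import java.net.InetSocketAddress;

public class PreferredAddressSketch {
    // Stand-in for "the schema/system keyspace has been initialized" (hypothetical).
    static boolean systemKeyspaceInitialized = false;

    // Hypothetical stand-in for a lookup such as SystemKeyspace.getPreferredIP:
    // touching system tables forces initialization as a side effect.
    static InetSocketAddress lookupPreferredFromSystemTables(InetSocketAddress peer) {
        systemKeyspaceInitialized = true;
        return peer; // no preferred address recorded in this sketch
    }

    // Assumed mitigation: when running as a tool, skip the system-table lookup
    // and use the peer address as given.
    static InetSocketAddress connectAddress(InetSocketAddress peer, boolean toolMode) {
        if (toolMode) {
            return peer;
        }
        return lookupPreferredFromSystemTables(peer);
    }

    public static void main(String[] args) {
        InetSocketAddress peer = InetSocketAddress.createUnresolved("10.0.0.1", 7000);

        connectAddress(peer, true);
        System.out.println("tool mode initialized system keyspace: " + systemKeyspaceInitialized);

        connectAddress(peer, false);
        System.out.println("server mode initialized system keyspace: " + systemKeyspaceInitialized);
    }
}
```

      A test in this style can only catch the regression if it actually exercises the send path; checking the flag without streaming anything would pass regardless.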
      

      The existing BulkLoaderTest does not detect this because it never connects to a node, so streaming is never initialized.

      Affects 4.1 and trunk. It may also affect 4.0, but the 4.0 patch for CASSANDRA-17663 differs from the 4.1+ patch and may require a different mitigation.

            People

              maedhroz Caleb Rackliffe
              Caleb Rackliffe
              Jon Meredith
                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 10m