Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.0.0
    • Component/s: Core
    • Labels:
      None

      Description

      Streaming is currently a two-pass operation: one pass to write the Data component to disk from the socket, then another to build the index and bloom filter from it. This means we do about 2x the I/O we would if we created the index and BF during the original write.
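      As a rough illustration of the single-pass idea, the sketch below records an index offset and updates a filter for each row at the moment the row is written, so the data never has to be read back. The class SinglePassWriter, its methods, and the one-hash BitSet filter are hypothetical stand-ins for illustration only; this is not Cassandra's sstable writer or its bloom filter.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: build the row index and a toy filter while writing data.
final class SinglePassWriter {
    private final OutputStream data;                            // destination for row bytes
    private final List<Long> indexOffsets = new ArrayList<>();  // start offset of each row
    private final BitSet filter;                                // toy one-hash membership filter
    private final int filterBits;
    private long position = 0;                                  // bytes written so far

    public SinglePassWriter(OutputStream data, int filterBits) {
        this.data = data;
        this.filterBits = filterBits;
        this.filter = new BitSet(filterBits);
    }

    // Write one row and update the index and the filter in the same pass.
    public void append(String key, byte[] row) {
        indexOffsets.add(position);
        filter.set(Math.floorMod(key.hashCode(), filterBits));
        try {
            data.write(row);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        position += row.length;
    }

    // May return a false positive, never a false negative.
    public boolean mightContain(String key) {
        return filter.get(Math.floorMod(key.hashCode(), filterBits));
    }

    public List<Long> indexOffsets() {
        return indexOffsets;
    }
}
```

      Because the filter has to be allocated before the first row arrives, its size must be chosen up front, which is the obstacle discussed below.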

      For node movement this was not considered to be a Big Deal because the stream target is not a member of the ring, so we can be inefficient without hurting live queries. But optimizing node movement to not require un/rebootstrap (CASSANDRA-1427) and bulk load (CASSANDRA-1278) mean we can stream to live nodes too.

      The main obstacle here is we don't know how many keys will be in the new sstable ahead of time, which we need to size the bloom filter correctly. We can solve this by including that information (or a close approximation) in the stream setup – the source node can calculate that without hitting disk from the in-memory index summary.
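      A minimal sketch of how such an estimate could feed the filter sizing, assuming the index summary samples one key out of every indexInterval keys (128 is used here only as an example value). The class and method names are hypothetical; the sizing formulas are the textbook bloom filter ones (m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes), not necessarily what the committed patch uses.

```java
// Hypothetical sketch: estimate the key count from a sampled index summary,
// then size a bloom filter for that many keys.
final class StreamKeyEstimate {

    // Approximate key count: number of sampled entries times the sampling interval.
    public static long estimateKeys(int summaryEntries, int indexInterval) {
        return (long) summaryEntries * indexInterval;
    }

    // Textbook sizing: bits needed for the expected keys at a target false-positive rate.
    public static long bloomFilterBits(long expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    // Textbook optimum for the number of hash functions, never less than one.
    public static int hashCount(long bits, long expectedKeys) {
        return Math.max(1, (int) Math.round((double) bits / expectedKeys * Math.log(2)));
    }

    public static void main(String[] args) {
        long keys = estimateKeys(1000, 128);      // example values, not measured data
        long bits = bloomFilterBits(keys, 0.01);
        System.out.println(keys + " keys, " + bits + " bits, " + hashCount(bits, keys) + " hashes");
    }
}
```

      The estimate can be off by up to one interval per sstable, which is why the description allows for "a close approximation": a slightly oversized or undersized filter only shifts the false-positive rate.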

        Activity

        Jonathan Ellis added a comment -

        Committed. Nice work, Yuki!

        Yuki Morishita added a comment -

        The attached patch lets Cassandra create the sstable, with its index and BF, directly from the stream.
        I left the old path in place to handle the case where a node running an older version streams to a new one.
        I don't have a test environment with SSL, so testing in an encryption-enabled environment would be appreciated.

        Jonathan Ellis added a comment -

        Moving to 1.0 b/c of CASSANDRA-2818.

        Jonathan Ellis added a comment -

        "the source node can calculate that without hitting disk from the in-memory index summary"

        (referring to SSTableReader.indexSummary)

        Jonathan Ellis added a comment -

        The javadoc for the StreamOut class has a good overview of the streaming [file transfer] process.

        Jonathan Ellis added a comment -

        "one to write the Data component to disk from the socket"

        (IncomingTcpConnection.stream)

        "another to build the [row] index and bloom filter from it"

        (StreamInSession.finished / CompactionManager.instance.submitSSTableBuild – this is NOT talking about the buildSecondaryIndexes pass for column indexes, which we can't optimize away... yet)


          People

          • Assignee:
            Yuki Morishita
            Reporter:
            Jonathan Ellis
            Reviewer:
            Jonathan Ellis
          • Votes:
            0
            Watchers:
            7

            Dates

            • Created:
              Updated:
              Resolved:
