Cassandra / CASSANDRA-579

Stream SSTables without Anti-compaction

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 1
    • Component/s: None
    • Labels: None

      Description

      The io.Streaming API currently requires a file on disk to stream, which means that bootstrap and repairs need to perform an anti-compaction that writes a bunch of data to disk, only to have it be deleted after the streaming has finished.

      EDIT: Deleted reference to using streaming as a client API: it wouldn't provide enough benefit over using the BMT interface, due to fragility.
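
As a rough illustration of the direction the description points toward, the sketch below sends selected byte ranges of an on-disk file straight to an output channel instead of first writing them to a temporary file. It is not the patch attached to this issue: `SectionStreamer`, `Section`, and `stream` are hypothetical names, and the byte ranges are assumed to be known in advance (for example, looked up from an index). The only library call it relies on is the standard `FileChannel.transferTo`.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical helper (not part of Cassandra): streams byte ranges of a file without a temporary copy. */
public class SectionStreamer {

    /** A half-open byte range [start, end) within the source file. */
    public static final class Section {
        final long start;
        final long end;

        public Section(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Copies each section of the source file to the output channel, in order.
     * Returns the total number of bytes sent.
     */
    public static long stream(Path source, List<Section> sections, WritableByteChannel out)
            throws IOException {
        long total = 0;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            for (Section s : sections) {
                long position = s.start;
                while (position < s.end) {
                    // transferTo may send fewer bytes than requested, so loop until the section is done.
                    long sent = in.transferTo(position, s.end - position, out);
                    if (sent <= 0) {
                        // No progress, e.g. the requested range lies past the end of the file.
                        throw new IOException("No progress at offset " + position);
                    }
                    position += sent;
                    total += sent;
                }
            }
        }
        return total;
    }
}
```

Called with the sections [2, 5) and [7, 9), this helper writes only those five bytes to the channel, and nothing extra is written to disk on the sending side.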

          Activity

          Gavin made changes -
          Link This issue depends upon CASSANDRA-1117 [ CASSANDRA-1117 ]
          Gavin made changes -
          Link This issue depends on CASSANDRA-1117 [ CASSANDRA-1117 ]
          Gavin made changes -
          Workflow patch-available, re-open possible [ 12751995 ] → reopen-resolved, no closed status, patch-avail, testing [ 12758069 ]
          Gavin made changes -
          Workflow no-reopen-closed, patch-avail [ 12482952 ] → patch-available, re-open possible [ 12751995 ]
          Jonathan Ellis made changes -
          Status Patch Available [ 10002 ] → Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Stu Hood made changes -
          Attachment 0005-Add-greater-than-operation-for-sstable-indexes-to-op.patch [ 12447256 ]
          Stu Hood made changes -
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12447253 ]
          Jonathan Ellis made changes -
          Attachment 579-rowsize.txt [ 12447315 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12447255 ]
          Attachment 0005-Add-greater-than-operation-for-sstable-indexes-to-op.patch [ 12447256 ]
          Stu Hood made changes -
          Attachment 0001-Extract-index-filter-writing-into-IndexWriter.patch [ 12447252 ]
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12447253 ]
          Attachment 0003-Only-send-the-datafile-when-streaming.patch [ 12447254 ]
          Stu Hood made changes -
          Attachment 0001-Extract-index-filter-writing-into-IndexWriter.patch [ 12447088 ]
          Stu Hood made changes -
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12447089 ]
          Stu Hood made changes -
          Attachment 0003-Only-send-the-datafile-when-streaming.patch [ 12447090 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12447091 ]
          Stu Hood made changes -
          Attachment 0005-Add-greater-than-operation-for-sstable-indexes-to-op.patch [ 12447092 ]
          Stu Hood made changes -
          Status Open [ 1 ] → Patch Available [ 10002 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12447091 ]
          Attachment 0005-Add-greater-than-operation-for-sstable-indexes-to-op.patch [ 12447092 ]
          Stu Hood made changes -
          Attachment 0001-Extract-index-filter-writing-into-IndexWriter.patch [ 12447088 ]
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12447089 ]
          Attachment 0003-Only-send-the-datafile-when-streaming.patch [ 12447090 ]
          Stu Hood made changes -
          Attachment 0001-Extract-index-filter-writing-into-IndexWriter.patch [ 12446188 ]
          Stu Hood made changes -
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12446189 ]
          Stu Hood made changes -
          Attachment 0003-Only-send-the-datafile-when-streaming.patch [ 12446190 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12446212 ]
          Stu Hood made changes -
          Attachment 0005-Use-index-to-select-minimal-set.patch [ 12446193 ]
          Stu Hood made changes -
          Status Patch Available [ 10002 ] → Open [ 1 ]
          Stu Hood made changes -
          Comment [ Oops... just noticed a bug in IncomingStreamReader. I'll repost in an hour or two. ]
          Stu Hood made changes -
          Status Open [ 1 ] → Patch Available [ 10002 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12446212 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12446192 ]
          Stu Hood made changes -
          Status Patch Available [ 10002 ] → Open [ 1 ]
          Stu Hood made changes -
          Status Open [ 1 ] → Patch Available [ 10002 ]
          Stu Hood made changes -
          Attachment 0004-Stream-minimal-sections-of-SSTables-without-compacti.patch [ 12446192 ]
          Attachment 0005-Use-index-to-select-minimal-set.patch [ 12446193 ]
          Stu Hood made changes -
          Attachment 0001-Extract-index-filter-writing-into-IndexWriter.patch [ 12446188 ]
          Attachment 0002-Add-recovery-for-non-essential-sstable-components.patch [ 12446189 ]
          Attachment 0003-Only-send-the-datafile-when-streaming.patch [ 12446190 ]
          Stu Hood made changes -
          Link This issue depends on CASSANDRA-1117 [ CASSANDRA-1117 ]
          Stu Hood made changes -
          Summary Add support to io.Streaming API for sending Streams → Stream SSTables without Anti-compaction
          Assignee Stu Hood [ stuhood ]
          Stu Hood made changes -
          Priority Major [ 3 ] → Critical [ 2 ]
          Jonathan Ellis made changes -
          Link This issue blocks CASSANDRA-749 [ CASSANDRA-749 ]
          Stu Hood made changes -
          Description (original value):
          The io.Streaming API currently requires a file on disk to stream, which means that bootstrap and repairs need to perform an anti-compaction that writes a bunch of data to disk, only to have it be deleted after the streaming has finished.

          Ideally, the Streaming API should allow for streaming from an InputStream (or any other class we think we need to design to make the streaming as efficient as possible). That way, anti-compaction for repair/bootstrap does not perform any writing: it simply streams the relevant portion of the file to the neighbor.

          Additionally, this opens up interesting possibilities, such as providing the Streaming API as a (Java only?) client API. One use case would be for a Hadoop OutputFormat: rather than writing BinaryMemtables, the OutputFormat could literally write an SSTable to the stream. This might require better integration with gossip, to ensure that you aren't writing to the completely wrong node.

          Description (new value):
          The io.Streaming API currently requires a file on disk to stream, which means that bootstrap and repairs need to perform an anti-compaction that writes a bunch of data to disk, only to have it be deleted after the streaming has finished.

          EDIT: Deleted reference to using streaming as a client API: it wouldn't provide enough benefit over using the BMT interface, due to fragility.
          Jonathan Ellis made changes -
          Fix Version/s 0.7 [ 12314533 ]
          Fix Version/s 0.6 [ 12314361 ]
          Jonathan Ellis made changes -
          Link This issue is blocked by CASSANDRA-705 [ CASSANDRA-705 ]
          Stu Hood made changes -
          Field: Original Value → New Value
          Priority Minor [ 4 ] → Major [ 3 ]
          Stu Hood created issue -

            People

            • Assignee: Stu Hood
            • Reporter: Stu Hood
            • Votes: 0
            • Watchers: 11
