Apache Cassandra / CASSANDRA-19334

[Analytics] Upgrade to Cassandra 4.0.12 and remove RowBufferMode and BatchSize options

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • NA
    • Component: Analytics Library
    • None

    Description

      cassandra-all:4.0.12 improves CQLSSTableWriter: the sorted writer can now produce size-capped SSTables. This removes the need for the unsorted SSTable writer, which has to buffer data and sort it on flush. The dataset written by the Spark application is already sorted, so the unsorted writer wastes CPU time re-sorting sorted data. Because the sorted SSTable writer does not buffer data, its size estimation is also more accurate than the unsorted writer's, so the produced SSTable files are closer to the expected size.

      With the unsorted SSTable writer removed, the RowBufferMode option is no longer required.
      With size-capping supported in the sorted writer, the BatchSize option is no longer required.
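
      The idea behind size-capping in a sorted writer can be illustrated with a small, self-contained sketch. This is not the CQLSSTableWriter API and not the change made for this issue: the class name SizeCappedSortedWriterSketch, the write method, and the use of UTF-8 string length as the "row size" are all hypothetical stand-ins. It only shows why already-sorted input needs no buffering or sorting step: rows are appended as they arrive, and a new output file is started when the next row would push the current one past the cap.

      ```java
      import java.nio.charset.StandardCharsets;
      import java.util.ArrayList;
      import java.util.List;

      // Conceptual sketch only: this is NOT the CQLSSTableWriter API.
      // Each inner list stands in for one size-capped SSTable.
      final class SizeCappedSortedWriterSketch {

          // Appends already-sorted rows in arrival order and rolls over to a new
          // "file" when adding the next row would exceed maxBytes.
          static List<List<String>> write(List<String> sortedRows, long maxBytes) {
              List<List<String>> files = new ArrayList<>();
              List<String> current = new ArrayList<>();
              long currentBytes = 0;
              String previous = null;

              for (String row : sortedRows) {
                  // A sorted writer never re-sorts; it rejects out-of-order input.
                  if (previous != null && previous.compareTo(row) > 0) {
                      throw new IllegalArgumentException("rows must arrive in sorted order");
                  }
                  long rowBytes = row.getBytes(StandardCharsets.UTF_8).length;

                  // Roll over before the cap would be exceeded. A single row larger
                  // than the cap still gets a file of its own.
                  if (!current.isEmpty() && currentBytes + rowBytes > maxBytes) {
                      files.add(current);
                      current = new ArrayList<>();
                      currentBytes = 0;
                  }
                  current.add(row);
                  currentBytes += rowBytes;
                  previous = row;
              }
              if (!current.isEmpty()) {
                  files.add(current);
              }
              return files;
          }
      }
      ```

      In this sketch, the size of the current file is known exactly at every step because nothing is held back in a buffer, which mirrors the accuracy argument above. Whether the real writer closes a file before or after the cap is crossed is not stated in this issue, so the rollover rule here is an assumption.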

            People

              yifanc Yifan Cai
              Yifan Cai
              Francisco Guerrero

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m