CASSANDRA-15379

Flush with fast compressors by default

Details

    • Performance
    • Low Hanging Fruit
    • All
    • None

      Tests: Automated unit tests and manual testing.

      Documentation: Included docs of the new options both in the yaml and in the website

       

      After review, I will do more thorough manual testing and documentation.

    Description

      josnyder and I have been testing CASSANDRA-14482 (Zstd compression) on some of our densest clusters and have observed close to a 50% reduction in footprint with Zstd on some of our workloads! Unfortunately, we have been running into an issue where a flush can take so long (Zstd is slower to compress than LZ4) that it blocks the next flush and causes instability.

      Internally we are working around this with a very simple patch that flushes SSTables with the default compression strategy (LZ4) regardless of the table params. This is a simple solution, but I think the ideal solution might be for the flush compression strategy to be configurable separately from the table compression strategy (while defaulting to the same thing). Instead of adding yet another compression option to the yaml (like hints and commitlog), I was thinking of adding it to the table parameters and then adding a default_table_parameters yaml option like:

      # Default table properties to apply to freshly created tables. The currently supported defaults are:
      # * compression       : How SSTables are compressed in general (flush, compaction, etc ...)
      # * flush_compression : How SSTables are compressed as they flush
      default_table_parameters:
        compression:
          class_name: 'LZ4Compressor'
          parameters:
            chunk_length_in_kb: 16
        flush_compression:
          class_name: 'LZ4Compressor'
          parameters:
            chunk_length_in_kb: 4
      

      This would also have the nice effect of giving our configuration a path forward to user-specified defaults for table creation (e.g. a user who wants a different default chunk_length_in_kb could set it there).
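
To make the intended fallback concrete, here is a rough sketch of how the effective flush compression could be resolved. The lookup order (explicit table parameter, then the yaml default, then the table's general compression) is my assumption, and `TableDefaultsSketch`, `Compression`, `resolve` and `flushCompression` are hypothetical names, not existing Cassandra classes or methods.

```java
import java.util.Map;

// Hypothetical sketch: none of these names exist in Cassandra.
final class TableDefaultsSketch {
    // Minimal stand-in for one compression block (class_name + chunk_length_in_kb).
    static final class Compression {
        final String className;
        final int chunkLengthInKb;

        Compression(String className, int chunkLengthInKb) {
            this.className = className;
            this.chunkLengthInKb = chunkLengthInKb;
        }
    }

    // An explicit table parameter wins; otherwise use the yaml default (may be null).
    static Compression resolve(String key,
                               Map<String, Compression> tableOptions,
                               Map<String, Compression> yamlDefaults) {
        Compression explicit = tableOptions.get(key);
        return explicit != null ? explicit : yamlDefaults.get(key);
    }

    // Assumed order: flush_compression if configured anywhere, else fall back
    // to whatever the table's general compression resolves to.
    static Compression flushCompression(Map<String, Compression> tableOptions,
                                        Map<String, Compression> yamlDefaults) {
        Compression flush = resolve("flush_compression", tableOptions, yamlDefaults);
        return flush != null ? flush : resolve("compression", tableOptions, yamlDefaults);
    }

    public static void main(String[] args) {
        // Mirrors the default_table_parameters example above.
        Map<String, Compression> yaml = Map.of(
            "compression", new Compression("LZ4Compressor", 16),
            "flush_compression", new Compression("LZ4Compressor", 4));
        // A table that only overrides its general compression.
        Map<String, Compression> table = Map.of(
            "compression", new Compression("ZstdCompressor", 16));

        Compression flush = flushCompression(table, yaml);
        System.out.println(flush.className + " / " + flush.chunkLengthInKb + " KiB");
    }
}
```

With these inputs the table is compacted with Zstd but still flushes with the yaml's LZ4 default. Whether an unset flush_compression should instead follow the table's own compression is exactly the kind of detail the review would need to settle.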

      So the proposed (~mandatory) scope is:

      • Flush with a faster compression strategy
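
As a minimal illustration of that mandatory piece, the sketch below mimics the internal workaround described above: flushes always use LZ4, and everything else keeps the table's configured compressor. `FlushCompressorChoice`, `Operation` and `compressorFor` are hypothetical names for illustration only, not the actual patch.

```java
// Hypothetical sketch of the workaround; these names are illustrative and are
// not Cassandra's actual classes or methods.
final class FlushCompressorChoice {
    enum Operation { FLUSH, COMPACTION }

    // LZ4 is the default compression strategy mentioned in the description.
    static final String FAST_DEFAULT = "LZ4Compressor";

    /** Returns the compressor class name to use when writing an SSTable. */
    static String compressorFor(Operation op, String tableCompressor) {
        // Flushes ignore the table params so a slow compressor cannot stall them.
        if (op == Operation.FLUSH) {
            return FAST_DEFAULT;
        }
        // Assumption: every other write path keeps the table's configured compressor.
        return tableCompressor;
    }

    public static void main(String[] args) {
        System.out.println(compressorFor(Operation.FLUSH, "ZstdCompressor"));
        System.out.println(compressorFor(Operation.COMPACTION, "ZstdCompressor"));
    }
}
```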

      I'd like to implement the following at the same time:

      • Per table flush compression configuration
      • Ability to default the table flush and compaction compression in the yaml.

      Attachments

        1. 15379_backfill_zstd_level10.png
          215 kB
          Joey Lynch
        2. 15379_backfill_duration_zstd_level10.png
          390 kB
          Joey Lynch
        3. 15379_backfill_drops_zstd_level10.png
          143 kB
          Joey Lynch
        4. 15379_backfill_queueing_zstd_level10.png
          52 kB
          Joey Lynch
        5. 15379_coordinator_zstd_level10.png
          189 kB
          Joey Lynch
        6. 15379_message_drops_zstd_level10.png
          135 kB
          Joey Lynch
        7. 15379_request_queueing_zstd_level10.png
          163 kB
          Joey Lynch
        8. 15379_flush_flamegraph_zstd_level10.png
          64 kB
          Joey Lynch
        9. 15379_concurrent_flushes_zstd_level10.png
          89 kB
          Joey Lynch
        10. 15379_replica_zstd_defaults.png
          164 kB
          Joey Lynch
        11. 15379_coordinator_zstd_defaults.png
          197 kB
          Joey Lynch
        12. 15379_system_zstd_defaults.png
          204 kB
          Joey Lynch
        13. 15379_baseline_flush_trace.png
          127 kB
          Joey Lynch
        14. 15379_candidate_flush_trace.png
          213 kB
          Joey Lynch
        15. 15379_system_defaults.png
          192 kB
          Joey Lynch
        16. 15379_replica_defaults.png
          139 kB
          Joey Lynch
        17. 15379_coordinator_defaults.png
          190 kB
          Joey Lynch

          People

            Joey Lynch (jolynch)
            Dinesh Joshi