Apache Cassandra / CASSANDRA-19369

[Analytics] Use XXHash32 for digest calculation of SSTables

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • NA
    • Component/s: Analytics Library
    • None

    Description

      During bulk writes, Cassandra Analytics calculates the MD5 checksum of every SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra Analytics includes the content-md5 header as part of the upload request. This information is used by Cassandra Sidecar to validate the integrity of the uploaded SSTable and prevent issues with bit flips and corrupted SSTables.
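
      As a rough illustration of the MD5 step described above, the sketch below streams a file-like object through MD5 and builds a content-md5 header value. It assumes the header carries the base64-encoded digest, as in RFC 1864; the function names `md5_digest` and `content_md5_header` are hypothetical and are not taken from Cassandra Analytics or Cassandra Sidecar.

```python
import base64
import hashlib
import io


def md5_digest(stream, chunk_size=64 * 1024):
    """Return the raw MD5 digest of a binary stream, reading it in chunks."""
    hasher = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.digest()


def content_md5_header(stream):
    """Build a header dict; base64 encoding is an assumption (RFC 1864 style)."""
    encoded = base64.b64encode(md5_digest(stream)).decode("ascii")
    return {"content-md5": encoded}


# Usage: a stand-in for SSTable bytes; a real caller would open the file in "rb" mode.
headers = content_md5_header(io.BytesIO(b"example sstable bytes"))
```

      A receiver can recompute the digest over the bytes it received and compare it with the header value; a single flipped bit yields a different digest.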

      Recently, Cassandra Sidecar introduced support for additional checksum validations during SSTable upload. Notably, XXHash32 digest support was added, which offers more performant checksum calculation. This allows Cassandra Analytics to use a more efficient digest algorithm that reduces CPU usage on both Sidecar and Spark resources.
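
      To make the digest itself concrete, here is a minimal pure-Python sketch of the 32-bit xxHash (XXH32) algorithm. It is an independent illustration, not the code used by Cassandra Analytics or Sidecar; the function name `xxh32` and the default seed of 0 are assumptions. A pure-Python version is far slower than a native implementation, so it shows what is computed rather than how fast.

```python
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits
P1, P2, P3, P4, P5 = 2654435761, 2246822519, 3266489917, 668265263, 374761393


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def _round(acc, lane):
    """Mix one little-endian 32-bit lane into an accumulator."""
    acc = (acc + lane * P2) & MASK
    return (_rotl(acc, 13) * P1) & MASK


def xxh32(data, seed=0):
    """Return the XXH32 digest of a bytes object as an unsigned int."""
    n = len(data)
    i = 0
    if n >= 16:
        # Four accumulators consume the input in 16-byte stripes.
        v = [(seed + P1 + P2) & MASK, (seed + P2) & MASK,
             seed & MASK, (seed - P1) & MASK]
        while i + 16 <= n:
            lanes = struct.unpack_from("<4I", data, i)
            v = [_round(acc, lane) for acc, lane in zip(v, lanes)]
            i += 16
        h = (_rotl(v[0], 1) + _rotl(v[1], 7)
             + _rotl(v[2], 12) + _rotl(v[3], 18)) & MASK
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    # Remaining 4-byte words, then remaining single bytes.
    while i + 4 <= n:
        (word,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + word * P3) & MASK, 17) * P4) & MASK
        i += 4
    while i < n:
        h = (_rotl((h + data[i] * P5) & MASK, 11) * P1) & MASK
        i += 1
    # Final avalanche so that every input bit affects every output bit.
    h ^= h >> 15
    h = (h * P2) & MASK
    h ^= h >> 13
    h = (h * P3) & MASK
    h ^= h >> 16
    return h


# Usage: format the digest as 8 hex characters.
digest_hex = format(xxh32(b"example sstable bytes"), "08x")
```

      The algorithm uses only 32-bit multiplications, rotations and additions, which is why it is much cheaper on the CPU than a cryptographic hash such as MD5. It is not cryptographically secure, which is acceptable for detecting accidental corruption such as bit flips.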

            People

              Assignee: Francisco Guerrero (frankgh)
              Reporter: Francisco Guerrero (frankgh)
              Authors: Francisco Guerrero
              Reviewers: Yifan Cai
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 20m