  1. Cassandra
  2. CASSANDRA-13241

Lower default chunk_length_in_kb from 64kb to 16kb

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 4.0-alpha1, 4.0
    • Component/s: Legacy/Core
    • Labels: None

    Description

      A chunk size that is too low may waste some disk space. A chunk size that is too high may lead to massive overreads and can have a critical impact on overall system performance.

      In my case, the default chunk size led to peak read IO of up to 1 GB/s and average reads of 200 MB/s. After lowering the chunk size (aligned with read-ahead, of course), the average read IO dropped below 20 MB/s, typically to 10-15 MB/s.

      The risk of (physical) overreads increases as the ratio (page cache size) / (total data size) decreases.
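
      As a rough illustration of that ratio, here is a minimal sketch that estimates how many reads would have to go to disk. It assumes uniformly random access and that all RAM is usable as page cache; both are simplifications, and the function name is only illustrative.

```python
def expected_miss_fraction(page_cache_gb: float, total_data_gb: float) -> float:
    """Estimate the fraction of reads that miss the page cache.

    Simplified model: uniformly random access, so the hit rate equals
    the share of the data set that fits into the page cache.
    """
    if total_data_gb <= 0:
        raise ValueError("total_data_gb must be positive")
    cached_share = min(page_cache_gb, total_data_gb) / total_data_gb
    return 1.0 - cached_share


# Figures from this ticket: 460 GB of data, 128 GB of RAM.
# Assumption: all RAM counts as page cache (an optimistic upper bound).
miss = expected_miss_fraction(128, 460)
print(f"{miss:.0%} of reads would go to disk")
```

      Real workloads are rarely uniform, so a hot working set can make the actual miss rate much lower than this estimate.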

      High chunk sizes are mostly appropriate for bigger payloads per request, but if the model consists mainly of small rows or small result sets, the read overhead with a 64 KB chunk size is extremely high. This applies, for example, to (small) skinny rows.
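
      To put a number on that overhead, here is a minimal sketch of the read amplification per row. The 256-byte row size is a hypothetical example, not a figure from this ticket. The model is a best case: it assumes a row never straddles a chunk boundary, and it ignores read-ahead and the compression ratio.

```python
def read_amplification(row_bytes: int, chunk_kb: int) -> float:
    """Uncompressed bytes that must be read per useful byte of row data.

    Best-case model: the row starts at a chunk boundary, so it touches
    the minimum possible number of whole chunks.
    """
    if row_bytes <= 0 or chunk_kb <= 0:
        raise ValueError("row_bytes and chunk_kb must be positive")
    chunk_bytes = chunk_kb * 1024
    chunks_needed = -(-row_bytes // chunk_bytes)  # ceiling division
    return chunks_needed * chunk_bytes / row_bytes


ROW_BYTES = 256  # hypothetical small skinny row
for chunk_kb in (64, 16):
    factor = read_amplification(ROW_BYTES, chunk_kb)
    print(f"{chunk_kb} KB chunks: {factor:.0f}x read amplification")
```

      In this model, going from 64 KB to 16 KB chunks cuts the bytes read per small row by a factor of four.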

      Please also see here:
      https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY

      To give you some insight into the difference it can make (460 GB data, 128 GB RAM):

      Attachments

        1. CompactIntegerSequenceBench.java
          4 kB
          Ariel Weisberg
        2. CompactIntegerSequence.java
          5 kB
          Ariel Weisberg
        3. CompactSummingIntegerSequence.java
          5 kB
          Ariel Weisberg

            People

              Assignee: Ariel Weisberg (aweisberg)
              Reporter: Benjamin Roth (brstgt)
              Authors: Ariel Weisberg
              Reviewers: Jon Haddad
              Votes: 0
              Watchers: 18
