Lucene - Core / LUCENE-9403

tune BufferedChecksum.DEFAULT_BUFFERSIZE

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      This is currently set to 256 bytes, so that's the amount of data we pass to crc.update() at once.

      I tried different sizes with https://github.com/benalexau/hash-bench and JDK14:

      HashBench.withArray         crc32-jre       256  avgt    5   81.349 ±  8.364  ns/op
      HashBench.withArray         crc32-jre       512  avgt    5   95.204 ± 10.057  ns/op
      HashBench.withArray         crc32-jre      1024  avgt    5  120.081 ±  8.471  ns/op
      HashBench.withArray         crc32-jre      2048  avgt    5  173.505 ±  8.857  ns/op
      HashBench.withArray         crc32-jre      8192  avgt    5  487.721 ± 11.435  ns/op
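
To make the trend in these numbers easier to see, here is a small sketch that divides each score by its buffer size to get an approximate per-byte cost. It assumes the third column is the input length in bytes and the score is the mean time per `update()` call; the class and method names are made up for illustration.

```java
public class PerByteCost {
    // Input lengths (bytes) and mean scores (ns/op) copied from the results above.
    static final int[] SIZES = {256, 512, 1024, 2048, 8192};
    static final double[] NS_PER_OP = {81.349, 95.204, 120.081, 173.505, 487.721};

    /** Approximate cost of checksumming one byte at the i-th measured size. */
    static double nsPerByte(int i) {
        return NS_PER_OP[i] / SIZES[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%5d bytes: %.4f ns/byte%n", SIZES[i], nsPerByte(i));
        }
    }
}
```

Under those assumptions the per-byte cost falls from roughly 0.32 ns at 256 bytes to roughly 0.12 ns at 1024 bytes, and keeps falling more slowly beyond that.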
      

      Based on this, how about bumping the buffer size from 256 to 1024 bytes? I think we want to avoid huge buffers but still keep the CPU overhead low. The change only impacts ChecksumIndexInput (e.g. the speed of checkIntegrity() calls at merge), because IndexOutputs do not need this buffer.
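
For readers unfamiliar with the idea, here is a minimal sketch of a buffering wrapper around `java.util.zip.CRC32`, showing how the buffer size decides how many bytes reach `update()` per call. It is not Lucene's `BufferedChecksum` source: the class name `BufferedCrcSketch`, the `crcOf` helper, and the flushing policy are assumptions made for illustration.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical illustration only; not the Lucene implementation. */
public class BufferedCrcSketch implements Checksum {
    private final Checksum in;
    private final byte[] buffer;
    private int upto; // buffered bytes not yet passed to in.update()

    public BufferedCrcSketch(Checksum in, int bufferSize) {
        if (bufferSize <= 0) throw new IllegalArgumentException("bufferSize must be > 0");
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public void update(int b) {
        if (upto == buffer.length) flush();
        buffer[upto++] = (byte) b;
    }

    @Override
    public void update(byte[] b, int off, int len) {
        if (len >= buffer.length) {
            // Large input: flush what is pending, then bypass the buffer.
            flush();
            in.update(b, off, len);
        } else {
            if (upto + len > buffer.length) flush();
            System.arraycopy(b, off, buffer, upto, len);
            upto += len;
        }
    }

    @Override
    public long getValue() {
        flush();
        return in.getValue();
    }

    @Override
    public void reset() {
        upto = 0;
        in.reset();
    }

    // Hands all pending bytes to the wrapped checksum in a single call.
    private void flush() {
        if (upto > 0) in.update(buffer, 0, upto);
        upto = 0;
    }

    /** Feeds data one byte at a time, the worst case the buffer is meant to absorb. */
    static long crcOf(byte[] data, int bufferSize) {
        Checksum c = new BufferedCrcSketch(new CRC32(), bufferSize);
        for (byte x : data) c.update(x);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] data = new byte[5000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31);
        // The checksum value must not depend on the buffer size.
        System.out.println(crcOf(data, 256) == crcOf(data, 1024));
    }
}
```

The buffer size changes only how often the wrapped checksum is called, never the resulting value, so a larger default is purely a trade of memory for fewer calls.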

      Attachments

        1. LUCENE-9403.patch
          0.8 kB
          Robert Muir

          People

            Assignee: Unassigned
            Reporter: Robert Muir (rcmuir)
            Votes: 0
            Watchers: 3