  1. Apache Arrow
  2. ARROW-6216

[C++] Allow user to select the compression level

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: C++
    • Flags:
      Patch

      Description

      Arrow currently fixes the ZSTD compression level at 1, the lowest of the compressor's standard levels. This gives very high compression speed at the expense of compression ratio.

      The user should be able to select the compression level, since both speed and ratio depend on the data.

      The proposed solution is to expose the knob via an environment variable such as ARROW_ZSTD_COMPRESSION_LEVEL.
      Example:
      export ARROW_ZSTD_COMPRESSION_LEVEL=10
      ./my_parquet_app
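
A minimal sketch of how such an environment variable could be read on the C++ side. This is not the code that resolved the issue: the function names, the constants, and the choice to clamp rather than reject out-of-range values are all assumptions made for illustration. The bounds 1 and 22 are assumed here as well; real code would more likely query the zstd library (for example via ZSTD_minCLevel() and ZSTD_maxCLevel()) than hard-code them.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical names and constants; none of these come from the Arrow codebase.
constexpr int kDefaultZstdLevel = 1;  // the level the issue says Arrow uses today
constexpr int kMinZstdLevel = 1;      // assumed lower bound for this sketch
constexpr int kMaxZstdLevel = 22;     // assumed upper bound for this sketch

// Parse a compression level from `text`.
// Returns `fallback` if `text` is null, empty, or not a plain base-10 integer;
// otherwise returns the parsed value clamped to [min_level, max_level].
int ParseCompressionLevel(const char* text, int fallback, int min_level,
                          int max_level) {
  assert(min_level <= max_level);
  if (text == nullptr || *text == '\0') return fallback;

  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(text, &end, 10);
  // Reject overflow, input with no digits, and trailing garbage such as "5x".
  if (errno != 0 || end == text || *end != '\0') return fallback;

  if (value < min_level) return min_level;
  if (value > max_level) return max_level;
  return static_cast<int>(value);
}

// Read the level from the environment variable proposed in this issue.
int ZstdLevelFromEnvironment() {
  return ParseCompressionLevel(std::getenv("ARROW_ZSTD_COMPRESSION_LEVEL"),
                               kDefaultZstdLevel, kMinZstdLevel, kMaxZstdLevel);
}
```

With `export ARROW_ZSTD_COMPRESSION_LEVEL=10` set as in the example above, `ZstdLevelFromEnvironment()` would return 10; with the variable unset or malformed it would fall back to the current default of 1. The result would then be passed as the level argument to the compressor.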

      Here is a test run with compression levels 1, 2 and 5:
      Level   Time (s)   Size (MB)
      1       13.02      181
      2       13.10      177
      5       19.44      148

              People

              • Assignee: Martin Radev (martinradev)
              • Reporter: Martin Radev (martinradev)
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  • Original Estimate: 2h
                  • Remaining Estimate: 0h
                  • Time Spent: 13h 40m