Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- None
Description
Arrow currently fixes the ZSTD compression level at 1, the lowest of the compressor's standard levels. This yields very fast compression but sacrifices compression ratio.
Users should be able to select the compression level, since both speed and ratio depend on the data being compressed.
The proposed solution is to expose the knob via an environment variable such as ARROW_ZSTD_COMPRESSION_LEVEL.
Example:
export ARROW_ZSTD_COMPRESSION_LEVEL=10
./my_parquet_app
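A minimal sketch of how such a knob could be read, written in Python for brevity rather than in Arrow's C++. The helper name resolve_zstd_level, the fallback to level 1 on unset or invalid values, and the 1 to 22 bounds are assumptions made for illustration, not Arrow's actual behavior.

```python
import os

ENV_VAR = "ARROW_ZSTD_COMPRESSION_LEVEL"  # name proposed in this issue
DEFAULT_LEVEL = 1  # the level Arrow currently uses, per the description
MIN_LEVEL, MAX_LEVEL = 1, 22  # assumed bounds: zstd's standard level range


def resolve_zstd_level(environ=None):
    """Hypothetical helper: return the level requested via the environment.

    Falls back to DEFAULT_LEVEL when the variable is unset, is not an
    integer, or lies outside the assumed [MIN_LEVEL, MAX_LEVEL] range.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get(ENV_VAR)
    if raw is None:
        return DEFAULT_LEVEL
    try:
        level = int(raw.strip())
    except ValueError:
        return DEFAULT_LEVEL
    if not MIN_LEVEL <= level <= MAX_LEVEL:
        return DEFAULT_LEVEL
    return level
```

Falling back silently keeps existing applications working when the variable is mistyped; a real implementation might prefer to log a warning or raise an error instead.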
Here is a test run with compression levels of 1, 2 and 5:
Level  Time (s)  Size (MB)
1      13.02     181
2      13.10     177
5      19.44     148
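A comparable measurement can be sketched for any data set. The example below is an illustration only: it uses zlib from the Python standard library as a stand-in compressor so that it runs without extra dependencies, so its numbers are not comparable to the ZSTD results above. The function name measure_levels and the sample data are hypothetical.

```python
import time
import zlib


def measure_levels(data, levels):
    """Return a (level, seconds, compressed_bytes) tuple for each level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)  # stand-in for a zstd call
        elapsed = time.perf_counter() - start
        # Sanity check: the compressed output must round-trip to the input.
        assert zlib.decompress(compressed) == data
        results.append((level, elapsed, len(compressed)))
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; real data will behave differently.
    sample = b"arrow parquet zstd compression level " * 50_000
    print("Level  Time (s)  Size (bytes)")
    for level, seconds, size in measure_levels(sample, [1, 2, 5]):
        print(f"{level:<5}  {seconds:<8.4f}  {size}")
```

Running the same loop over representative application data is what shows whether a higher level is worth its extra time for that workload.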
Issue Links
- is related to ARROW-2290 [C++/Python] Add ability to set codec options for lz4 codec (Closed)