Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Impala 4.3.0
Description
ORC-639 notes that ZSTD's simple interface initializes a new decompression context on every call to ZSTD_decompress(). When ZSTD_decompress() is called many times, it is better to allocate the context once and use the ZSTD_decompressDCtx() interface, which avoids the repeated initialization.
The ZSTD code mentions that here:
/*= Decompression context
 *  When decompressing many times,
 *  it is recommended to allocate a context only once,
 *  and re-use it for each successive compression operation.
 *  This will make workload friendlier for system's memory.
 *  Use one context per thread for parallel execution. */
typedef struct ZSTD_DCtx_s ZSTD_DCtx;
We should investigate using this for ZstandardDecompressor in decompress.h/.cc. The streaming decompression mode already reuses a context, but block decompression should do the same. Something similar is possible for compression as well.