IMPALA-12076

Potential performance improvement using ZSTD's ZSTD_decompressDCtx interface

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 4.3.0
    • Fix Version/s: Impala 4.4.0
    • Component/s: Backend
    • Labels: None

    Description

      ORC-639 notes that ZSTD's simple interface initializes a decompression context on every call to ZSTD_decompress(). When ZSTD_decompress() is called many times, it is better to allocate the context once and reuse it through the ZSTD_decompressDCtx() interface, which avoids the repeated initialization.

      The ZSTD source makes the same recommendation in its comments:


      /*= Decompression context
       *  When decompressing many times,
       *  it is recommended to allocate a context only once,
       *  and re-use it for each successive compression operation.
       *  This will make workload friendlier for system's memory.
       *  Use one context per thread for parallel execution. */
      typedef struct ZSTD_DCtx_s ZSTD_DCtx;

      We should investigate using this in ZstandardDecompressor (decompress.h/.cc). The streaming decompression mode already reuses a context, but block decompression should benefit from the same approach. Something similar is possible for compression as well.

       

          People

            Assignee: Joe McDonnell (joemcdonnell)
            Reporter: Joe McDonnell (joemcdonnell)