IMPALA-3662

Reduce parquet scanner memory usage

      Description

      After IMPALA-2736, peak memory consumption increased, mostly due to the Parquet scanner.
      In most cases the Parquet scanner buffers more batches than it needs.

      In the attached profiles, scanner memory increases from 2.17 GB to 3.3 GB (roughly 52%).

      Workarounds
      The following query options may help to reduce scanner memory consumption:

      • Reduce the number of scanner threads (set num_scanner_threads=30)
      • Reduce the batch size (set batch_size=512)

      Increasing the memory limit (the mem_limit query option) may also help.
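
The workarounds above can be collected into a small script that prints the SET statements to paste into impala-shell before running the query. This is only a sketch: the helper name `workaround_statements` is hypothetical, the defaults 30 and 512 are the values suggested in this issue, and any `mem_limit` value (such as `8g` below) is an illustrative assumption, not a recommendation from the report.

```python
def workaround_statements(num_scanner_threads=30, batch_size=512, mem_limit=None):
    """Build SET statements for the query options listed in the workarounds.

    The defaults are the values suggested in the issue; mem_limit is optional
    and left unset unless the caller supplies a value (hypothetical, e.g. "8g").
    """
    if num_scanner_threads < 0:
        raise ValueError("num_scanner_threads must be non-negative")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    statements = [
        f"SET NUM_SCANNER_THREADS={num_scanner_threads};",
        f"SET BATCH_SIZE={batch_size};",
    ]
    if mem_limit is not None:
        statements.append(f"SET MEM_LIMIT={mem_limit};")
    return statements


if __name__ == "__main__":
    # Print the statements so they can be pasted into an impala-shell session.
    print("\n".join(workaround_statements(mem_limit="8g")))
```

Query options set this way apply only to the current session, so the statements need to be issued again in each new session where the workaround is wanted.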

        Attachments

        1. tpch_q14_2.6.txt
          281 kB
          Mostafa Mokhtar
        2. tpch_q14_2.5.txt
          278 kB
          Mostafa Mokhtar

            People

            • Assignee:
              kwho Michael Ho
              Reporter:
              mmokhtar Mostafa Mokhtar
