IMPALA-13075

Setting very high BATCH_SIZE can blow up memory usage of fragments

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.0.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels: None

    Description

      In Impala 4.0, setting BATCH_SIZE very high, at or near the maximum of 65536, can cause a fragment's memory usage to spike far beyond the query's MEM_LIMIT, or beyond the pool's Maximum Query Memory Limit when Clamp is enabled. As a result, even with a reasonable MEM_LIMIT, the query can fail with a "memory limit exceeded" error after a single fragment consumes a very large amount of memory. In the runs below, with MEM_LIMIT=1g, the HASH JOIN node peaks at 7.57 GB with BATCH_SIZE=65536 but at only 123.48 MB with the default batch size. Reducing BATCH_SIZE to a moderate value, or resetting it to the default, lets the query complete while staying within the query's MEM_LIMIT or the pool's Maximum Query Memory Limit.

       

      1) set BATCH_SIZE=65536; set MEM_LIMIT=1g;

       

          Query State: EXCEPTION
          Impala Query State: ERROR
          Query Status: Memory limit exceeded: Error occurred on backend ...:27000 by fragment ...
          Memory left in process limit: 145.53 GB
          Memory left in query limit: -6.80 GB
          Query(...): memory limit exceeded. Limit=1.00 GB Reservation=86.44 MB ReservationLimit=819.20 MB OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB
            Unclaimed reservations: Reservation=8.50 MB OtherMemory=0 Total=8.50 MB Peak=56.44 MB
            Runtime Filter Bank: Reservation=4.00 MB ReservationLimit=4.00 MB OtherMemory=0 Total=4.00 MB Peak=4.00 MB
            Fragment ...: Reservation=1.94 MB OtherMemory=7.59 GB Total=7.59 GB Peak=7.63 GB
              HASH_JOIN_NODE (id=8): Reservation=1.94 MB OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB
                Exprs: Total=7.57 GB Peak=7.57 GB
                Hash Join Builder (join_node_id=8): Total=0 Peak=1.95 MB
      ...
          Query Options (set by configuration): BATCH_SIZE=65536,MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan  9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
      ...
          ExecSummary:
      ...
      09:AGGREGATE                    32     32    0.000ns    0.000ns        0       4.83M   36.31 MB      212.78 MB  STREAMING                                 
      08:HASH JOIN                    32     32    5s149ms      2m44s        0     194.95M    7.57 GB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
      |--18:EXCHANGE                  32     32   93.750us    1.000ms   10.46K       1.55K    1.65 MB        2.56 MB  HASH(...
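
To see which tracker accounts for the memory in the error above, the dump can be parsed mechanically. The following is a minimal sketch, not part of the original report: the helper names (`to_bytes`, `parse_trackers`, `largest_leaf`) are hypothetical, the dump text is an abbreviated copy of the status message above, and the units are assumed to be 1024-based.

```python
import re

# Abbreviated copy of the memory tracker dump from run 1 (values from the report).
DUMP = """\
Query(...): memory limit exceeded. Limit=1.00 GB Reservation=86.44 MB ReservationLimit=819.20 MB OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB
  Unclaimed reservations: Reservation=8.50 MB OtherMemory=0 Total=8.50 MB Peak=56.44 MB
  Runtime Filter Bank: Reservation=4.00 MB ReservationLimit=4.00 MB OtherMemory=0 Total=4.00 MB Peak=4.00 MB
  Fragment ...: Reservation=1.94 MB OtherMemory=7.59 GB Total=7.59 GB Peak=7.63 GB
    HASH_JOIN_NODE (id=8): Reservation=1.94 MB OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB
      Exprs: Total=7.57 GB Peak=7.57 GB
      Hash Join Builder (join_node_id=8): Total=0 Peak=1.95 MB
"""

# Assumption: the dump uses binary (1024-based) units.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
TOTAL_RE = re.compile(r"Total=([\d.]+)(?: ([KMG]?B))?")


def to_bytes(value, unit):
    """Convert a number and an optional unit (None means bytes) to a byte count."""
    return float(value) * UNITS[unit or "B"]


def parse_trackers(dump):
    """Return a list of (indent, name, total_bytes), one per tracker line."""
    rows = []
    for line in dump.splitlines():
        match = TOTAL_RE.search(line)
        if not match:
            continue
        indent = len(line) - len(line.lstrip())
        name = line.strip().split(":", 1)[0]
        rows.append((indent, name, to_bytes(match.group(1), match.group(2))))
    return rows


def largest_leaf(rows):
    """Return the leaf tracker (one with no deeper line directly below it) with the largest total."""
    leaves = []
    for i, row in enumerate(rows):
        is_last = i + 1 == len(rows)
        if is_last or rows[i + 1][0] <= row[0]:
            leaves.append(row)
    return max(leaves, key=lambda r: r[2])


trackers = parse_trackers(DUMP)
query_total = trackers[0][2]
culprit = largest_leaf(trackers)
share = culprit[2] / query_total
print(f"{culprit[1]} holds {share:.0%} of the query total")
```

On this dump the sketch points at the `Exprs` tracker under `HASH_JOIN_NODE (id=8)`, which is consistent with the 7.57 GB peak shown for 08:HASH JOIN in the ExecSummary.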
      

       

       

      2) set BATCH_SIZE=0; set MEM_LIMIT=1g;

       

          Query State: FINISHED
          Impala Query State: FINISHED
      ...
          Query Options (set by configuration and planner): MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan  9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
      ...
          ExecSummary:
      ...
      09:AGGREGATE                    32     32  593.748us   18.999ms       45       4.83M    34.06 MB      212.78 MB  STREAMING
      08:HASH JOIN                    32     32   10s873ms      5m47s   10.47K     194.95M   123.48 MB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
      |--18:EXCHANGE                  32     32    0.000ns    0.000ns   10.46K       1.55K   344.00 KB        1.69 MB  HASH(...
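
The two runs can be compared with a little arithmetic. The sketch below uses only figures quoted in this report, plus one fact taken from the Impala documentation rather than from the report: BATCH_SIZE=0 selects the built-in default of 1024 rows. The variable names are illustrative, and two data points suggest, but do not establish, that the hash join's expression memory scales with the batch size.

```python
DEFAULT_BATCH_SIZE = 1024  # BATCH_SIZE=0 means the built-in default (per Impala docs)

# Peak memory of 08:HASH JOIN from the two ExecSummary excerpts, in MB.
peak_mb_large_batch = 7.57 * 1024   # 7.57 GB with BATCH_SIZE=65536
peak_mb_default_batch = 123.48      # 123.48 MB with BATCH_SIZE=0

batch_ratio = 65536 / DEFAULT_BATCH_SIZE
mem_ratio = peak_mb_large_batch / peak_mb_default_batch

# Cross-check of "Memory left in query limit" from the error message of run 1.
limit_gb = 1.00
query_total_gb = 7.80
left_in_query_limit_gb = round(limit_gb - query_total_gb, 2)

print(f"batch size ratio:        {batch_ratio:.0f}x")
print(f"hash join peak ratio:    {mem_ratio:.1f}x")
print(f"left in query limit:     {left_in_query_limit_gb} GB")
```

The batch size grows by a factor of 64 while the hash join's peak memory grows by a factor of roughly 63, and the computed headroom matches the -6.80 GB reported in the error.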
      

       

          People

            rizaon Riza Suminto
            ezerihun Ezra Zerihun