Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3200 Replace BufferedBlockMgr with new buffer pool
  3. IMPALA-2708

Partitioned aggregation node repartitions when spilled partition could fit in memory

    XMLWordPrintableJSON

Details

    Description

      The partitioned aggregation node always repartitions spilled partitions. This often doesn't make sense, because if only a small number of partitions were spilled, it's likely that a single partition will fit easily in memory. Instead it should check to see if the partition is likely to fit in memory and if so, just pin aggregated_row_stream, rebuild the hash table, and reprocess unaggregated_row_stream. The partitioned hash join node already does the equivalent thing.

      Changing this would improve performance for spilled aggregations by avoiding unnecessary repartitioning. It would also solve a corner case where Impala gives up on repartitioning despite the partition fitting in memory (see IMPALA-2676).

      There is a TODO in the code in PartitionedAggregationNode::NextPartition() but creating a JIRA to track the issue.

      Attachments

        Issue Links

          Activity

            People

              tarmstrong Tim Armstrong
              tarmstrong Tim Armstrong
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: