  1. IMPALA
  2. IMPALA-3200 Replace BufferedBlockMgr with new buffer pool
  3. IMPALA-2708

Partitioned aggregation node repartitions when spilled partition could fit in memory

      Description

      The partitioned aggregation node always repartitions spilled partitions. This often doesn't make sense: if only a small number of partitions were spilled, a single spilled partition is likely to fit easily in memory. Instead, the node should check whether the partition is likely to fit in memory and, if so, just pin aggregated_row_stream, rebuild the hash table, and reprocess unaggregated_row_stream. The partitioned hash join node already does the equivalent.
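The proposed check could be sketched roughly as follows. This is a minimal illustration and not Impala's actual code: SpilledPartitionInfo, SpillAction, EstimateInMemoryBytes, ChooseAction, and the per-row hash table cost are all hypothetical names and values made up for this example. The estimate only covers the rows already in aggregated_row_stream, so it is optimistic: reprocessing unaggregated_row_stream can still add new groups, which is why the partition is only "likely" to fit.

```cpp
#include <cstdint>

// Hypothetical summary of a spilled partition. These fields are illustrative
// and do not mirror the real Partition struct in PartitionedAggregationNode.
struct SpilledPartitionInfo {
  int64_t aggregated_bytes;    // bytes held by aggregated_row_stream
  int64_t aggregated_rows;     // rows in aggregated_row_stream
  int64_t unaggregated_bytes;  // bytes in unaggregated_row_stream (streamed, not pinned)
};

enum class SpillAction { kRebuildInMemory, kRepartition };

// Assumed fixed hash table overhead per aggregated row (made-up value).
constexpr int64_t kAssumedHashTableBytesPerRow = 16;

// Estimate the memory needed to pin the aggregated rows and rebuild the hash
// table over them. The unaggregated stream is deliberately excluded because
// it would be re-read and aggregated row by row rather than pinned.
int64_t EstimateInMemoryBytes(const SpilledPartitionInfo& p) {
  return p.aggregated_bytes + p.aggregated_rows * kAssumedHashTableBytesPerRow;
}

// Rebuild in memory when the estimate fits in the available memory;
// otherwise fall back to repartitioning, which is the current behaviour.
SpillAction ChooseAction(const SpilledPartitionInfo& p, int64_t available_bytes) {
  return EstimateInMemoryBytes(p) <= available_bytes
             ? SpillAction::kRebuildInMemory
             : SpillAction::kRepartition;
}
```

If the in-memory rebuild still runs out of memory while reprocessing the unaggregated rows, a real implementation would presumably need to fall back to repartitioning anyway.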

      Changing this would improve performance for spilled aggregations by avoiding unnecessary repartitioning. It would also solve a corner case where Impala gives up on repartitioning despite the partition fitting in memory (see IMPALA-2676).

      There is already a TODO for this in PartitionedAggregationNode::NextPartition(), but I am creating this JIRA to track the issue.

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                tarmstrong Tim Armstrong