  1. Apache Arrow
  2. ARROW-6689 [Rust] [DataFusion] Query execution enhancements for 1.0.0 release
  3. ARROW-6690

[Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD

    Details

      Description

      Currently, the HashAggregate implementation in the new physical plan uses the same logic whether or not the query has a grouping expression.

      When there is no grouping expression, as in "SELECT SUM(a) FROM b", we can use the compute kernels to aggregate each batch as a whole rather than iterating over each row and accumulating individual values.
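
      To make the difference concrete, here is a minimal, dependency-free sketch of the two strategies. It is an illustration under stated assumptions, not DataFusion's actual code: `Int64Column`, `sum_row_by_row`, `sum_batch`, and `sum_without_group_by` are hypothetical names, and `Int64Column` only loosely models an Arrow integer array (contiguous values plus an optional validity mask). In the real implementation the role of `sum_batch` would be played by an Arrow aggregate kernel.

```rust
/// Hypothetical stand-in for one Int64 column of a record batch:
/// contiguous values plus an optional validity mask (None = no NULLs).
struct Int64Column {
    values: Vec<i64>,
    validity: Option<Vec<bool>>,
}

/// Row-at-a-time accumulation: every row is inspected individually and
/// folded into a single accumulator, as a generic grouping path would do.
fn sum_row_by_row(batches: &[Int64Column]) -> Option<i64> {
    let mut acc: Option<i64> = None;
    for col in batches {
        for row in 0..col.values.len() {
            let is_valid = col.validity.as_ref().map_or(true, |v| v[row]);
            if is_valid {
                acc = Some(acc.unwrap_or(0) + col.values[row]);
            }
        }
    }
    acc
}

/// Batch-level "kernel" (assumed shape): reduces a whole column at once.
/// The no-NULL path is a tight loop over contiguous memory, which is the
/// kind of loop a compiler or a hand-written SIMD kernel can vectorize.
/// Returns None when the column has no valid values (SQL SUM semantics).
fn sum_batch(col: &Int64Column) -> Option<i64> {
    match &col.validity {
        None if col.values.is_empty() => None,
        None => Some(col.values.iter().sum::<i64>()),
        Some(valid) => {
            let mut total = 0i64;
            let mut seen = false;
            for (v, ok) in col.values.iter().zip(valid) {
                if *ok {
                    total += *v;
                    seen = true;
                }
            }
            if seen { Some(total) } else { None }
        }
    }
}

/// Aggregate without GROUP BY: one kernel call per batch, then combine
/// the per-batch partial sums into the final result.
fn sum_without_group_by(batches: &[Int64Column]) -> Option<i64> {
    batches
        .iter()
        .filter_map(sum_batch)
        .fold(None::<i64>, |acc: Option<i64>, s: i64| Some(acc.unwrap_or(0) + s))
}
```

      Both functions return the same result; the batch-level version simply does one accumulator update per batch instead of one per row, leaving the inner loop to the kernel.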

      This optimization already exists in the original implementation, which executes aggregate queries directly from the logical plan.

              People

              • Assignee: Andy Grove (andygrove)
              • Reporter: Andy Grove (andygrove)

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m