Apache Arrow / ARROW-6689 [Rust] [DataFusion] Query execution enhancements for 1.0.0 release / ARROW-6690

[Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD


Details

    Description

      Currently the implementation of HashAggregate in the new physical plan uses the same logic regardless of whether a grouping expression is used.

      When there is no grouping expression, as in "SELECT SUM(a) FROM b", we can use the compute kernels to aggregate each batch as a whole rather than iterating over each row and accumulating individual values.
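
      The difference between the two strategies can be sketched in plain Rust. This is only an illustration under stated assumptions: `Batch`, `sum_row_by_row`, `sum_kernel` and `sum_by_batch` are hypothetical names, and `sum_kernel` is a stand-in for an Arrow aggregate compute kernel, not the DataFusion code itself.

```rust
/// Hypothetical stand-in for one column of a record batch: nullable i64 values.
type Batch = Vec<Option<i64>>;

/// Row-at-a-time accumulation: the optional accumulator is inspected and
/// updated once per row, which is what the generic grouped path does.
fn sum_row_by_row(batches: &[Batch]) -> Option<i64> {
    let mut acc: Option<i64> = None;
    for batch in batches {
        for value in batch {
            if let Some(v) = value {
                acc = Some(acc.unwrap_or(0) + v);
            }
        }
    }
    acc
}

/// Stand-in for a sum compute kernel (an assumption, not the Arrow API):
/// reduces a whole batch in one tight loop and returns None if every
/// value in the batch is null.
fn sum_kernel(batch: &[Option<i64>]) -> Option<i64> {
    let mut seen = false;
    let mut total = 0i64;
    for v in batch.iter().flatten() {
        total += *v;
        seen = true;
    }
    if seen { Some(total) } else { None }
}

/// Batch-at-a-time aggregation: call the kernel once per batch, then
/// combine the per-batch partial sums into the final result.
fn sum_by_batch(batches: &[Batch]) -> Option<i64> {
    batches
        .iter()
        .filter_map(|b| sum_kernel(b))
        .fold(None, |acc: Option<i64>, s| Some(acc.unwrap_or(0) + s))
}
```

      Both functions return the same result for the same input. The batch-oriented version touches the accumulator once per batch instead of once per row, and the tight inner loop is the part that a real kernel can vectorize.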

      This optimization already exists in the original implementation, which executes aggregate queries directly from the logical plan.

          People

            Assignee: Andy Grove (andygrove)
            Reporter: Andy Grove (andygrove)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 50m
