[ARROW-6690] [Rust] [DataFusion] HashAggregate without GROUP BY should use SIMD - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.16.0
Component/s: Rust, Rust - DataFusion
Labels:
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/23037

Description

Currently, the implementation of HashAggregate in the new physical plan uses the same logic regardless of whether a grouping expression is present.

When there is no grouping expression, as in "SELECT SUM(a) FROM b", we can use the compute kernels to aggregate each batch in a single operation rather than iterating over each row and accumulating individual values.

This optimization already exists in the original implementation of aggregate queries, which executes directly from the logical plan.
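
To illustrate the difference between the two strategies, here is a minimal, dependency-free sketch. It is not the DataFusion code: a batch is modelled as a `Vec<Option<i64>>` (with `None` standing for a null), and `sum_row_by_row`, `sum_kernel`, and `sum_per_batch` are hypothetical names chosen for this example. `sum_kernel` stands in for an Arrow compute kernel; whether its loop is actually vectorized depends on the compiler and target.

```rust
/// Row-at-a-time strategy: visit every row of every batch and update a
/// running accumulator, branching on nulls as we go.
fn sum_row_by_row(batches: &[Vec<Option<i64>>]) -> Option<i64> {
    let mut acc: Option<i64> = None;
    for batch in batches {
        for value in batch {
            if let Some(v) = value {
                acc = Some(acc.unwrap_or(0) + *v);
            }
        }
    }
    acc
}

/// Hypothetical stand-in for a compute kernel: reduce one whole batch in a
/// single call. Returns `None` when the batch has no non-null values.
fn sum_kernel(batch: &[Option<i64>]) -> Option<i64> {
    if batch.iter().all(|v| v.is_none()) {
        return None;
    }
    // Nulls contribute 0, so the reduction is one tight loop over the batch.
    Some(batch.iter().map(|v| v.unwrap_or(0)).sum())
}

/// Per-batch strategy: call the kernel once per batch, then combine the
/// (few) partial sums into the final result.
fn sum_per_batch(batches: &[Vec<Option<i64>>]) -> Option<i64> {
    batches
        .iter()
        .filter_map(|batch| sum_kernel(batch))
        .fold(None, |acc: Option<i64>, partial| {
            Some(acc.unwrap_or(0) + partial)
        })
}
```

Both functions return the same result for the same input. The per-batch version only touches the accumulator once per batch, which is the property the no-GROUP-BY path described above relies on.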

Issue Links

links to: GitHub Pull Request #5606

People

Assignee: Andy Grove
Reporter: Andy Grove
Votes: 0
Watchers: 3

Dates

Created: 25/Sep/19 13:58
Updated: 11/Jan/23 07:48
Resolved: 12/Oct/19 16:35

Time Tracking

Estimated: Not Specified
Remaining:
Logged: 50m