[ARROW-5002] [C++] Implement Hash Aggregation query execution node - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 6.0.0
Component/s: C++
Labels:
- query-engine

External issue URL:
https://github.com/apache/arrow/issues/21501

Description

Dear all,

I wonder what the best way forward is for implementing GroupBy kernels. Initially this was part of https://issues.apache.org/jira/browse/ARROW-4124 but is not contained in the current implementation as far as I can tell.

It seems that the part of group by that just returns indices could be conveniently implemented with the HashKernel. That seems useful in any case. Is that indeed the best way forward, and should this be done?
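
To make the "group by that just returns indices" idea concrete, here is a minimal sketch using only the C++ standard library. It is not the Arrow HashKernel API: `GroupIds` and `ComputeGroupIds` are hypothetical names invented for illustration, and string keys are assumed for simplicity. It assigns each row a dense group id in order of first appearance, which is roughly what a dictionary-encode style hash kernel would produce.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type (not an Arrow class): one dense group id per input
// row, plus the distinct keys in order of first appearance.
struct GroupIds {
  std::vector<int32_t> ids;          // ids[i] is the group of row i
  std::vector<std::string> uniques;  // uniques[g] is the key of group g
};

// Assign group ids with a hash table in a single pass over the keys.
// Assumption: keys are strings; a real kernel would be typed per key column.
GroupIds ComputeGroupIds(const std::vector<std::string>& keys) {
  GroupIds out;
  out.ids.reserve(keys.size());
  std::unordered_map<std::string, int32_t> table;
  for (const auto& key : keys) {
    // Candidate id for a key that has not been seen yet.
    const auto next_id = static_cast<int32_t>(out.uniques.size());
    auto [it, inserted] = table.emplace(key, next_id);
    if (inserted) out.uniques.push_back(key);
    out.ids.push_back(it->second);
  }
  return out;
}
```

For the keys `a, b, a, c, b` this yields the ids `0, 1, 0, 2, 1` and the unique keys `a, b, c`. The ids can then feed any downstream aggregation without hashing the keys again.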

GroupBy + Aggregate could then be implemented in one of two ways: either by combining those group indices with the Take kernel and an aggregation, which involves more memory copies than necessary, or as part of the aggregate kernel itself. The latter is probably preferred; any thoughts on that?
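
The trade-off between the two options can be sketched for a grouped sum as follows. This is an illustration under stated assumptions, not Arrow code: `SumViaTake` and `SumFused` are hypothetical names, the input is assumed to be a vector of dense group ids (one per row) plus an `int64_t` value column, and the "take" step is emulated by an explicit gather into a temporary vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Option 1 (hypothetical): gather each group's values into a temporary
// buffer, mimicking a Take kernel, then run a plain sum over that buffer.
std::vector<int64_t> SumViaTake(const std::vector<int32_t>& group_ids,
                                const std::vector<int64_t>& values,
                                std::size_t num_groups) {
  assert(group_ids.size() == values.size());
  // Row indices belonging to each group.
  std::vector<std::vector<std::size_t>> rows(num_groups);
  for (std::size_t i = 0; i < group_ids.size(); ++i) {
    rows[static_cast<std::size_t>(group_ids[i])].push_back(i);
  }
  std::vector<int64_t> sums(num_groups, 0);
  for (std::size_t g = 0; g < num_groups; ++g) {
    std::vector<int64_t> taken;  // the extra copy this option pays for
    taken.reserve(rows[g].size());
    for (std::size_t r : rows[g]) taken.push_back(values[r]);
    for (int64_t v : taken) sums[g] += v;
  }
  return sums;
}

// Option 2 (hypothetical): fold the grouping into the aggregation itself and
// update one accumulator per group in a single pass, with no gathered copies.
std::vector<int64_t> SumFused(const std::vector<int32_t>& group_ids,
                              const std::vector<int64_t>& values,
                              std::size_t num_groups) {
  assert(group_ids.size() == values.size());
  std::vector<int64_t> sums(num_groups, 0);
  for (std::size_t i = 0; i < group_ids.size(); ++i) {
    sums[static_cast<std::size_t>(group_ids[i])] += values[i];
  }
  return sums;
}
```

Both functions return the same sums. The first materializes the per-group index lists and a copy of every value, while the second touches each row once and only allocates the output, which is the reason for preferring an aggregation-side implementation.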

Am I missing any other JIRAs related to this?

Best, Philipp.

Issue Links

blocks

ARROW-3120 [C++] Parallelize execution of ScalarAggregateFunction

Open

depends upon

ARROW-12759 [C++][Compute] Wrap grouped aggregation in an ExecNode

Resolved

is duplicated by

ARROW-6733 cpp implementation for hash aggregation and binding in python

Closed

People

Assignee: Unassigned

Reporter: Philipp Moritz

Votes: 0

Watchers: 11

Dates

Created: 25/Mar/19 02:05

Updated: 11/Jan/23 07:37

Resolved: 03/Aug/21 18:13