Apache Arrow / ARROW-5002

[C++] Implement Hash Aggregation query execution node

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 6.0.0
    • Component/s: C++

    Description

      Dear all,

      I wonder what the best way forward is for implementing GroupBy kernels. Initially this was part of

      https://issues.apache.org/jira/browse/ARROW-4124

      but is not contained in the current implementation as far as I can tell.

      It seems that the part of group by that just returns indices could be conveniently implemented with the HashKernel, which would be useful in any case. Is that indeed the best way forward, and should it be done?
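
      To make the "group by that just returns indices" idea concrete, here is a minimal sketch in plain C++ with no Arrow dependency. It assumes string keys and uses a std::unordered_map as a stand-in for a hash kernel; the names GroupIds and AssignGroupIds are hypothetical and not part of any Arrow API.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type (not an Arrow type): one dense group id per
// input row, plus the distinct keys in order of first appearance.
struct GroupIds {
  std::vector<int32_t> ids;
  std::vector<std::string> unique_keys;
};

// Assigns each row the id of its key's group. Ids are handed out in the
// order in which distinct keys are first seen.
GroupIds AssignGroupIds(const std::vector<std::string>& keys) {
  GroupIds out;
  out.ids.reserve(keys.size());
  std::unordered_map<std::string, int32_t> seen;  // key -> group id
  for (const std::string& key : keys) {
    const auto next_id = static_cast<int32_t>(out.unique_keys.size());
    // emplace leaves the map untouched if the key is already present.
    auto [it, inserted] = seen.emplace(key, next_id);
    if (inserted) out.unique_keys.push_back(key);
    out.ids.push_back(it->second);
  }
  return out;
}
```

      Calling AssignGroupIds on the keys a, b, a, c, b would give each row the id of the group it belongs to, which is the information a later aggregation step needs.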

      GroupBy + Aggregate could then be implemented in one of two ways: either by combining the group indices with the Take kernel and a separate aggregation step, which involves more memory copies than necessary, or as part of the aggregate kernel itself. The latter is probably preferable; any thoughts on that?
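
      The trade-off between the two options can be sketched as follows, again in plain C++ rather than with Arrow kernels. The sketch assumes string keys, int64 values, and a sum aggregation; GroupedSums, SumViaTake, and SumFused are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type: sums[i] is the sum for keys[i].
struct GroupedSums {
  std::vector<std::string> keys;  // distinct keys, first-appearance order
  std::vector<int64_t> sums;
};

// Option 1: collect row indices per group, gather ("take") the values of
// each group into a temporary buffer, then aggregate that buffer.
GroupedSums SumViaTake(const std::vector<std::string>& keys,
                       const std::vector<int64_t>& values) {
  assert(keys.size() == values.size());
  GroupedSums out;
  std::unordered_map<std::string, std::size_t> group_of;
  std::vector<std::vector<std::size_t>> rows_per_group;
  for (std::size_t row = 0; row < keys.size(); ++row) {
    auto [it, inserted] = group_of.emplace(keys[row], out.keys.size());
    if (inserted) {
      out.keys.push_back(keys[row]);
      rows_per_group.emplace_back();
    }
    rows_per_group[it->second].push_back(row);
  }
  for (const auto& rows : rows_per_group) {
    std::vector<int64_t> taken;  // extra copy of every value in the group
    taken.reserve(rows.size());
    for (std::size_t row : rows) taken.push_back(values[row]);
    int64_t sum = 0;
    for (int64_t v : taken) sum += v;
    out.sums.push_back(sum);
  }
  return out;
}

// Option 2: fuse grouping and aggregation. Each value is added to its
// group's accumulator as soon as the group id is known, so no row-index
// lists or gathered buffers are materialized.
GroupedSums SumFused(const std::vector<std::string>& keys,
                     const std::vector<int64_t>& values) {
  assert(keys.size() == values.size());
  GroupedSums out;
  std::unordered_map<std::string, std::size_t> group_of;
  for (std::size_t row = 0; row < keys.size(); ++row) {
    auto [it, inserted] = group_of.emplace(keys[row], out.keys.size());
    if (inserted) {
      out.keys.push_back(keys[row]);
      out.sums.push_back(0);
    }
    out.sums[it->second] += values[row];
  }
  return out;
}
```

      Both functions return the same result for the same input; the fused version only keeps one accumulator per group, while the take-based version also holds the per-group row indices and a gathered copy of the values.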

      Am I missing any other JIRAs related to this?

      Best, Philipp.

            People

              Assignee: Unassigned
              Reporter: Philipp Moritz (pcmoritz)
              Votes: 0
              Watchers: 11
