Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-16396

Calcite engine. Allow hash output distribution for aggregations

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • None
    • None
    • None

    Description

      Currently, we allow only single output distribution for aggregates, but looks like if we have hash input distribution and all grouping set contains all of the distribution keys we can make aggregation on remote nodes and produce hash output distribution with the same keys. This will reduce memory consumption on the initiator node and make some other optimizations possible.

      For example, query:

      SELECT t1.aff_key, t2.cnt FROM t1 JOIN (SELECT aff_key, COUNT(*) AS cnt FROM t2 GROUP BY id) AS t2 ON t1.aff_key = t2.aff_key

      Can do colocated join if both tables are colocated on aff_key. Currently, such a query does join on the initiator node.

      The same for set-ops (EXCEPT, INTERSECT).

      Attachments

        Issue Links

          Activity

            People

              alex_pl Aleksey Plekhanov
              alex_pl Aleksey Plekhanov
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 50m
                  50m