IMPALA-2167

Remove the old (unpartitioned) HJ and AGG nodes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2
    • Fix Version/s: Impala 2.10.0
    • Component/s: Backend
    • Labels:
      None

      Description

      Currently we maintain two versions of the hash-based aggregations and joins: the old unpartitioned ones (AGG and HJ) and the partitioned, spillable ones (PAGG and PHJ). The main reason we kept the old versions was the additional memory that PAGG and PHJ consume in small-ish aggregations and joins.

      But maintaining this extra code is cumbersome, error-prone, and tricky to test. For example, the new PHJ supports functionality (join modes) that the old HJ does not, which means that sometimes we still use PHJ even when it is disabled; see IMPALA-1751.

      If we manage to make PAGG and PHJ consume no more memory than their unpartitioned counterparts on small-ish inputs (or only a few MBs more), then there is no reason to keep the old AGG and HJ nodes around.

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                ippokratis Ippokratis Pandis
              • Votes:
                0
                Watchers:
                5

                Dates

                • Created:
                  Updated:
                  Resolved: