IMPALA-3524: Spilling joins unnecessarily process spilled partitions with 0 probe rows

    Details

      Description

      For many join modes, if the spilled partition has no probe rows, we don't need to build the hash table. Currently we always do.

          Activity

          tarmstrong Tim Armstrong added a comment -

          This is probably more useful now with runtime filters - a large build side and small probe side could be the result of effective filters, not just a bad plan.

          twmarshall Thomas Tauber-Marshall added a comment -

          commit 6a9df540967e07b09524268d0cc52b7d10835676
          Author: Thomas Tauber-Marshall <tmarshall@cloudera.com>
          Date: Mon Dec 5 15:37:06 2016 -0800

          IMPALA-3524: Don't process spilled partitions with 0 probe rows

          In the partitioned hash join node, if a spilled partition has no probe
          rows, building the hash table is unnecessary.

          For some build types (right outer, right anti, and full outer), we still
          need to process the build side to output unmatched rows (in this case, all
          rows since there were no probe rows to match).

          Testing: Added some cases to spilling.test. Manually tested these cases
          for performance, and they all show around a 6% improvement.

          Change-Id: I175b32dd9031e51218b38c37693ac3e31dfab47b
          Reviewed-on: http://gerrit.cloudera.org:8080/5389
          Reviewed-by: Jim Apple <jbapple-impala@apache.org>
          Tested-by: Impala Public Jenkins
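
The decision described in the commit message can be sketched roughly as below. This is a simplified illustration, not the actual Impala code: `JoinOp`, `SpilledPartition`, `Action`, `PlanSpilledPartition`, and the other names are hypothetical. The commit message only names the three modes that still need the build side (right outer, right anti, full outer); treating every other mode as skippable when there are no probe rows is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model. None of these names come from the Impala source.
enum class JoinOp {
  kInner, kLeftOuter, kLeftSemi, kLeftAnti,
  kRightOuter, kRightSemi, kRightAnti, kFullOuter
};

struct SpilledPartition {
  std::vector<int> build_rows;  // rows spilled from the build side
  std::vector<int> probe_rows;  // rows spilled from the probe side
};

enum class Action { kBuildAndProbe, kOutputAllBuildRows, kSkip };

// Join modes that must emit build rows that found no probe match
// (the three modes listed in the commit message).
bool NeedsUnmatchedBuildRows(JoinOp op) {
  return op == JoinOp::kRightOuter || op == JoinOp::kRightAnti ||
         op == JoinOp::kFullOuter;
}

// Decides how to handle a spilled partition without building a hash table
// when there is nothing to probe it with.
Action PlanSpilledPartition(JoinOp op, const SpilledPartition& p) {
  if (!p.probe_rows.empty()) return Action::kBuildAndProbe;
  // No probe rows: every build row is unmatched, so no hash table is needed.
  if (NeedsUnmatchedBuildRows(op)) return Action::kOutputAllBuildRows;
  // Assumption: the remaining modes produce no output for this partition.
  return Action::kSkip;
}

// Emits every build row as unmatched. Only valid when no probe rows exist.
std::vector<int> OutputAllBuildRows(const SpilledPartition& p) {
  assert(p.probe_rows.empty());
  return p.build_rows;
}
```

Under these assumptions, an inner join over a partition with three build rows and no probe rows yields `Action::kSkip`, while a right outer join over the same partition yields `Action::kOutputAllBuildRows` and returns all three build rows.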


            People

            • Assignee: twmarshall Thomas Tauber-Marshall
            • Reporter: tarmstrong Tim Armstrong