IMPALA-10099

Push down DISTINCT aggregation for EXCEPT/INTERSECT


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: None
    • Labels: None

    Description

      The implementation of SetOperations for EXCEPT/INTERSECT in IMPALA-9943 produced query rewrites that apply the DISTINCT aggregation after the exchanges in distributed plans. In cases where the query can be directly rewritten to apply the DISTINCT to the set operation operands, doing so would result in better performance for most large queries.

      This should help the performance of TPC-DS Q14, which does an INTERSECT of queries with large result sets that contain many duplicates.

      In general, it would be better to have an optimization phase during planning that moves DISTINCT around the plan, which would handle this case as well as many others.
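
      The rewrite described above relies on INTERSECT and EXCEPT returning the same rows whether or not each operand is de-duplicated first. The following is a minimal sketch of that equivalence only. It uses Python's built-in sqlite3 module as a stand-in, not Impala or its planner, and the tables t1 and t2 and their contents are hypothetical. SQLite has no exchanges, so it shows that the results match, not the performance benefit.

```python
import sqlite3

# Hypothetical operand tables with many duplicate values (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (item INTEGER);
    CREATE TABLE t2 (item INTEGER);
    INSERT INTO t1 VALUES (1), (1), (2), (2), (2), (3);
    INSERT INTO t2 VALUES (2), (2), (3), (3), (4), (4);
""")


def rows(sql):
    """Run a query and return its rows in sorted order for comparison."""
    return sorted(conn.execute(sql).fetchall())


# Original form: duplicates are only removed by the set operation itself.
original_intersect = "SELECT item FROM t1 INTERSECT SELECT item FROM t2"
original_except = "SELECT item FROM t1 EXCEPT SELECT item FROM t2"

# Pushed-down form: each operand is de-duplicated before the set operation.
pushed_intersect = (
    "SELECT DISTINCT item FROM t1 INTERSECT SELECT DISTINCT item FROM t2"
)
pushed_except = (
    "SELECT DISTINCT item FROM t1 EXCEPT SELECT DISTINCT item FROM t2"
)

print(rows(original_intersect), rows(pushed_intersect))  # [(2,), (3,)] [(2,), (3,)]
print(rows(original_except), rows(pushed_except))        # [(1,)] [(1,)]
```

      In a distributed plan, the expected gain from the pushed-down form is that each operand sends fewer rows across the exchange, because its duplicates are removed first.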


          People

            Assignee: Shant Hovsepian (superdupershant)
            Reporter: Shant Hovsepian (superdupershant)
            Votes: 0
            Watchers: 2
