  1. Phoenix
  2. PHOENIX-258 Use skip scan when SELECT DISTINCT on leading row key column(s)
  3. PHOENIX-2989

Allow DistinctPrefixFilter optimization when HAVING clause only references COUNT(DISTINCT)


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The DistinctPrefixFilter optimization can still be used if a HAVING clause references only COUNT(DISTINCT) expressions. One way to detect this is to run a visitor over the SELECT and HAVING clauses that collects only COUNT(DISTINCT) expressions into a Set<ParseNode>. If the query has no existing GROUP BY, this set is then used as the GROUP BY nodes.
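The collection step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Node`, `CountDistinctCollector`, and `collect` are hypothetical stand-ins invented for this sketch, not Phoenix's actual ParseNode or visitor classes.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a parse tree; NOT Phoenix's ParseNode API.
class CountDistinctCollector {

    // A node is a function call, operator, literal, or column reference.
    static final class Node {
        final String name;         // e.g. "COUNT", "SUM", ">", or a column name
        final boolean distinct;    // true for an aggregate written with DISTINCT
        final List<Node> children;

        Node(String name, boolean distinct, Node... children) {
            this.name = name;
            this.distinct = distinct;
            this.children = Arrays.asList(children);
        }

        boolean isCountDistinct() {
            return distinct && name.equalsIgnoreCase("COUNT");
        }

        // Structural equality, so the same expression appearing in both
        // SELECT and HAVING collapses to a single set entry.
        @Override public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node other = (Node) o;
            return name.equals(other.name)
                && distinct == other.distinct
                && children.equals(other.children);
        }

        @Override public int hashCode() {
            return Objects.hash(name, distinct, children);
        }
    }

    // Walk the SELECT and HAVING trees, keeping only COUNT(DISTINCT ...) nodes.
    static Set<Node> collect(List<Node> roots) {
        Set<Node> found = new LinkedHashSet<>();
        for (Node root : roots) {
            visit(root, found);
        }
        return found;
    }

    private static void visit(Node node, Set<Node> found) {
        if (node.isCountDistinct()) {
            found.add(node);
            return; // assumption of this sketch: aggregates do not nest
        }
        for (Node child : node.children) {
            visit(child, found);
        }
    }
}
```

With this sketch, passing the trees for a SELECT of COUNT(DISTINCT a) and a HAVING of COUNT(DISTINCT a) > 1 yields a set with one entry, because the two occurrences are structurally equal. A SUM(b) in either clause is simply not collected.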

      The check for whether or not to add the filter can then change to something like this:

          if (... &&
              (context.getAggregationManager().isEmpty() ||
               (plan.getGroupBy().isUngroupedAggregate() &&
                plan.getGroupBy().getKeyExpressions().size() ==
                    context.getAggregationManager().getAggregators().getAggregatorCount())))

      That way, the filter is added only if the expressions pulled in as GROUP BY expressions (the COUNT(DISTINCT) ones only) account for all of the aggregators, meaning the query contains no other aggregate.
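To make the counting argument concrete, here is a minimal standalone sketch of the proposed condition. It models only the part after the elided `... &&`; the class name, method name, and parameters are hypothetical and simply stand for the values the Phoenix calls in the snippet would return.

```java
// Hypothetical helper; the parameters stand in for values that the
// proposed check reads from the plan and the aggregation manager.
class DistinctPrefixFilterCheck {

    static boolean canAddFilter(boolean aggregationEmpty,
                                boolean ungroupedAggregate,
                                int keyExpressionCount,
                                int aggregatorCount) {
        // No aggregators at all: a plain SELECT DISTINCT style query.
        if (aggregationEmpty) {
            return true;
        }
        // Otherwise every aggregator must correspond to one of the
        // COUNT(DISTINCT) expressions pulled in as GROUP BY keys.
        return ungroupedAggregate && keyExpressionCount == aggregatorCount;
    }
}
```

Under these assumptions, a query whose only aggregate is one COUNT(DISTINCT) expression has one key expression and one aggregator, so the check passes. Adding a second aggregate such as SUM makes the aggregator count exceed the key expression count, and the filter is not added.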

      Attachments

        1. 2989-orderby.txt
          4 kB
          Lars Hofhansl
        2. 2989-orderby-v2.txt
          5 kB
          Lars Hofhansl
        3. 2989-orderby-v3.txt
          6 kB
          Lars Hofhansl


          People

            Assignee: Unassigned
            Reporter: James R. Taylor (jamestaylor)
