  1. Phoenix
  2. PHOENIX-258 Use skip scan when SELECT DISTINCT on leading row key column(s)
  3. PHOENIX-2989

Allow DistinctPrefixFilter optimization when HAVING clause only references COUNT(DISTINCT)

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The DistinctPrefixFilter optimization can still be applied when a HAVING clause references only COUNT(DISTINCT) expressions. One way to detect this is to run a visitor over the SELECT and HAVING expressions that collects only the COUNT(DISTINCT) expressions into a Set<ParseNode>. If the query has no GROUP BY of its own, this set is then used as the GROUP BY nodes.
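
The collection step could be sketched roughly as follows. This is a simplified, self-contained model, not Phoenix code: `SketchNode` and `CountDistinctCollector` are hypothetical stand-ins for Phoenix's ParseNode hierarchy and its visitor classes, and node equality is assumed to be textual so that the same COUNT(DISTINCT) appearing in both SELECT and HAVING is collected once.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified parse tree node; Phoenix's real ParseNode API differs.
final class SketchNode {
    final String text;            // e.g. "COUNT(DISTINCT a)", ">", "10"
    final boolean countDistinct;  // true only for COUNT(DISTINCT ...) calls
    final List<SketchNode> children;

    SketchNode(String text, boolean countDistinct, SketchNode... children) {
        this.text = text;
        this.countDistinct = countDistinct;
        this.children = Collections.unmodifiableList(Arrays.asList(children));
    }

    // Assumption: two nodes are equal when their text is equal.
    @Override public boolean equals(Object o) {
        return o instanceof SketchNode && ((SketchNode) o).text.equals(text);
    }

    @Override public int hashCode() {
        return text.hashCode();
    }
}

// Hypothetical visitor that keeps only COUNT(DISTINCT ...) expressions.
final class CountDistinctCollector {
    static Set<SketchNode> collect(List<SketchNode> select, SketchNode having) {
        Set<SketchNode> found = new LinkedHashSet<>();
        for (SketchNode node : select) {
            visit(node, found);
        }
        if (having != null) {
            visit(having, found);
        }
        return found;
    }

    private static void visit(SketchNode node, Set<SketchNode> found) {
        if (node.countDistinct) {
            found.add(node);  // collect the node and do not descend into its arguments
            return;
        }
        for (SketchNode child : node.children) {
            visit(child, found);
        }
    }

    public static void main(String[] args) {
        // Models: SELECT COUNT(DISTINCT a) ... HAVING COUNT(DISTINCT a) > 10
        SketchNode select = new SketchNode("COUNT(DISTINCT a)", true, new SketchNode("a", false));
        SketchNode having = new SketchNode(">", false,
                new SketchNode("COUNT(DISTINCT a)", true, new SketchNode("a", false)),
                new SketchNode("10", false));
        for (SketchNode node : collect(Arrays.asList(select), having)) {
            System.out.println(node.text);
        }
    }
}
```

In this model, expressions such as SUM(b) are walked but never collected, so a query that mixes them with COUNT(DISTINCT) yields a set smaller than its full list of aggregates.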

      The check for whether or not to add the filter can then change to something like this:

          if (... &&
              (context.getAggregationManager().isEmpty() ||
               (plan.getGroupBy().isUngroupedAggregate() &&
                plan.getGroupBy().getKeyExpressions().size() ==
                    context.getAggregationManager().getAggregators().getAggregatorCount())))


      That way, the filter is added only if the expressions pulled in as GROUP BY expressions (the COUNT(DISTINCT) ones only) account for all of the aggregators.
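
The visible part of that condition can be modeled in isolation as below. This is a sketch under stated assumptions: `PrefixFilterCheck` and `canAddFilter` are hypothetical names, the three parameters stand in for the values the Phoenix calls above would return, and the conditions elided by `...` in the proposal are not modeled.

```java
// Hypothetical model of the proposed check; not Phoenix code.
final class PrefixFilterCheck {
    /**
     * @param aggregatorCount    number of aggregators in the query (0 models isEmpty())
     * @param ungroupedAggregate stands in for plan.getGroupBy().isUngroupedAggregate()
     * @param groupByKeyCount    stands in for plan.getGroupBy().getKeyExpressions().size()
     */
    static boolean canAddFilter(int aggregatorCount, boolean ungroupedAggregate, int groupByKeyCount) {
        if (aggregatorCount == 0) {
            return true;  // no aggregators at all: first branch of the condition
        }
        // Otherwise every aggregator must be one of the collected COUNT(DISTINCT) keys.
        return ungroupedAggregate && groupByKeyCount == aggregatorCount;
    }

    public static void main(String[] args) {
        // One aggregator, one collected key, e.g. only COUNT(DISTINCT a) is used.
        System.out.println(canAddFilter(1, true, 1));
        // Two aggregators but one collected key, e.g. COUNT(DISTINCT a) plus SUM(b).
        System.out.println(canAddFilter(2, true, 1));
    }
}
```

The second call returns false because an aggregator other than COUNT(DISTINCT) needs to see every row, which the filter would skip.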

      Attachments

        1. 2989-orderby.txt
          4 kB
          Lars Hofhansl
        2. 2989-orderby-v2.txt
          5 kB
          Lars Hofhansl
        3. 2989-orderby-v3.txt
          6 kB
          Lars Hofhansl

          People

            Assignee: Unassigned
            Reporter: James R. Taylor (jamestaylor)
            Votes: 0
            Watchers: 2
