• Sub-task
    • Status: Open
    • Major
    • Resolution: Unresolved


      An optimization that would really benefit the SELECT COUNT(DISTINCT pkCol) case: if the query has only a single COUNT(DISTINCT pkCol) and the GroupBy ends up being order preserving, you can replace the COUNT(DISTINCT pkCol) with a COUNT(pkCol) in the SELECT, HAVING, and ORDER BY clauses. That prevents the DistinctValueWithCountServerAggregator, which keeps a Map of all unique values, from being used. Instead, only a single overall count is kept, which is all we need thanks to your DistinctPrefixFilter.
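
      To illustrate why the swap is safe, here is a minimal sketch that assumes, as the description implies, that the scan already yields each pkCol value at most once. DistinctCountSketch and its methods are hypothetical stand-ins, not Phoenix classes: countDistinct mimics an aggregator that remembers every value, count mimics a plain counter.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the two aggregation strategies; not Phoenix code.
final class DistinctCountSketch {
    // Distinct-style aggregation: holds every value seen so far in memory.
    static long countDistinct(List<String> pkValues) {
        Set<String> seen = new HashSet<>();
        for (String pk : pkValues) {
            if (pk != null) {
                seen.add(pk);
            }
        }
        return seen.size();
    }

    // Plain COUNT: a single running counter, no per-value state.
    static long count(List<String> pkValues) {
        long n = 0;
        for (String pk : pkValues) {
            if (pk != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumption: the input is already de-duplicated upstream.
        List<String> distinctPks = List.of("a", "b", "c");
        System.out.println(countDistinct(distinctPks) == count(distinctPks));
    }
}
```

      When the input may contain duplicates the two results differ, which is why the rewrite only applies under the conditions stated above.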

      A few considerations in the implementation:

      • Pass select through in the call to groupBy.compile() in QueryCompiler and change the return type so that it returns a new select (as the SELECT, HAVING, and ORDER BY may have been rewritten). It is probably easiest if the GroupBy object is just mutated in place.
      • Within the groupBy.compile() call, use a visitor on the SELECT, HAVING, and ORDER BY clauses to do the rewriting. You can do that by deriving a class from ParseNodeRewriter and overriding the visitLeave(final FunctionParseNode node, List<ParseNode> nodes) method so that, if node equals the DistinctCountParseNode being replaced in the select statement, it returns a new COUNT parse node with the nodes passed in as children.
      • The compilation of the HAVING clause should be moved after the call to groupBy.compile() in QueryCompiler, since the clause may have been rewritten in that call, like this:
                select = groupBy.compile(context, select, innerPlanTupleProjector);
                Expression having = HavingCompiler.compile(context, select, groupBy);
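
      The visitor-based rewrite described above could look roughly like the following self-contained sketch. It does not use the Phoenix parse-node classes: Node, leaf, and rewrite are hypothetical names invented for illustration, and rewrite only imitates the shape of a visitLeave override (children are rewritten first, then the original node is compared against the target).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy expression tree and bottom-up rewriter; all names here are hypothetical.
final class CountRewriteSketch {
    static final class Node {
        final String name;
        final boolean distinct;
        final List<Node> children;

        Node(String name, boolean distinct, List<Node> children) {
            this.name = name;
            this.distinct = distinct;
            this.children = children;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node other = (Node) o;
            return distinct == other.distinct
                && name.equals(other.name)
                && children.equals(other.children);
        }

        @Override public int hashCode() {
            return Objects.hash(name, distinct, children);
        }

        @Override public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            if (distinct) sb.append("DISTINCT ");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    static Node leaf(String name) {
        return new Node(name, false, List.of());
    }

    // Rewrites children first, then replaces the node itself if it equals
    // the targeted COUNT(DISTINCT ...) node.
    static Node rewrite(Node node, Node target) {
        if (node.children.isEmpty()) return node;
        List<Node> rewritten = new ArrayList<>();
        for (Node child : node.children) {
            rewritten.add(rewrite(child, target));
        }
        if (node.equals(target)) {
            return new Node("COUNT", false, rewritten);
        }
        return new Node(node.name, node.distinct, rewritten);
    }

    public static void main(String[] args) {
        Node target = new Node("COUNT", true, List.of(leaf("pkCol")));
        // Stands in for a HAVING clause such as COUNT(DISTINCT pkCol) > 10.
        Node having = new Node("GT", false, List.of(target, leaf("10")));
        System.out.println(having + " -> " + rewrite(having, target));
    }
}
```

      Applying the same rewrite function to the SELECT, HAVING, and ORDER BY expressions keeps the three clauses consistent, which is the reason the HAVING clause has to be compiled after the rewrite.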




            Assignee: Unassigned
            Reporter: James R. Taylor (jamestaylor)