Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-4636

Wrong RelCollation if a field in GROUP BY appears in ORDER BY of WITHIN GROUP clause

    XMLWordPrintableJSON

Details

    Description

      If we run this query:

      select regionkey, collect(regionkey)
      within group (order by regionkey)
      from (SELECT * FROM (VALUES('blah','blah','blah',1,1)) as TBL(f1,f2,f3,nationkey,regionkey))
      group by regionkey
      

       

      We see that we are ordering by the same field as the group by field (regionkey). Now the query may be non-sensicle as there is no point of collecting the same field with in the same group, but syntactically its a valid SQL and should produce correct plan, but it generates this plan instead:

      LogicalAggregate(group=[{0}], EXPR$1=[COLLECT($0) WITHIN GROUP ([1])])
        LogicalProject(REGIONKEY=[$4], F1=[$0])
          LogicalProject(F1=[$0], F2=[$1], F3=[$2], NATIONKEY=[$3], REGIONKEY=[$4])
            LogicalValues(tuples=[[{ 'blah', 'blah', 'blah', 1, 1 }]])
      

       

      If we run the following good query, it generates correct plan:

       

      select nationkey, collect(regionkey)
      within group (order by regionkey)
      from (SELECT * FROM (VALUES('blah','blah','blah',1,1)) as TBL(f1,f2,f3,nationkey,regionkey))
      group by regionkey

       

      Generated plan:

      LogicalAggregate(group=[{0}], EXPR$1=[COLLECT($1) WITHIN GROUP ([1])])
        LogicalProject(NATIONKEY=[$3], REGIONKEY=[$4])
          LogicalProject(F1=[$0], F2=[$1], F3=[$2], NATIONKEY=[$3], REGIONKEY=[$4])
            LogicalValues(tuples=[[{ 'blah', 'blah', 'blah', 1, 1 }]])

       

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              Rafay Rafay A
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 50m
                  1h 50m