Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5081

Group keys of Aggregate are wrongly changed during decorrelation

    XMLWordPrintableJSON

Details

    Description

      Bug in RelDecorrelator.decorrelate(Aggregate, boolean) is changing the group keys. We have following tables

      table1:

          "ID", SqlTypeName.VARCHAR,
          "K509", SqlTypeName.BIGINT,
          "K505", SqlTypeName.BIGINT
      

       table2:

            "K14507", SqlTypeName.VARCHAR,
            "K14506", SqlTypeName.BIGINT,
            "K14509", SqlTypeName.BIGINT
      

      A source relnode representing the following SQL:

          SELECT
         `K505`,
         SUM(`test_total`) AS ` $ f1`
      FROM
         (
            SELECT
               ` $ cor0`.`K509`,
               ` $ cor0`.`K505`,
               ` $ cor0`.` $ f0` AS `test_total`
            FROM
               `table1` AS ` $ cor0`,
               LATERAL (
               SELECT
                  SUM(`K14506`) AS ` $ f0`
               FROM
                  `table2`
               WHERE
                  `K14507` = ` $ cor0`.`ID`) AS `t1`
         )
         AS `t2`
      WHERE
         `t2`.`K505` = 1000
      GROUP BY
         `K505`
      

       

         After decorelation, the decorrelated relNode represents the following incorrect SQL:

         SELECT
         `t`.`K509`,
         SUM(`t1`.` $ f1`) AS ` $ f1`
      FROM
         (
            SELECT
               *
            FROM
               `table1`
            WHERE
               `K505` = 1000
         )
         AS `t`
         LEFT JOIN
            (
               SELECT
                  `K14507`,
                  SUM(`K14506`) AS ` $ f1`
               FROM
                  `table2`
               GROUP BY
                  `K14507`
            )
            AS `t1`
            ON `t`.`ID` = `t1`.`K14507`
      GROUP BY
         `t`.`K509`
      

         it changed the group key.
         
         it looks like the logic in RelDecorrelator.decorrelate(Aggregate, boolean) is always picking the 0th index group key.

         We built the relNode using the following logic(used sqls above to explain the problem)

       
      val builder = RelBuilder.create(createConfig())
      val v = Holder.of[RexCorrelVariable](null)
      val relNode = builder
        .scan("table1")
        .variable(v)
        .scan("table2")
        .filter(builder.equals(builder.field(0), builder.field(v.get(), "ID")))
        .project(builder.field("K14506"))
        .aggregate(builder.groupKey(), builder.sum(builder.field(0)))
        .project(builder.alias(builder.field(0), "test_total"))
        .correlate(JoinRelType.LEFT, v.get().id, builder.field(2, 0, "ID"))
        .project(builder.field(1), builder.field(2), builder.field(3))
        .filter(builder.equals(builder.field(1), builder.literal(1000)))
        .aggregate(builder.groupKey("K505"), builder.sum(builder.field(2)))
        .build()
      

      Attachments

        Issue Links

          Activity

            People

              libenchao Benchao Li
              ShivenDvrk Shiven Dvrk
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 20m
                  1h 20m