Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-3936

RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.23.0
    • 1.24.0
    • None

    Description

      ... for dialects with SqlConformance.isHavingAlias=false

      Very, very similar to CALCITE-3593.

      Reproducing test case in RelToSqlConverter:

      @Test public void testHavingAlias2() {
        final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as gross_weight\n" +
            " from \"product\"\n" +
            " group by \"product_id\"\n" +
            " having sum(\"product\".\"gross_weight\") < 200";
        final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" +
            "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" +
            "FROM foodmart.product\n" +
            "GROUP BY product_id\n" +
            "HAVING SUM(product.gross_weight) < 200) AS t1"
            // (or) "HAVING gross_weight < 200) AS t1"
            // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1"
      
            // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1"
            // which on BigQuery gives you an error about aggregating aggregates
            ;
        sql(query).withBigQuery().ok(expected);
      }
      

      In that one, the pattern was Project/Filter/Aggregate, here it is Filter/Aggregate/Project. In 3593, the project created a new alias, which got added to the same SELECT clause and caused the ambiguity. Here, the aggregate creates an alias, but the filter will write a HAVING clause using the aliases from before the Aggregate, and that will cause the SQL engine to think that the filter is on the aggregate field, rather than on the underlying field.

      Note that this is less an absurdly unlikely occurrence than it might seem because when Calcite's default aliasing kicks in and everything gets the name "$f6", "$f4", etc, so chances of a collision are higher if you have multiply nested selects with default aliases.

      Potential fixes:

      1. force a subselect, as was done for 3593.
      2. Force the expression in the HAVING to be fully aliased by table (works at least in BigQuery, where I tested)
      3. Write the HAVING expression in terms of the aliases from the aggregate, rather than what's coming from the aggregate (also works on BigQuery)

      Attachments

        Issue Links

          Activity

            People

              julianhyde Julian Hyde
              swtalbot Steven Talbot
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 0.5h
                  0.5h