Spark / SPARK-20329

Resolution error when HAVING clause uses GROUP BY expression that involves implicit type coercion

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0, 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      The following example runs without error on Spark 2.0.x and 2.1.x but fails in the current Spark master:

      create temporary view foo (a, b) as values (cast(1 as bigint), 2), (cast(3 as bigint), 4);
      
      select a + b from foo group by a + b having (a + b) > 1 
      

      The error is

      Error in SQL statement: AnalysisException: cannot resolve '`a`' given input columns: [(a + CAST(b AS BIGINT))]; line 1 pos 45;
      'Filter (('a + 'b) > 1)
      +- Aggregate [(a#249243L + cast(b#249244 as bigint))], [(a#249243L + cast(b#249244 as bigint)) AS (a + CAST(b AS BIGINT))#249246L]
         +- SubqueryAlias foo
            +- Project [col1#249241L AS a#249243L, col2#249242 AS b#249244]
               +- LocalRelation [col1#249241L, col2#249242]
      

      I think what's happening here is that the implicit cast is breaking things: if we change the types so that both columns are integers then the analysis error disappears. Similarly, adding explicit casts, as in

      select a + cast(b as bigint) from foo group by a + cast(b as bigint) having (a + cast(b as bigint)) > 1 
      

      works, so I'm pretty sure that the resolution problem is introduced when the type coercion rule automatically adds the casts.
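
      For comparison, the integer-only variant mentioned above might look like the sketch below. The view name foo_int is hypothetical and chosen only for illustration. Both columns are plain integer literals, so no implicit cast is needed, and per the observation above the analysis error should not occur:

      ```sql
      -- Hypothetical view name (foo_int); both columns are integers,
      -- so type coercion has no cast to insert into a + b.
      create temporary view foo_int (a, b) as values (1, 2), (3, 4);

      -- Same query shape as the failing example, run against the integer-only view.
      select a + b from foo_int group by a + b having (a + b) > 1;
      ```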

            People

              Assignee: Herman van Hövell (hvanhovell)
              Reporter: Josh Rosen (joshrosen)
              Votes: 0
              Watchers: 3