Spark / SPARK-20329

Resolution error when HAVING clause uses GROUP BY expression that involves implicit type coercion


    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0, 2.3.0
    • Component/s: SQL


      The following example runs without error on Spark 2.0.x and 2.1.x but fails in the current Spark master:

      create temporary view foo (a, b) as values (cast(1 as bigint), 2), (cast(3 as bigint), 4);
      select a + b from foo group by a + b having (a + b) > 1 

      The error is:

      Error in SQL statement: AnalysisException: cannot resolve '`a`' given input columns: [(a + CAST(b AS BIGINT))]; line 1 pos 45;
      'Filter (('a + 'b) > 1)
      +- Aggregate [(a#249243L + cast(b#249244 as bigint))], [(a#249243L + cast(b#249244 as bigint)) AS (a + CAST(b AS BIGINT))#249246L]
         +- SubqueryAlias foo
            +- Project [col1#249241L AS a#249243L, col2#249242 AS b#249244]
               +- LocalRelation [col1#249241L, col2#249242]

      I think the implicit cast is what breaks resolution: if we change the types so that both columns are integers, the analysis error disappears. Similarly, adding explicit casts, as in

      select a + cast(b as bigint) from foo group by a + cast(b as bigint) having (a + cast(b as bigint)) > 1 

      works, so I'm pretty sure that the resolution problem is introduced when the type coercion rule adds the casts automatically.
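To make the hypothesis above concrete, here is a toy model in plain Python. It is not Spark's analyzer code; the classes `Col`, `Cast`, `Add` and the functions `coerce` and `having_resolves` are hypothetical names invented for illustration. It assumes that a HAVING expression resolves only if it is structurally equal to the grouping expression, and shows how coercing one side but not the other would break that match.

```python
from dataclasses import dataclass


# Hypothetical, simplified expression nodes (not Spark classes).
@dataclass(frozen=True)
class Col:
    name: str
    dtype: str  # "int" or "bigint"


@dataclass(frozen=True)
class Cast:
    child: object
    dtype: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def dtype_of(expr):
    """Return the type of an already-coerced expression."""
    if isinstance(expr, (Col, Cast)):
        return expr.dtype
    return dtype_of(expr.left)  # Add: both sides agree after coercion


def coerce(expr):
    """Toy type coercion: widen the int side of an int/bigint addition."""
    if isinstance(expr, Add):
        left, right = coerce(expr.left), coerce(expr.right)
        if dtype_of(left) != dtype_of(right):
            if dtype_of(left) == "int":
                left = Cast(left, "bigint")
            else:
                right = Cast(right, "bigint")
        return Add(left, right)
    return expr


def having_resolves(grouping, having):
    """Assumption: HAVING resolves only on an exact structural match."""
    return having == grouping


a = Col("a", "bigint")
b = Col("b", "int")
raw = Add(a, b)           # a + b as written in the query
grouping = coerce(raw)    # a + cast(b as bigint) after coercion

# Coerced GROUP BY versus uncoerced HAVING: no match.
mismatch = having_resolves(grouping, raw)

# Explicit casts: coercion changes nothing, so both sides match.
explicit = Add(a, Cast(b, "bigint"))
explicit_ok = having_resolves(coerce(explicit), explicit)

# Both columns int: no cast is inserted, so both sides match.
a_int = Col("a", "int")
same_type_ok = having_resolves(coerce(Add(a_int, b)), Add(a_int, b))

print(mismatch, explicit_ok, same_type_ok)
```

Under these assumptions the model reproduces the three observations in this report: the mixed-type query fails to match, while the explicit-cast and all-integer variants match. Whether Spark's resolution rules actually compare the expressions this way is not confirmed here.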


              • Assignee:
                hvanhovell Herman van Hovell
                joshrosen Josh Rosen


                • Created: