  Spark / SPARK-34003

Rule conflicts between PaddingAndLengthCheckForCharVarchar and ResolveAggregateFunctions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.1
    • Component/s: SQL
    • Labels: None

    Description

      ResolveAggregateFunctions is a hacky rule: it calls `executeSameContext` to generate a `resolved agg`, which it uses to determine which unresolved sort attributes should be pushed into the aggregate. However, the newly added PaddingAndLengthCheckForCharVarchar rule rewrites the query output, so the attributes in the `resolved agg` no longer match the original ones.

      As a result, a sort attribute that does not match anything in the original aggregate is pushed into it, and the query fails:

      [info]   Failed to analyze query: org.apache.spark.sql.AnalysisException: expression 'testcat.t1.`v`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
      [info]   Project [v#14, sum(i)#11L]
      [info]   +- Sort [aggOrder#12 ASC NULLS FIRST], true
      [info]      +- !Aggregate [v#14], [v#14, sum(cast(i#7 as bigint)) AS sum(i)#11L, v#13 AS aggOrder#12]
      [info]         +- SubqueryAlias testcat.t1
      [info]            +- Project [if ((length(v#6) <= 3)) v#6 else if ((length(rtrim(v#6, None)) > 3)) cast(raise_error(concat(input string of length , cast(length(v#6) as string),  exceeds varchar type length limitation: 3)) as string) else rpad(rtrim(v#6, None), 3,  ) AS v#14, i#7]
      [info]               +- RelationV2[v#6, i#7, index#15, _partition#16] testcat.t1
      [info]
      [info]   Project [v#14, sum(i)#11L]
      [info]   +- Sort [aggOrder#12 ASC NULLS FIRST], true
      [info]      +- !Aggregate [v#14], [v#14, sum(cast(i#7 as bigint)) AS sum(i)#11L, v#13 AS aggOrder#12]
      [info]         +- SubqueryAlias testcat.t1
      [info]            +- Project [if ((length(v#6) <= 3)) v#6 else if ((length(rtrim(v#6, None)) > 3)) cast(raise_error(concat(input string of length , cast(length(v#6) as string),  exceeds varchar type length limitation: 3)) as string) else rpad(rtrim(v#6, None), 3,  ) AS v#14, i#7]
      [info]               +- RelationV2[v#6, i#7, index#15, _partition#16] testcat.t1
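
The mismatch visible in the plan above (`v#13` pushed into an aggregate grouped by `v#14`) can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: `Attr`, `fresh`, `rewrite_output`, and `unmatched` are hypothetical names, and the model assumes only that a rule which rewrites a plan's output hands out a fresh expression id each time it runs.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

_next_id = count(1)  # toy stand-in for a global expression-id generator


@dataclass(frozen=True)
class Attr:
    """A column reference identified by name plus expression id, printed like v#14."""
    name: str
    expr_id: int

    def __str__(self) -> str:
        return f"{self.name}#{self.expr_id}"


def fresh(name: str) -> Attr:
    """Create an attribute with a new, never-reused expression id."""
    return Attr(name, next(_next_id))


def rewrite_output(output: List[Attr]) -> List[Attr]:
    """Hypothetical output-rewriting rule: same column names, fresh ids on every run."""
    return [fresh(a.name) for a in output]


def unmatched(grouping: List[Attr], pushed: List[Attr]) -> List[Attr]:
    """Attributes pushed into the aggregate that are not grouping columns."""
    return [a for a in pushed if a not in grouping]


relation = [fresh("v")]                   # v#1: the table column

# Nested resolution pass: the rewrite runs on a copy of the plan,
# and the sort attribute is resolved against that copy's output.
nested_output = rewrite_output(relation)  # v#2
sort_attr = nested_output[0]

# The outer plan is rewritten separately and gets a different id.
grouping = rewrite_output(relation)       # v#3

# Pushing the sort attribute from the nested pass into the outer aggregate
# leaves it dangling: same name, different expression id.
dangling = unmatched(grouping, [sort_attr])
print(f"grouping: {grouping[0]}, pushed: {sort_attr}, unmatched: {[str(a) for a in dangling]}")
# prints: grouping: v#3, pushed: v#2, unmatched: ['v#2']
```

In this model the two attributes share the name `v` but compare as different because their ids differ, which mirrors how the error message reports a column that looks like it is in the GROUP BY but is not.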
      

            People

              Assignee: Kent Yao 2
              Reporter: Kent Yao 2