Spark / SPARK-40903

Avoid reordering decimal Add for canonicalization

Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Avoid reordering the operands of Add during canonicalization if the expression is of decimal type.
      Expressions are canonicalized for comparisons and explanations. For a non-decimal Add expression, the operands can be sorted by hash code, and the result is expected to be the same.
      For an Add expression of decimal type, however, the behavior is different. Given decimal(p1, s1) and decimal(p2, s2), the integral part of the result has `max(p1-s1, p2-s2) + 1` digits and the fractional part (the scale) has `max(s1, s2)` digits. The result data type is therefore `decimal(max(p1-s1, p2-s2) + 1 + max(s1, s2), max(s1, s2))`.
      Thus the order matters:

      • For `(decimal(12,5) + decimal(12,6)) + decimal(3, 2)`, the first add `decimal(12,5) + decimal(12,6)` results in `decimal(14, 6)`, and then `decimal(14, 6) + decimal(3, 2)` results in `decimal(15, 6)`
      • For `(decimal(12, 6) + decimal(3,2)) + decimal(12, 5)`, the first add `decimal(12, 6) + decimal(3,2)` results in `decimal(13, 6)`, and then `decimal(13, 6) + decimal(12, 5)` results in `decimal(14, 6)`
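The two cases above can be checked with a minimal sketch of the result-type rule quoted in this issue. This is an illustration in Python, not Spark code: `add_result_type` is a hypothetical helper, decimal types are modeled as plain `(precision, scale)` tuples, and any upper bound on precision is ignored.

```python
from typing import Tuple

# A decimal type modeled as (precision, scale); hypothetical, for illustration only.
DecimalType = Tuple[int, int]


def add_result_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Result type of adding two decimals, per the rule stated in this issue."""
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)                   # digits after the decimal point
    integral = max(p1 - s1, p2 - s2) + 1  # digits before the point, plus one carry digit
    return (integral + scale, scale)


a, b, lit = (12, 5), (12, 6), (3, 2)

# (a + b) + lit, the order written in the first bullet
original = add_result_type(add_result_type(a, b), lit)
# (b + lit) + a, the order written in the second bullet
reordered = add_result_type(add_result_type(b, lit), a)

print(original)   # (15, 6)
print(reordered)  # (14, 6)
```

The same three operands produce different precisions depending only on how the additions are grouped.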

      In the following query:
      ```
      create table foo(a decimal(12, 5), b decimal(12, 6)) using orc;
      select sum(coalesce(a+b+ 1.75, a)) from foo;
      ```
      At first, `coalesce(a+b+ 1.75, a)` is resolved as `coalesce(a+b+ 1.75, cast(a as decimal(15, 6)))`: the literal `1.75` is of decimal(3, 2), so `a+b+ 1.75` is of decimal(15, 6). In the canonicalized version, the expression becomes `coalesce(1.75+b+a, cast(a as decimal(15, 6)))`. As explained above, `1.75+b+a` is of decimal(14, 6), which differs from the type of `cast(a as decimal(15, 6))`. Thus the following error occurs:

      java.lang.IllegalArgumentException: requirement failed: All input types must be the same except nullable, containsNull, valueContainsNull flags. The input types found are
      	DecimalType(14,6)
      	DecimalType(15,6)
      	at scala.Predef$.require(Predef.scala:281)
      	at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck(Expression.scala:1149)
      	at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck$(Expression.scala:1143) 
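
The change proposed in this issue, skipping the reorder when the Add is of decimal type, can be sketched as follows. This is a simplified Python model, not Spark's Catalyst implementation: `Operand` and `canonical_add_order` are hypothetical names, types are plain strings, and sorting by name stands in for the hash code ordering described above so that the demo is deterministic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Operand:
    """Hypothetical stand-in for one operand of a chain of Add expressions."""
    name: str
    data_type: str  # e.g. "int" or "decimal(12,5)"


def canonical_add_order(operands: List[Operand]) -> List[Operand]:
    """Pick the operand order to use when canonicalizing a chain of Adds."""
    if any(op.data_type.startswith("decimal") for op in operands):
        # Reordering decimal operands can change the result precision,
        # so keep the order in which the expression was written.
        return list(operands)
    # For other types any deterministic order gives the same result type;
    # sorting by name is an assumption standing in for sorting by hash code.
    return sorted(operands, key=lambda op: op.name)


ints = [Operand("y", "int"), Operand("x", "int")]
decimals = [
    Operand("a", "decimal(12,5)"),
    Operand("b", "decimal(12,6)"),
    Operand("1.75", "decimal(3,2)"),
]

print([op.name for op in canonical_add_order(ints)])      # ['x', 'y']
print([op.name for op in canonical_add_order(decimals)])  # ['a', 'b', '1.75']
```

With the decimal operands left in their original order, `a+b+ 1.75` keeps the decimal(15, 6) type that the cast in the second `coalesce` argument was resolved against.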

          People

            Assignee: Gengliang Wang
            Reporter: Gengliang Wang