  1. Spark
  2. SPARK-21332

Incorrect result type inferred for some decimal expressions

    Details

      Description

      Decimal expressions do not always follow the type inference rules explained in DecimalPrecision.scala. An incorrect result type is produced when an expression contains more than two decimal operands.

      For example:
      spark-sql> CREATE TABLE Decimals(decimal_26_6 DECIMAL(26,6));
      ...
      spark-sql> describe decimals;
      ...
      decimal_26_6 decimal(26,6) NULL
      spark-sql> explain select decimal_26_6 * decimal_26_6 from decimals;
      ...
      == Physical Plan ==
      *Project CheckOverflow((decimal_26_6#99 * decimal_26_6#99), DecimalType(38,12)) AS (decimal_26_6 * decimal_26_6)#100
      +- HiveTableScan decimal_26_6#99, MetastoreRelation default, decimals

      However:
      spark-sql> explain select decimal_26_6 * decimal_26_6 * decimal_26_6 from decimals;
      ...
      == Physical Plan ==
      *Project CheckOverflow((cast(CheckOverflow((decimal_26_6#104 * decimal_26_6#104), DecimalType(38,12)) as decimal(26,6)) * decimal_26_6#104), DecimalType(38,12)) AS ((decimal_26_6 * decimal_26_6) * decimal_26_6)#105
      +- HiveTableScan decimal_26_6#104, MetastoreRelation default, decimals

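      The expected result type is DecimalType(38,18), but the plan above reports DecimalType(38,12) because the intermediate product is cast back to decimal(26,6) before the second multiplication.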
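
The arithmetic behind these types can be sketched as follows. This is a minimal illustration, not Spark's code: it assumes the multiplication rule described in DecimalPrecision.scala at the time of this report is precision = p1 + p2 + 1 and scale = s1 + s2, with both values capped at 38. The helper names `bounded` and `multiply_type` are hypothetical and exist only for this example.

```python
# Assumed cap on precision and scale (38 in Spark's DecimalType).
MAX_PRECISION = 38
MAX_SCALE = 38


def bounded(precision, scale):
    """Clamp a (precision, scale) pair to the assumed maximums."""
    return (min(precision, MAX_PRECISION), min(scale, MAX_SCALE))


def multiply_type(left, right):
    """Result type of left * right under the assumed rule:
    precision = p1 + p2 + 1, scale = s1 + s2, then clamped."""
    p1, s1 = left
    p2, s2 = right
    return bounded(p1 + p2 + 1, s1 + s2)


col = (26, 6)  # decimal_26_6

# Two operands: (53, 12) clamped to (38, 12), as in the first plan.
two = multiply_type(col, col)

# Three operands, keeping the intermediate type (38, 12):
# (65, 18) clamped to (38, 18), the expected result.
three_expected = multiply_type(two, col)

# Three operands as in the reported plan, where the intermediate
# product is first cast back to decimal(26,6): the scale stays at 12.
three_observed = multiply_type(col, col)

print(two, three_expected, three_observed)
```

Under these assumptions, the reported DecimalType(38,12) is what results from the cast to decimal(26,6), since the second multiplication then sees two (26,6) operands instead of a (38,12) and a (26,6) operand.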

      In Hive 1.1.0:
      hive> explain select decimal_26_6 * decimal_26_6 from decimals;
      OK
      STAGE DEPENDENCIES:
      Stage-0 is a root stage

      STAGE PLANS:
      Stage: Stage-0
      Fetch Operator
      limit: -1
      Processor Tree:
      TableScan
      alias: decimals
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Select Operator
      expressions: (decimal_26_6 * decimal_26_6) (type: decimal(38,12))
      outputColumnNames: _col0
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      ListSink

      Time taken: 0.772 seconds, Fetched: 17 row(s)
      hive> explain select decimal_26_6 * decimal_26_6 * decimal_26_6 from decimals;
      OK
      STAGE DEPENDENCIES:
      Stage-0 is a root stage

      STAGE PLANS:
      Stage: Stage-0
      Fetch Operator
      limit: -1
      Processor Tree:
      TableScan
      alias: decimals
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Select Operator
      expressions: ((decimal_26_6 * decimal_26_6) * decimal_26_6) (type: decimal(38,18))
      outputColumnNames: _col0
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      ListSink

      Time taken: 0.064 seconds, Fetched: 17 row(s)

              People

              • Assignee:
                aokolnychyi Anton Okolnychyi
                Reporter:
                ashkapsky Alexander Shkapsky