Spark / SPARK-21332

Incorrect result type inferred for some decimal expressions

Details

    Description

      Decimal expressions do not always follow the type inference rules explained in DecimalPrecision.scala. An incorrect result type is produced when an expression contains more than two decimal operands.

      For example:
      spark-sql> CREATE TABLE Decimals(decimal_26_6 DECIMAL(26,6));
      ...
      spark-sql> describe decimals;
      ...
      decimal_26_6 decimal(26,6) NULL
      spark-sql> explain select decimal_26_6 * decimal_26_6 from decimals;
      ...
      == Physical Plan ==
      *Project CheckOverflow((decimal_26_6#99 * decimal_26_6#99), DecimalType(38,12)) AS (decimal_26_6 * decimal_26_6)#100
      +- HiveTableScan decimal_26_6#99, MetastoreRelation default, decimals

      However:
      spark-sql> explain select decimal_26_6 * decimal_26_6 * decimal_26_6 from decimals;
      ...
      == Physical Plan ==
      *Project CheckOverflow((cast(CheckOverflow((decimal_26_6#104 * decimal_26_6#104), DecimalType(38,12)) as decimal(26,6)) * decimal_26_6#104), DecimalType(38,12)) AS ((decimal_26_6 * decimal_26_6) * decimal_26_6)#105
      +- HiveTableScan decimal_26_6#104, MetastoreRelation default, decimals

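      The expected result type is DecimalType(38,18). In the plan above, the intermediate product is cast back to decimal(26,6) before the second multiplication, so the rule is applied to decimal(26,6) * decimal(26,6) a second time and again yields DecimalType(38,12).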
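
The expected types can be checked by hand. Below is a minimal sketch of the multiplication rule as I understand it from the comments in DecimalPrecision.scala: the result precision is p1 + p2 + 1 and the result scale is s1 + s2, each capped at the maximum of 38. The names `MAX_PRECISION`, `bounded`, and `multiply_type` are hypothetical helpers for illustration, not Spark APIs, and the capping behavior is an assumption about the Spark version in this report.

```python
# Sketch of the decimal multiplication type rule (assumed, not Spark code).
MAX_PRECISION = 38  # assumed maximum precision and scale of a decimal type


def bounded(precision, scale):
    """Cap precision and scale at the assumed maximum."""
    return (min(precision, MAX_PRECISION), min(scale, MAX_PRECISION))


def multiply_type(left, right):
    """Result type of left * right, where each type is (precision, scale)."""
    p1, s1 = left
    p2, s2 = right
    return bounded(p1 + p2 + 1, s1 + s2)


d = (26, 6)

# decimal_26_6 * decimal_26_6
two_operands = multiply_type(d, d)

# (decimal_26_6 * decimal_26_6) * decimal_26_6, keeping the intermediate type
three_operands = multiply_type(two_operands, d)

# What the reported plan effectively computes: the intermediate result is
# cast back to decimal(26,6) before the second multiplication.
reported = multiply_type(d, d)

print(two_operands)    # (38, 12)
print(three_operands)  # (38, 18)
print(reported)        # (38, 12)
```

Under these assumptions, carrying the intermediate type (38,12) into the second multiplication gives (38,18), which matches the Hive output below, while casting the intermediate back to (26,6) reproduces the (38,12) shown in the Spark plan.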

      In Hive 1.1.0:
      hive> explain select decimal_26_6 * decimal_26_6 from decimals;
      OK
      STAGE DEPENDENCIES:
      Stage-0 is a root stage

      STAGE PLANS:
      Stage: Stage-0
      Fetch Operator
      limit: -1
      Processor Tree:
      TableScan
      alias: decimals
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Select Operator
      expressions: (decimal_26_6 * decimal_26_6) (type: decimal(38,12))
      outputColumnNames: _col0
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      ListSink

      Time taken: 0.772 seconds, Fetched: 17 row(s)
      hive> explain select decimal_26_6 * decimal_26_6 * decimal_26_6 from decimals;
      OK
      STAGE DEPENDENCIES:
      Stage-0 is a root stage

      STAGE PLANS:
      Stage: Stage-0
      Fetch Operator
      limit: -1
      Processor Tree:
      TableScan
      alias: decimals
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Select Operator
      expressions: ((decimal_26_6 * decimal_26_6) * decimal_26_6) (type: decimal(38,18))
      outputColumnNames: _col0
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      ListSink

      Time taken: 0.064 seconds, Fetched: 17 row(s)

            People

              aokolnychyi Anton Okolnychyi
              ashkapsky Alexander Shkapsky