SPARK-20555: Incorrect handling of Oracle's decimal types via JDBC



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.2, 2.2.0
    • Component/s: SQL
    • Labels: None


      When querying an Oracle database, Spark maps some Oracle numeric data types to incorrect Catalyst data types:
      1. DECIMAL(1) becomes BooleanType
      In Oracle, a DECIMAL(1) can have values from -9 to 9.
      In Spark now, values larger than 1 become the boolean value true.
      2. DECIMAL(3,2) becomes IntegerType
      In Oracle, a DECIMAL(3,2) can have values like 1.23
      In Spark now, digits after the decimal point are dropped.
      3. DECIMAL(10) becomes IntegerType
      In Oracle, a DECIMAL(10) can have the value 9999999999 (ten nines), which is more than 2^31
      Spark throws an exception: "java.sql.SQLException: Numeric Overflow"
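      The range mismatches in the three cases above can be checked with a few lines of arithmetic. This is a standalone illustration, not Spark code; max_value is a hypothetical helper introduced here:

```python
# Standalone illustration of the range mismatches described above.
# max_value is a hypothetical helper: the largest value an Oracle
# NUMBER(precision, scale) column can hold.
def max_value(precision, scale=0):
    return (10 ** precision - 1) / (10 ** scale)

# 1. DECIMAL(1) ranges over -9..9, which BooleanType cannot represent.
assert max_value(1) == 9

# 2. DECIMAL(3,2) holds values like 1.23; casting to an integer
#    drops the digits after the decimal point.
assert max_value(3, 2) == 9.99
assert int(1.23) == 1  # fractional digits lost

# 3. DECIMAL(10) can hold ten nines, which overflows a signed
#    32-bit IntegerType (max 2**31 - 1 = 2147483647).
assert max_value(10) == 9_999_999_999
assert max_value(10) > 2**31 - 1
```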

      I think the best solution is to always keep Oracle's decimal types as decimals. (In theory we could introduce a FloatType in some cases of #2, and fix #3 by only using IntegerType up to DECIMAL(9). But in my opinion, that would end up complicated and error-prone.)
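      A minimal sketch of the proposed rule, under the assumption that the dialect simply preserves precision and scale. The function name map_oracle_number is hypothetical; Spark's real hook is the Scala-side OracleDialect.getCatalystType:

```python
# Hedged sketch of the proposed fix, not Spark's actual code.
# map_oracle_number is a hypothetical stand-in for Spark's
# OracleDialect.getCatalystType: keep every Oracle NUMBER(p, s)
# as a Catalyst DecimalType(p, s) instead of narrowing it to
# BooleanType or IntegerType.
def map_oracle_number(precision, scale):
    return f"DecimalType({precision},{scale})"

# All three problem cases from the report keep their full range:
assert map_oracle_number(1, 0) == "DecimalType(1,0)"    # case 1
assert map_oracle_number(3, 2) == "DecimalType(3,2)"    # case 2
assert map_oracle_number(10, 0) == "DecimalType(10,0)"  # case 3
```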

      Note: I think the above problems were introduced as part of https://github.com/apache/spark/pull/14377
      The main purpose of that PR seems to be converting Spark types to the correct Oracle types, and that part seems good to me. But it also adds the inverse conversions, and as the examples above show, that is not always possible.




            Assignee: Gabor Feher (gfeher)
            Reporter: Gabor Feher (gfeher)