  1. Spark
  2. SPARK-17158

Improve error message for numeric literal parsing

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      Spark currently gives confusing and inconsistent error messages for numeric literals that are out of range for their type suffix. For example:
      scala> sql("select 123456Y")
      org.apache.spark.sql.catalyst.parser.ParseException:
      Value out of range. Value:"123456" Radix:10(line 1, pos 7)

      == SQL ==
      select 123456Y
      -------^^^
      scala> sql("select 123456S")
      org.apache.spark.sql.catalyst.parser.ParseException:
      Value out of range. Value:"123456" Radix:10(line 1, pos 7)

      == SQL ==
      select 123456S
      -------^^^
      scala> sql("select 12345623434523434564565L")
      org.apache.spark.sql.catalyst.parser.ParseException:
      For input string: "12345623434523434564565"(line 1, pos 7)

      == SQL ==
      select 12345623434523434564565L
      -------^^^
      The problem is that we rely on the JDK's parsing functions, and those functions throw different error messages for different types. The relevant code is in the AstBuilder.numericLiteral function.
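      The inconsistency can be reproduced outside Spark. The following is a minimal sketch, not Spark code: `JdkMessageDemo` and `failureMessage` are hypothetical names introduced only for illustration. It relies on Scala's `toByte`, `toShort`, and `toLong` string conversions, which delegate to the JDK parsers.

```scala
import scala.util.Try

// Illustration only: shows that the JDK parsers behind toByte/toShort/toLong
// report failures with differently worded messages.
object JdkMessageDemo {
  // Returns the exception message of a failed conversion, or None on success.
  def failureMessage(convert: => Any): Option[String] =
    Try(convert).failed.toOption.map(_.getMessage)

  def main(args: Array[String]): Unit = {
    // Same inputs as the examples in this issue, without the type suffix.
    println(failureMessage("123456".toByte))
    println(failureMessage("123456".toShort))
    println(failureMessage("12345623434523434564565".toLong))
  }
}
```

      The byte and short conversions report a value out of range, while the long conversion only echoes the input string, which matches the messages in the ParseExceptions above.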
      The proposal is that, instead of using `_.toByte` to turn a string into a byte, we always parse the numeric literal string into a BigDecimal and validate its range before converting it to the target numeric type. This way, we control both the range check and the error message.
      If BigDecimal fails to parse the number, we should throw a better exception than "For input string ...".
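      The proposal above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual AstBuilder change: `NumericLiteralSketch`, `LiteralParseException`, `parseBounded`, and the message wording are all hypothetical, and the input is assumed to be the digits of the literal with the type suffix already stripped.

```scala
import scala.util.Try

object NumericLiteralSketch {
  // Hypothetical stand-in for Spark's ParseException.
  final class LiteralParseException(msg: String) extends RuntimeException(msg)

  // Parse the digits as a BigDecimal, check the range, then convert.
  // All numeric types share this path, so they share one message format.
  def parseBounded[T](raw: String, min: BigDecimal, max: BigDecimal, typeName: String)(
      convert: BigDecimal => T): T = {
    val value = Try(BigDecimal(raw)).getOrElse(
      throw new LiteralParseException(s"Invalid numeric literal: $raw"))
    if (value < min || value > max) {
      throw new LiteralParseException(
        s"Numeric literal $raw does not fit in range [$min, $max] for type $typeName")
    }
    convert(value)
  }

  // The exact conversions throw on a fractional value instead of truncating it.
  def parseByte(raw: String): Byte =
    parseBounded(raw, BigDecimal(Byte.MinValue.toInt), BigDecimal(Byte.MaxValue.toInt),
      "tinyint")(_.toByteExact)

  def parseShort(raw: String): Short =
    parseBounded(raw, BigDecimal(Short.MinValue.toInt), BigDecimal(Short.MaxValue.toInt),
      "smallint")(_.toShortExact)

  def parseLong(raw: String): Long =
    parseBounded(raw, BigDecimal(Long.MinValue), BigDecimal(Long.MaxValue),
      "bigint")(_.toLongExact)
}
```

      With this shape, `parseByte("123456")`, `parseShort("123456")`, and `parseLong("12345623434523434564565")` all fail with the same message format, naming the literal, the allowed range, and the target type.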

          People

            Assignee: Srinath (vssrinath)
            Reporter: Srinath (vssrinath)