SPARK-38060: Inconsistent behavior from JSON option allowNonNumericNumbers

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: SQL
    • Labels: None
    • Environment: Running Spark 3.2.0 in local mode on Ubuntu 20.04.3 LTS

    Description

      The behavior of the JSON option allowNonNumericNumbers is not consistent:

      1. Some NaN and Infinity values are still parsed when the option is set to false

      2. Some values are parsed differently depending on whether they are quoted or not (see results for positive and negative Infinity)

      Input data

      { "number": "NaN" }
      { "number": NaN }
      { "number": "+INF" }
      { "number": +INF }
      { "number": "-INF" }
      { "number": -INF }
      { "number": "INF" }
      { "number": INF }
      { "number": Infinity }
      { "number": +Infinity }
      { "number": -Infinity }
      { "number": "Infinity" }
      { "number": "+Infinity" }
      { "number": "-Infinity" }
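
      To reproduce, the lines above can be saved as nan_valid.json, the file name used in the read calls of this report. The following is a minimal sketch in plain Scala (no Spark required), assuming the file is written relative to the current working directory. The names NanInput and write are hypothetical helpers and are not part of the original report.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

object NanInput {
  // One JSON document per line, in the same order as listed under "Input data".
  val lines: Seq[String] = Seq(
    """{ "number": "NaN" }""",
    """{ "number": NaN }""",
    """{ "number": "+INF" }""",
    """{ "number": +INF }""",
    """{ "number": "-INF" }""",
    """{ "number": -INF }""",
    """{ "number": "INF" }""",
    """{ "number": INF }""",
    """{ "number": Infinity }""",
    """{ "number": +Infinity }""",
    """{ "number": -Infinity }""",
    """{ "number": "Infinity" }""",
    """{ "number": "+Infinity" }""",
    """{ "number": "-Infinity" }"""
  )

  // Writes the lines as UTF-8, newline-terminated, and returns the path written.
  def write(fileName: String): Path = {
    val content = lines.mkString("", "\n", "\n")
    Files.write(Paths.get(fileName), content.getBytes(StandardCharsets.UTF_8))
  }
}
```

      Calling NanInput.write("nan_valid.json") from the directory where spark-shell was started should produce a file that the relative path in the read calls resolves to in local mode.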
      

      Setup

      import org.apache.spark.sql.types._
      
      val schema = StructType(Seq(StructField("number", DataTypes.FloatType, false))) 

      allowNonNumericNumbers = false

      val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "false").json("nan_valid.json")
      
      df.show
      
      +---------+
      |   number|
      +---------+
      |      NaN|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      |     null|
      | Infinity|
      |     null|
      |-Infinity|
      +---------+ 

      allowNonNumericNumbers = true

      val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "true").json("nan_valid.json") 
      
      df.show
      
      +---------+
      |   number|
      +---------+
      |      NaN|
      |      NaN|
      |     null|
      | Infinity|
      |     null|
      |-Infinity|
      |     null|
      |     null|
      | Infinity|
      | Infinity|
      |-Infinity|
      | Infinity|
      |     null|
      |-Infinity|
      +---------+
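
      The two result tables are easier to compare when each output row is paired with its input line. The sketch below does this in plain Scala by transcribing the values shown above. It assumes that df.show returned rows in the same order as the lines of the input file, which the report does not state explicitly. The names ReportedResults and Observation are hypothetical.

```scala
object ReportedResults {
  // token: the value in the JSON document, without quotes
  // quoted: whether the value was written as a JSON string
  // whenFalse / whenTrue: the df.show output with allowNonNumericNumbers = false / true
  final case class Observation(token: String, quoted: Boolean, whenFalse: String, whenTrue: String)

  // Transcribed from the tables above, assuming output order matches input order.
  val observations: Seq[Observation] = Seq(
    Observation("NaN",       true,  "NaN",       "NaN"),
    Observation("NaN",       false, "null",      "NaN"),
    Observation("+INF",      true,  "null",      "null"),
    Observation("+INF",      false, "null",      "Infinity"),
    Observation("-INF",      true,  "null",      "null"),
    Observation("-INF",      false, "null",      "-Infinity"),
    Observation("INF",       true,  "null",      "null"),
    Observation("INF",       false, "null",      "null"),
    Observation("Infinity",  false, "null",      "Infinity"),
    Observation("+Infinity", false, "null",      "Infinity"),
    Observation("-Infinity", false, "null",      "-Infinity"),
    Observation("Infinity",  true,  "Infinity",  "Infinity"),
    Observation("+Infinity", true,  "null",      "null"),
    Observation("-Infinity", true,  "-Infinity", "-Infinity")
  )

  // Point 1: values that were still parsed with the option set to false.
  val parsedDespiteFalse: Seq[Observation] =
    observations.filter(_.whenFalse != "null")

  // Point 2: tokens whose quoted and unquoted forms gave different results
  // with the option set to true.
  val quotingMismatches: Seq[String] =
    observations
      .groupBy(_.token)
      .collect { case (token, obs) if obs.map(_.whenTrue).distinct.size > 1 => token }
      .toSeq
      .sorted
}
```

      Under that ordering assumption, parsedDespiteFalse contains only quoted values ("NaN", "Infinity" and "-Infinity"), and quotingMismatches contains +INF, +Infinity and -INF, each of which parsed when unquoted but yielded null when quoted.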

          People

            Assignee: Andy Grove (andygrove)
            Reporter: Andy Grove (andygrove)