Description
When the types inferred for the same field are IntegralType and DecimalType, and the DecimalType is not wide enough to hold the given IntegralType, the search for a compatible DataType fails and the JSON data source simply falls back to StringType for that field.
This can be observed when floatAsBigDecimal is enabled.
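import org.apache.spark.rdd.RDD

// Each field mixes an integer literal with a fractional literal across records.
def mixedIntegerAndDoubleRecords: RDD[String] =
  sqlContext.sparkContext.parallelize(
    """{"a": 3, "b": 1.1}""" ::
    """{"a": 3.1, "b": 1}""" :: Nil)

// printSchema() returns Unit, so the result is not bound to a val.
sqlContext.read
  .option("floatAsBigDecimal", "true")
  .json(mixedIntegerAndDoubleRecords)
  .printSchema()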
produces the following schema:
root
 |-- a: string (nullable = true)
 |-- b: string (nullable = true)
With floatAsBigDecimal disabled, the same input:
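import org.apache.spark.rdd.RDD

// Same records as above; only the floatAsBigDecimal option differs.
def mixedIntegerAndDoubleRecords: RDD[String] =
  sqlContext.sparkContext.parallelize(
    """{"a": 3, "b": 1.1}""" ::
    """{"a": 3.1, "b": 1}""" :: Nil)

// printSchema() returns Unit, so the result is not bound to a val.
sqlContext.read
  .option("floatAsBigDecimal", "false")
  .json(mixedIntegerAndDoubleRecords)
  .printSchema()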
produces the correct schema:
root
 |-- a: double (nullable = true)
 |-- b: double (nullable = true)
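
The "not wide enough" condition in the description can be illustrated with a small standalone sketch. It does not use Spark's classes: `Dec`, `DecimalWidening`, `canHoldLong`, and `widenWithLong` are hypothetical names introduced only for this illustration. It assumes that a 64-bit integer is treated as needing 20 integer digits, and that a literal such as 3.1 is inferred as a decimal with precision 2 and scale 1. The widening shown is one possible way to resolve the mismatch, not necessarily the fix adopted for this issue.

```scala
// Hypothetical stand-in for a decimal type; this is NOT Spark's DecimalType.
final case class Dec(precision: Int, scale: Int) {
  // Digits available to the left of the decimal point.
  def integerDigits: Int = precision - scale
}

object DecimalWidening {
  // Assumption: a Long is modelled as needing 20 integer digits and no
  // fractional digits (Long.MaxValue itself has 19 digits).
  val LongDigits: Int = 20

  // True when `d` has enough integer digits for any Long under the assumption above.
  def canHoldLong(d: Dec): Boolean = d.integerDigits >= LongDigits

  // Smallest decimal (in this model) that holds both a Long and every value of `d`:
  // keep the scale of `d` and take the larger integer-digit count.
  def widenWithLong(d: Dec): Dec = {
    val intDigits = math.max(LongDigits, d.integerDigits)
    Dec(intDigits + d.scale, d.scale)
  }
}
```

Under these assumptions, `DecimalWidening.canHoldLong(Dec(2, 1))` is false, which corresponds to the case where the field currently falls back to string, while `DecimalWidening.widenWithLong(Dec(2, 1))` yields `Dec(21, 1)`, a decimal wide enough for both the integer and the fractional record.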