Details
Bug
Status: Resolved
Major
Resolution: Fixed
3.2.1
None
Description
With the upgrade to Avro 1.9.0, Avro added https://issues.apache.org/jira/browse/AVRO-2035 (enable validation of default values in schemas by default). This change affects schema evolution and causes regressions when users upgrade their Spark version.
Repro code:
import org.apache.spark.sql.avro.functions._

val avroTypeStruct = s"""
  |{
  |  "type": "record",
  |  "name": "struct",
  |  "fields": [
  |    {"name": "id", "type": "long", "default": null}
  |  ]
  |}""".stripMargin

val df = spark.range(10).select(struct('id).as("struct"))
val avroStructDF = df.select(to_avro('struct, avroTypeStruct).as("avro"))
avroStructDF.select(from_avro('avro, avroTypeStruct)).show()
Hive mitigated it by disabling this feature altogether in https://issues.apache.org/jira/browse/HIVE-24797
Spark-Hive integration also imported the above changes in https://issues.apache.org/jira/browse/SPARK-34512
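For reference, a minimal sketch of the behavior in question using Avro's own Schema.Parser outside of Spark, assuming Avro 1.9.0 or later is on the classpath. The object name DefaultValidationSketch and the helpers parseStrict and parseLenient are hypothetical names for illustration. The sketch parses the schema from the repro once with the parser's default settings and once with setValidateDefaults(false), which is the kind of switch the Hive change relies on. It is not necessarily how a Spark-side fix would be implemented.

```scala
import org.apache.avro.Schema
import scala.util.Try

// Hypothetical helper object, for illustration only.
object DefaultValidationSketch {
  // Same schema as the repro: a non-nullable long field with a null default.
  val schemaJson: String =
    """{
      |  "type": "record",
      |  "name": "struct",
      |  "fields": [
      |    {"name": "id", "type": "long", "default": null}
      |  ]
      |}""".stripMargin

  // Parser with its default settings. On Avro 1.9.0+ default values are
  // validated, so the null default on a long field is expected to be rejected.
  def parseStrict(): Try[Schema] =
    Try(new Schema.Parser().parse(schemaJson))

  // Parser with default-value validation explicitly turned off.
  def parseLenient(): Try[Schema] =
    Try(new Schema.Parser().setValidateDefaults(false).parse(schemaJson))

  def main(args: Array[String]): Unit = {
    println(s"strict:  ${parseStrict()}")
    println(s"lenient: ${parseLenient()}")
  }
}
```

Wrapping both calls in Try keeps the comparison side by side without aborting on the first failure. Whether disabling validation is the right default for from_avro and to_avro is a separate design question from what this sketch shows.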
Can we have a fix for all the scenarios?