[SPARK-42899] DataFrame.to(schema) fails when it contains non-nullable nested field in nullable field - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.4.0
Fix Version/s: 3.4.1
Component/s: SQL
Labels: None

Description

DataFrame.to(schema) fails when the schema contains a non-nullable nested field inside a nullable field:

scala> val df = spark.sql("VALUES (1, STRUCT(1 as i)), (NULL, NULL) as t(a, b)")
df: org.apache.spark.sql.DataFrame = [a: int, b: struct<i: int>]
scala> df.printSchema()
root
 |-- a: integer (nullable = true)
 |-- b: struct (nullable = true)
 |    |-- i: integer (nullable = false)

scala> df.to(df.schema)
org.apache.spark.sql.AnalysisException: [NULLABLE_COLUMN_OR_FIELD] Column or field `b`.`i` is nullable while it's required to be non-nullable.
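
For readers still on 3.4.0, one possible way around the error is to relax the nullability of the target schema before passing it to `to`. The sketch below is an assumption, not something stated in this issue: `relaxNullability` is a hypothetical helper rather than a public Spark API, and whether the relaxed schema avoids the failure on 3.4.0 has not been verified here. The fix itself ships in 3.4.1.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not a public Spark API): returns a copy of `dt` in which
// every struct field, array element, and map value is marked nullable.
def relaxNullability(dt: DataType): DataType = dt match {
  case StructType(fields) =>
    StructType(fields.map { f =>
      f.copy(dataType = relaxNullability(f.dataType), nullable = true)
    })
  case ArrayType(elementType, _) =>
    ArrayType(relaxNullability(elementType), containsNull = true)
  case MapType(keyType, valueType, _) =>
    MapType(relaxNullability(keyType), relaxNullability(valueType), valueContainsNull = true)
  case other =>
    other // atomic types carry no nullability flag of their own
}

// Usage in spark-shell, with `df` as defined in the description above:
//   val relaxed = relaxNullability(df.schema).asInstanceOf[StructType]
//   df.to(relaxed)
```

The trade-off is that the resulting DataFrame no longer records `b.i` as non-nullable, so this only fits cases where that constraint is not needed downstream.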

People

Assignee: Takuya Ueshin
Reporter: Takuya Ueshin
Votes: 0
Watchers: 3

Dates

Created: 22/Mar/23 21:53
Updated: 30/Oct/24 09:48
Resolved: 23/Mar/23 02:16