Description
In Spark Connect:
>>> df = spark.sql("values (1, struct('a' as x)), (null, null) as t(a, b)")
>>> df.show()
+----+----+
|   a|   b|
+----+----+
|   1| {a}|
|null|null|
+----+----+

>>> df.collect()
[Row(a=1, b=Row(x='a')), Row(a=None, b=<Row()>)]
whereas PySpark:
>>> df.collect()
[Row(a=1, b=Row(x='a')), Row(a=None, b=None)]

That is, for a null struct column Spark Connect materializes an empty Row (`<Row()>`) instead of `None`, which is what plain PySpark returns.
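Until the two backends agree, results fetched through Spark Connect could be normalized client-side. Below is a minimal, hedged sketch: it uses a stand-in namedtuple instead of the real `pyspark.sql.Row` (which also subclasses `tuple`), and the `normalize` helper is hypothetical, not part of any Spark API. The exact sentinel produced by the Connect client may vary by version.

```python
from collections import namedtuple

# Stand-ins for pyspark.sql.Row (a tuple subclass). An empty Row models
# the spurious "<Row()>" that Spark Connect returns for a null struct.
EmptyRow = namedtuple("Row", [])
StructRow = namedtuple("Row", ["x"])

def normalize(value):
    """Hypothetical helper: map a field-less Row back to None,
    matching what plain PySpark returns for a null struct."""
    if isinstance(value, tuple) and len(value) == 0:
        return None
    return value

# Simulated Spark Connect result for the query above.
rows = [(1, StructRow(x="a")), (None, EmptyRow())]
fixed = [(a, normalize(b)) for a, b in rows]
# fixed now matches the plain-PySpark result: the null struct is None.
```

This is a workaround sketch only; the underlying fix belongs in the Connect client's Arrow-to-Row conversion so both code paths agree.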