Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Duplicate
- Affects Version/s: 3.2.1
- Fix Version/s: None
- Component/s: None
Description
When creating a DataFrame with createDataFrame from data that contains a float inside a struct, the float is set to null. This only happens when the data is a list of dictionaries; with a list of Rows it works fine:

from pyspark.sql import Row

data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
spark.createDataFrame(data).show()
# +-------+------------------------------+
# |MyFloat|MyStruct                      |
# +-------+------------------------------+
# |10.1   |{MyInt -> 10, MyFloat -> null}|
# +-------+------------------------------+

data = [Row(MyStruct=Row(MyInt=10, MyFloat=10.1), MyFloat=10.1)]
spark.createDataFrame(data).show()
# +-------+------------------------------+
# |MyFloat|MyStruct                      |
# +-------+------------------------------+
# |10.1   |{MyInt -> 10, MyFloat -> 10.1}|
# +-------+------------------------------+
Note that MyFloat inside MyStruct is null in the first example. Interestingly, when I do the same with Row, or if I specify the schema explicitly, this does not happen (second example).
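For reference, a minimal sketch of the explicit-schema workaround mentioned above. The field types (LongType/DoubleType) are assumptions chosen to match what inference produces for the Row example, and "spark" refers to an existing SparkSession as in the snippets above; none of this is taken verbatim from the report.

from pyspark.sql.types import StructType, StructField, LongType, DoubleType

# Hypothetical explicit schema; the types are assumed, not stated in the report.
schema = StructType([
    StructField("MyStruct", StructType([
        StructField("MyInt", LongType()),
        StructField("MyFloat", DoubleType()),
    ])),
    StructField("MyFloat", DoubleType()),
])

data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
spark.createDataFrame(data, schema).show(truncate=False)
# With the schema supplied, MyStruct is built as a struct, and MyFloat inside it
# should keep the value 10.1 instead of becoming null.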
Issue Links
- duplicates SPARK-35929 "Schema inference of nested structs defaults to map" (Resolved)
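The duplicate explains the behavior: without a schema, a nested dict is inferred as a MapType, and a map's values must all share one type (apparently taken from the first entry, here the integer 10), so 10.1 cannot be represented and ends up null. A hedged sketch follows, assuming a Spark version where the config added by SPARK-35929 is available; the config name is taken from that ticket and may not apply to 3.2.1.

# Assumption: spark.sql.pyspark.inferNestedDictAsStruct.enabled exists (added by SPARK-35929).
spark.conf.set("spark.sql.pyspark.inferNestedDictAsStruct.enabled", "true")

data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
df = spark.createDataFrame(data)
df.printSchema()
# Expected: MyStruct is inferred as struct<MyInt: bigint, MyFloat: double> rather than
# map<string, bigint>, so MyStruct.MyFloat keeps 10.1 instead of becoming null.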