Spark / SPARK-38839

Creating a struct with a float inside

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      When I create a DataFrame with createDataFrame from data that contains a float inside a struct, the float is set to null. This only happens when the data is a list of dictionaries; with a list of Rows it works fine:

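      from pyspark.sql import Row  # used by the second example

      # Assumes an active SparkSession named `spark`, as in the pyspark shell.
      data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]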
      
      spark.createDataFrame(data).show()
      # +-------+------------------------------+
      # |MyFloat|MyStruct                      |
      # +-------+------------------------------+
      # |10.1   |{MyInt -> 10, MyFloat -> null}|
      # +-------+------------------------------+ 
      
      
      data = [Row(MyStruct=Row(MyInt=10, MyFloat=10.1), MyFloat=10.1)]
      
      spark.createDataFrame(data).show()
      # +-------+------------------------------+
      # |MyFloat|MyStruct                      |
      # +-------+------------------------------+
      # |10.1   |{MyInt -> 10, MyFloat -> 10.1}|
      # +-------+------------------------------+ 

      Note that MyFloat inside MyStruct is null in the first example, while the top-level MyFloat keeps its value. The problem does not occur when I build the same data with Row (second example) or when I specify the schema explicitly.

            People

              Assignee: Unassigned
              Reporter: Daniel deCordoba (decordoba)