Spark / SPARK-41810 (sub-task of SPARK-41281 Feature parity: SparkSession API in Spark Connect)

SparkSession.createDataFrame does not respect the column names in the dictionary


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect
    • Labels: None

    Description

      Failed example:
          with tempfile.TemporaryDirectory() as d:
              # Write a DataFrame into a JSON file
              spark.createDataFrame(
                  [{"age": 100, "name": "Hyukjin Kwon"}]
              ).write.mode("overwrite").format("json").save(d)
      
              # Read the JSON file as a DataFrame.
              spark.read.format('json').load(d).show()
      Expected:
          +---+------------+
          |age|        name|
          +---+------------+
          |100|Hyukjin Kwon|
          +---+------------+
      Got:
          +---+------------+
          | _1|          _2|
          +---+------------+
          |100|Hyukjin Kwon|
          +---+------------+
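
The difference between the expected and the actual output comes down to how column names are chosen for the input rows. As a rough illustration only, the sketch below uses a hypothetical helper, `infer_column_names`, which is not part of Spark or Spark Connect. It assumes that dictionary rows should contribute their keys as column names, while rows without names (plain tuples) fall back to positional names in the `_1`, `_2` style seen in the "Got" output.

```python
from typing import Any, List, Sequence


def infer_column_names(rows: Sequence[Any]) -> List[str]:
    """Hypothetical helper (not Spark's implementation): pick column names from the first row."""
    if not rows:
        raise ValueError("cannot infer column names from empty input")
    first = rows[0]
    if isinstance(first, dict):
        # Dictionary rows carry their own names: use the keys.
        return list(first.keys())
    # Rows without names get positional names: _1, _2, ...
    return [f"_{i}" for i in range(1, len(first) + 1)]


rows = [{"age": 100, "name": "Hyukjin Kwon"}]

# Names when the dictionary keys are respected (the "Expected" output).
expected_names = infer_column_names(rows)

# Names when the keys are lost and only the values survive (the "Got" output).
positional_names = infer_column_names([tuple(r.values()) for r in rows])

print(expected_names)    # ['age', 'name']
print(positional_names)  # ['_1', '_2']
```

The second call mimics the reported symptom: once the rows are reduced to bare values, nothing is left to name the columns with.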
      

            People

              Assignee: Hyukjin Kwon (gurwls223)
              Reporter: Hyukjin Kwon (gurwls223)