  1. Spark
  2. SPARK-18284

Schema of DataFrame generated from RDD is different between master and 2.0


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0, 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      When the following program is executed, the schema of the resulting DataFrame differs between master, branch 2.1, and branch 2.0. The nullable flag of the value column should be false, as it is on branch 2.0.

      val df = sparkContext.parallelize(1 to 8, 1).toDF()
      df.printSchema
      df.filter("value > 4").count
      
      === master ===
      root
       |-- value: integer (nullable = true)
      
      === branch 2.1 ===
      root
       |-- value: integer (nullable = true)
      
      === branch 2.0 ===
      root
       |-- value: integer (nullable = false)
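
The difference shown above can also be checked programmatically instead of by reading the printSchema output. The following is a minimal sketch, assuming a local SparkSession; the object name NullabilityCheck and the helper nullableByColumn are hypothetical and not part of Spark or of the original report.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NullabilityCheck {
  // Hypothetical helper (not a Spark API): maps each column name
  // to the nullable flag recorded in the DataFrame's schema.
  def nullableByColumn(df: DataFrame): Map[String, Boolean] =
    df.schema.fields.map(f => f.name -> f.nullable).toMap

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough to reproduce.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-18284-check")
      .getOrCreate()
    import spark.implicits._

    // Same DataFrame as in the report: an RDD[Int] converted with toDF().
    val df = spark.sparkContext.parallelize(1 to 8, 1).toDF()

    // Per the report, the flag for "value" should be false;
    // the affected branches report true instead.
    println(nullableByColumn(df))

    spark.stop()
  }
}
```

Running this on each branch and comparing the printed map gives the same information as the three printSchema outputs, in a form that is easier to assert on in a regression test.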
      


            People

              Assignee: kiszk Kazuaki Ishizaki
              Reporter: kiszk Kazuaki Ishizaki
              Votes: 0
              Watchers: 5
