Spark / SPARK-42899

DataFrame.to(schema) fails when the schema contains a non-nullable nested field inside a nullable field

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.1
    • Component/s: SQL
    • Labels: None

    Description

      DataFrame.to(schema) fails when the schema contains a non-nullable nested field inside a nullable field, even when the target schema is the DataFrame's own schema:

      scala> val df = spark.sql("VALUES (1, STRUCT(1 as i)), (NULL, NULL) as t(a, b)")
      df: org.apache.spark.sql.DataFrame = [a: int, b: struct<i: int>]
      scala> df.printSchema()
      root
       |-- a: integer (nullable = true)
       |-- b: struct (nullable = true)
       |    |-- i: integer (nullable = false)
      
      scala> df.to(df.schema)
      org.apache.spark.sql.AnalysisException: [NULLABLE_COLUMN_OR_FIELD] Column or field `b`.`i` is nullable while it's required to be non-nullable.
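
      On a build affected by this issue, one possible workaround is to relax nullability in the target schema before calling to(schema), so that no nested field is required to be non-nullable. The sketch below is illustrative only and was not part of the original report: relaxNullability is a hypothetical helper, not a Spark API, and it assumes that losing the non-nullable guarantee on nested fields is acceptable. It can be pasted into spark-shell, where getOrCreate() reuses the existing session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructType}

// Hypothetical helper (not part of Spark): returns a copy of `dt` in which
// every struct field, array element, and map value is marked nullable.
def relaxNullability(dt: DataType): DataType = dt match {
  case StructType(fields) =>
    StructType(fields.map { f =>
      f.copy(dataType = relaxNullability(f.dataType), nullable = true)
    })
  case ArrayType(elementType, _) =>
    ArrayType(relaxNullability(elementType), containsNull = true)
  case MapType(keyType, valueType, _) =>
    MapType(relaxNullability(keyType), relaxNullability(valueType), valueContainsNull = true)
  case other => other // atomic types carry no nested nullability
}

// Reuses the spark-shell session if one exists, otherwise starts a local one.
val spark = SparkSession.builder().master("local[1]").appName("SPARK-42899").getOrCreate()

// Same DataFrame as in the report: `b` is nullable, `b.i` is non-nullable.
val df = spark.sql("VALUES (1, STRUCT(1 as i)), (NULL, NULL) as t(a, b)")

// Target schema in which `b.i` is nullable, so to() has nothing to reject.
val relaxed = relaxNullability(df.schema).asInstanceOf[StructType]
val converted = df.to(relaxed)
converted.printSchema()
```

      The trade-off is that the resulting schema no longer records b.i as non-nullable; if that guarantee matters downstream, upgrading to a release that contains the fix is the safer route.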
      

          People

            Assignee: Takuya Ueshin (ueshin)
            Reporter: Takuya Ueshin (ueshin)