SPARK-33136: Handling nullability for complex types is broken during resolution of V2 write command


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1, 3.1.0
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      I found that Spark 3.x cannot write to a complex type with nullable fields when the matching column type in the DataFrame is non-nullable.

      For example:

      case class StructData(a: String, b: Int)

      case class Data(
          col_b: Boolean, col_i: Int, col_l: Long, col_f: Float, col_d: Double,
          col_s: String, col_fi: Array[Byte], col_bi: Array[Byte], col_de: Double,
          col_st: StructData, col_li: Seq[String], col_ma: Map[Int, String])

      `col_st.b` would be non-nullable in the DataFrame, which should not matter when we insert from the DataFrame into a table that has `col_st.b` as nullable (writing non-nullable data into a nullable column should be allowed).

      This looks to be broken in the resolution of the V2 write command.
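
      As a rough illustration (not from the original report), the following spark-shell style sketch shows how the mismatch can surface; the catalog name `testcat`, the table identifier `ns.tbl`, and the trimmed-down `Data` fields are placeholders, and it assumes a V2 TableCatalog implementation has been registered under `spark.sql.catalog.testcat`.

      import spark.implicits._

      case class StructData(a: String, b: Int)
      case class Data(col_st: StructData, col_li: Seq[String], col_ma: Map[Int, String])

      // Target table declares the nested field col_st.b as nullable (no NOT NULL).
      spark.sql(
        """CREATE TABLE testcat.ns.tbl (
          |  col_st STRUCT<a: STRING, b: INT>,
          |  col_li ARRAY<STRING>,
          |  col_ma MAP<INT, STRING>
          |)""".stripMargin)

      // The encoder-derived schema marks col_st.b (a Scala Int) as non-nullable.
      val ds = Seq(Data(StructData("x", 1), Seq("y"), Map(1 -> "z"))).toDS()

      // Writing a non-nullable field into a nullable one should be accepted,
      // but resolution of the V2 append command rejects the nested field.
      ds.writeTo("testcat.ns.tbl").append()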

       


            People

            • Assignee: Jungtaek Lim (kabhwan)
            • Reporter: Jungtaek Lim (kabhwan)
            • Votes: 0
            • Watchers: 3
