Spark / SPARK-46251

Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast null into None for Option values

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.3, 3.4.0, 3.4.1, 3.4.2, 3.5.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      In Spark 3.3.2 encoders created using Encoders.tuple(encoder1, encoder2, ..) correctly handle casting null into None when the target type is an Option. 

      In Spark 3.3.3 this behaviour has changed: the Option value comes through as null, which is likely to cause a NullPointerException in most Scala code that operates on the Option. The change seems to be related to the following commit:

      https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a

      I have made a reproduction with a couple of examples in a public GitHub repo here:

      https://github.com/q-willboulter/spark-tuple-encoders-bug 

      The most common place to encounter this is in joins that can return null, e.g. left or outer joins. When casting the result of a left join, it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this fails if the encoder is built manually using Encoders.tuple(leftEncoder, rightEncoder).
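
      The left-join scenario described above can be sketched roughly as follows. This is only an illustration, not the code from the linked repository: the case classes User and Order, the object name and the sample rows are all hypothetical, and the behaviour on affected versions is as reported in this issue rather than something the sketch guarantees.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical domain types for illustration; they are not taken from the report.
case class User(id: Long, name: String)
case class Order(userId: Long, amount: Double)

object ManualTupleEncoderSketch {

  // Left-joins users to orders and casts the result with a manually composed tuple encoder.
  def leftJoin(spark: SparkSession): Dataset[(User, Option[Order])] = {
    import spark.implicits._

    val users  = Seq(User(1L, "a"), User(2L, "b")).toDS()
    val orders = Seq(Order(1L, 9.99)).toDS()

    // Each side is derived separately, then combined with Encoders.tuple.
    val manual: Encoder[(User, Option[Order])] =
      Encoders.tuple(Encoders.product[User], Encoders.product[Option[Order]])

    users
      .joinWith(orders, users("id") === orders("userId"), "left")
      .as[(User, Option[Order])](manual)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("SPARK-46251-sketch").getOrCreate()
    try {
      leftJoin(spark).collect().foreach { case (user, order) =>
        // Guard against a raw null so the sketch itself does not throw.
        val rhs = if (order == null) "null (Option not populated)" else order.toString
        println(s"$user -> $rhs")
      }
    } finally spark.stop()
  }
}
```

      Per the description, the unmatched user should pair with None on 3.3.2, whereas on the affected versions the right-hand side is expected to arrive as a raw null.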

      If the entire tuple encoder Encoder[(Left, Option[Right])] is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside ExpressionEncoder.scala:

      def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...
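
      For comparison, a minimal sketch of the reflection-based route, where the whole tuple type is derived in one call instead of being composed with Encoders.tuple. The types User and Order, the object name and the sample data are again hypothetical, and Encoders.product is just one way to trigger reflection-based derivation (the implicits from spark.implicits._ are another).

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical domain types for illustration; they are not taken from the report.
case class User(id: Long, name: String)
case class Order(userId: Long, amount: Double)

object DerivedTupleEncoderSketch {

  // Left-joins users to orders and casts the result with a tuple encoder derived in one step.
  def leftJoin(spark: SparkSession): Dataset[(User, Option[Order])] = {
    import spark.implicits._

    val users  = Seq(User(1L, "a"), User(2L, "b")).toDS()
    val orders = Seq(Order(1L, 9.99)).toDS()

    // The full tuple type, including the Option, is handed to reflection at once.
    val derived: Encoder[(User, Option[Order])] =
      Encoders.product[(User, Option[Order])]

    users
      .joinWith(orders, users("id") === orders("userId"), "left")
      .as[(User, Option[Order])](derived)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("SPARK-46251-derived").getOrCreate()
    try {
      leftJoin(spark).collect().foreach {
        case (user, Some(order)) => println(s"$user matched $order")
        case (user, None)        => println(s"$user has no match")
        case (user, _)           => println(s"$user has a null right-hand side")
      }
    } finally spark.stop()
  }
}
```

      Deriving the encoder this way keeps the Option handling inside a single derivation, which is the case the description reports as working.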

       

            People

              Assignee: Unassigned
              Reporter: Will Boulter (willbo)
              Votes: 0
              Watchers: 2