Spark / SPARK-38042

Encoder cannot be found when a tuple component is a type alias for an Array

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.2, 3.2.0
    • Fix Version/s: 3.3.0, 3.2.2
    • Component/s: SQL
    • Labels: None

    Description

      ScalaReflection.dataTypeFor fails when a type alias for Array[T] (for some T) is used as a component of, e.g., a product such as a tuple.

      Minimal example, tested in version 3.1.2:

      // Run in spark-shell, where sc and spark.implicits._ are already in scope
      type Data = Array[Long]
      val xs: List[(Data, Int)] = List((Array(1), 1), (Array(2), 2))
      sc.parallelize(xs).toDF("a", "b")

      This gives the following exception:

      scala.MatchError: Data (of class scala.reflect.internal.Types$AliasNoArgsTypeRef) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$dataTypeFor$1(ScalaReflection.scala:104) 
       at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.dataTypeFor(ScalaReflection.scala:88) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerFor$6(ScalaReflection.scala:573) 
       at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238) 
       at scala.collection.immutable.List.foreach(List.scala:392) 
       at scala.collection.TraversableLike.map(TraversableLike.scala:238) 
       at scala.collection.TraversableLike.map$(TraversableLike.scala:231) 
       at scala.collection.immutable.List.map(List.scala:298) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerFor$1(ScalaReflection.scala:562) 
       at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:432) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerForType$1(ScalaReflection.scala:421) 
       at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904) 
       at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49) 
       at org.apache.spark.sql.catalyst.ScalaReflection$.serializerForType(ScalaReflection.scala:413) 
       at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:55) 
       at org.apache.spark.sql.Encoders$.product(Encoders.scala:285) 
       at org.apache.spark.sql.LowPrioritySQLImplicits.newProductEncoder(SQLImplicits.scala:251) 
       at org.apache.spark.sql.LowPrioritySQLImplicits.newProductEncoder$(SQLImplicits.scala:251) 
       at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:32) 
       ... 48 elided
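
The failure can be reproduced without Spark. The following is a minimal sketch, assuming Scala 2 with scala-reflect on the classpath. AliasFailureSketch and singleTypeArg are hypothetical names used only for illustration and are not Spark code. It shows that the tuple component is reported as the alias itself: it is equivalent to Array[Long] but carries no type arguments of its own, so a pattern expecting exactly one type argument throws a scala.MatchError, as in the trace above.

```scala
import scala.reflect.runtime.universe._
import scala.util.Try

object AliasFailureSketch {
  type Data = Array[Long]

  // First component of the tuple type, as runtime reflection reports it.
  def component: Type = typeOf[(Data, Int)].typeArgs.head

  // Hypothetical helper: destructure a type reference that is expected to
  // have exactly one type argument (e.g. Array[T]) and return that argument.
  // Throws scala.MatchError when the argument list has any other length.
  def singleTypeArg(tpe: Type): Type = {
    val TypeRef(_, _, Seq(arg)) = tpe
    arg
  }

  def main(args: Array[String]): Unit = {
    // The alias and Array[Long] are equivalent types.
    println(component =:= typeOf[Array[Long]])        // true
    // Destructuring Array[Long] directly works.
    println(singleTypeArg(typeOf[Array[Long]]) =:= typeOf[Long]) // true
    // Destructuring the alias fails: its own argument list is empty.
    println(Try(singleTypeArg(component)).isFailure)  // true
  }
}
```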

      At first glance, I think this could be fixed by changing, e.g.,

      getClassNameFromType(tpe) to
      getClassNameFromType(tpe.dealias)

      in ScalaReflection.dataTypeFor. I will try to test that and submit a pull request shortly.
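
The dealias idea can be sketched in isolation as follows. This assumes Scala 2 with scala-reflect on the classpath. DealiasSketch and arrayElementType are hypothetical names, and this is not the actual Spark patch. It only illustrates that calling dealias before inspecting a type lets an alias of Array[T] be handled like Array[T] itself.

```scala
import scala.reflect.runtime.universe._

object DealiasSketch {
  type Data = Array[Long]

  // Hypothetical helper: returns the element type if tpe is Array[T] or an
  // alias of it, and None otherwise. The alias is expanded first, so the
  // pattern sees the underlying type reference and its type arguments.
  def arrayElementType(tpe: Type): Option[Type] =
    tpe.dealias match {
      case TypeRef(_, sym, Seq(elem)) if sym == definitions.ArrayClass => Some(elem)
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // First component of the tuple type; reflection reports the alias here.
    val component = typeOf[(Data, Int)].typeArgs.head
    println(arrayElementType(component).exists(_ =:= typeOf[Long])) // true
    println(arrayElementType(typeOf[Int]).isEmpty)                  // true
  }
}
```

Until a fixed Spark version is in use, spelling out Array[Long] instead of the alias in the tuple type avoids putting an alias into the reflected type in the first place.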

       

       

          People

            Assignee: jtnystrom Johan Nyström-Persson
            Reporter: jtnystrom Johan Nyström-Persson