Description
The issue occurs when we run a .map over a Dataset whose case class contains a List field. A self-contained test case is below:
case class TestCC(key: Int, letters: List[String]) //List causes the issue - a Seq/Array works fine
/* simple test data */
val ds1 = sc.makeRDD(Seq(
(List("D")),
(List("S","H")),
(List("F","H")),
(List("D","L","L"))
)).map(x=>(x.length,x)).toDF("key","letters").as[TestCC]
//This will fail
val test1=ds1.map{_.key}
test1.show
Error:
Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 72, Column 70: No applicable constructor/method found for actual parameters "int, scala.collection.Seq"; candidates are: "TestCC(int, scala.collection.immutable.List)"
It seems Spark internally converts the List to a Seq and then can't convert it back to a List when it calls the case class constructor.
If you change List[String] to Seq[String] or Array[String], the issue doesn't appear.
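The type mismatch named in the error can be illustrated without Spark. The following is only a minimal sketch, assuming the generated code ends up holding the decoded collection with the static type scala.collection.Seq; the names SeqVsListSketch, decoded and rebuilt are hypothetical and are not taken from Spark's code generator.

```scala
// Same shape as the case class in the report.
case class TestCC(key: Int, letters: List[String])

object SeqVsListSketch {
  // Hypothetical stand-in for the value held by the generated code:
  // a List at runtime, but typed only as scala.collection.Seq.
  val decoded: scala.collection.Seq[String] = List("S", "H")

  // TestCC(decoded.length, decoded)
  // The line above is rejected by the compiler: a Seq is not a List,
  // which matches the "No applicable constructor" complaint in the error.

  // An explicit conversion restores the type the constructor expects.
  val rebuilt: TestCC = TestCC(decoded.length, decoded.toList)
}
```

This is consistent with the observation that declaring the field as Seq[String] avoids the failure, since the constructor then accepts a Seq directly.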
Issue Links
- is duplicated by: SPARK-22296 CodeGenerator - failed to compile when constructor has scala.collection.mutable.Seq vs. scala.collection.Seq (Resolved)
- is required by: SPARK-19088 Optimize sequence type deserialization codegen (Resolved)
- relates to: SPARK-16815 Dataset[List[T]] leads to ArrayStoreException (Resolved)