Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Cannot Reproduce
- Affects Version/s: 2.0.1
- Fix Version/s: None
- Component/s: None
- Environment: x86_64 GNU/Linux, Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Description
Hello,
I am seeing an error message in spark-shell when I map a DataFrame to a Seq[Foo]. However, things work fine when I use flatMap.
scala> case class Foo(value: String)
defined class Foo

scala> val df = sc.parallelize(List(1, 2, 3)).toDF
df: org.apache.spark.sql.DataFrame = [value: int]

scala> df.map{x => Seq.empty[Foo]}
scala.ScalaReflectionException: object $line14.$read not found.
  at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:162)
  at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:22)
  at $typecreator1$1.apply(<console>:29)
  at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
  at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
  at org.apache.spark.sql.SQLImplicits$$typecreator9$1.apply(SQLImplicits.scala:125)
  at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
  at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:49)
  at org.apache.spark.sql.SQLImplicits.newProductSeqEncoder(SQLImplicits.scala:125)
  ... 48 elided

scala> df.flatMap{_ => Seq.empty[Foo]} //flatMap works
res2: org.apache.spark.sql.Dataset[Foo] = [value: string]
The same error occurs when I run the equivalent code with spark-submit.
I am new to Spark, but I would not expect this to throw an exception.
Thanks.
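For reference, a minimal standalone sketch of the working flatMap path from the transcript above, assuming a local Spark 2.x installation (the object name `Repro` and the `local[*]` master are illustrative, not from the report; defining the case class at the top level rather than on a REPL line avoids the `$line14.$read` reflection lookup that appears in the stack trace):

```scala
import org.apache.spark.sql.SparkSession

object Repro {
  // Top-level definition, not a spark-shell line
  case class Foo(value: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatMap-repro")
      .getOrCreate()
    import spark.implicits._ // provides the Encoder[Foo] that flatMap needs

    val df = spark.sparkContext.parallelize(List(1, 2, 3)).toDF

    // flatMap with a Seq[Foo]-returning function, as in the transcript
    val ds = df.flatMap(_ => Seq.empty[Foo])
    ds.show()

    spark.stop()
  }
}
```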
Attachments
Issue Links
- duplicates
-
SPARK-18055 Dataset.flatMap can't work with types from customized jar
- Resolved