  1. Spark
  2. SPARK-16792

Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      The issue occurs when we run a .map over a Dataset whose case class contains a List field. A self-contained test case is below:

      // Assumes a spark-shell session, where sc and spark.implicits._ are already in scope
      case class TestCC(key: Int, letters: List[String]) // List causes the issue; a Seq or Array works fine

      // Simple test data
      val ds1 = sc.makeRDD(Seq(
        List("D"),
        List("S", "H"),
        List("F", "H"),
        List("D", "L", "L")
      )).map(x => (x.length, x)).toDF("key", "letters").as[TestCC]

      // This will fail
      val test1 = ds1.map(_.key)
      test1.show()

      Error:

      Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 72, Column 70: No applicable constructor/method found for actual parameters "int, scala.collection.Seq"; candidates are: "TestCC(int, scala.collection.immutable.List)"

      It seems Spark internally converts the List to a Seq and then cannot convert it back when it calls the TestCC constructor.

      If you change List[String] to Seq[String] or Array[String], the issue doesn't appear.
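
      The type mismatch named in the error can be reproduced in plain Scala, without Spark. The sketch below is only an illustration of why a value typed as Seq cannot be passed to a constructor that expects List, and of the two ways around it (declare the field as Seq, or convert explicitly). It is not Spark's generated code; the names TestCCSeq and SeqVsList are hypothetical and do not appear in the report.

```scala
// Stand-ins for the case class in the report; no Spark is involved here.
case class TestCC(key: Int, letters: List[String])

// Hypothetical variant with a Seq field, mirroring the workaround in the report.
case class TestCCSeq(key: Int, letters: Seq[String])

object SeqVsList {
  // A value statically typed as Seq, analogous to the
  // "scala.collection.Seq" argument named in the CompileException.
  val asSeq: Seq[String] = Seq("S", "H")

  // TestCC(asSeq.length, asSeq) would not compile:
  // Seq[String] is not a subtype of List[String].

  // Option 1: declare the field as Seq, so no conversion is needed.
  val viaSeqField: TestCCSeq = TestCCSeq(asSeq.length, asSeq)

  // Option 2: keep the List field and convert explicitly.
  val viaToList: TestCC = TestCC(asSeq.length, asSeq.toList)

  def main(args: Array[String]): Unit =
    println(viaToList) // prints TestCC(2,List(S, H))
}
```

      Declaring the field as Seq accepts whatever sequence implementation is supplied, while the List field needs an explicit conversion step, which is the step that appears to be missing in the failing case.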

            People

              Assignee: Michal Šenkýř (michalsenkyr)
              Reporter: Jamie Hutton (jamiehutton)
              Votes: 0
              Watchers: 7
