  1. Spark
  2. SPARK-29594

Create a Dataset from a Sequence of Case class

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.4.4
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      The Dataset code generation logic fails to handle case-class field names that are not valid Java identifiers (e.g. "1_something", which starts with a digit). Scala has an escaping mechanism (backquotes) that allows such names, as well as Java and Scala keywords, to be used as identifiers in programs, as in the example below:

       

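      import spark.implicits._  // needed for toDS(); assumes a SparkSession named `spark`, as in spark-shell

      case class Foo(`1_something`: String)
      val test = Seq(Foo("HelloWorld!")).toDS()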

      However, this case class trips up the Dataset code generator. The following error is raised when a Dataset containing instances of such a case class is encoded:

      java.lang.RuntimeException: Error while encoding: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 316, Column 15: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 316, Column 15: Expression "funcResult_2 = value_19" is not a type
      staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, unwrapoption(ObjectType(class java.lang.String), assertnotnull(assertnotnull(input[0, Foo, true])).1_something), true, false) AS 1_something#40
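
      A likely reading of the error above is that the field name is carried unchanged into the generated Java source (note the ".1_something" accessor in the expression), where it is not a legal identifier. The sketch below does not use Spark. It only checks whether the name stored in the compiled class could appear as an identifier in Java source. The object IdentifierCheck and its helpers isJavaIdentifier and fieldNames are hypothetical names introduced for illustration, and the check ignores Java keywords for brevity.

```scala
// Sketch only: IdentifierCheck and its helpers are hypothetical,
// not part of Spark or of the original report.
object IdentifierCheck {
  // Same field name as in the report.
  case class Foo(`1_something`: String)

  // True when `name` is lexically usable as an identifier in Java source.
  // Java keywords are not checked here, to keep the sketch short.
  def isJavaIdentifier(name: String): Boolean =
    name.nonEmpty &&
      Character.isJavaIdentifierStart(name.charAt(0)) &&
      name.drop(1).forall(c => Character.isJavaIdentifierPart(c))

  // Field names as they are stored in the compiled class.
  def fieldNames(cls: Class[_]): Seq[String] =
    cls.getDeclaredFields.map(_.getName).toSeq

  def main(args: Array[String]): Unit = {
    // Report, for each field of Foo, whether Java source could name it.
    fieldNames(classOf[Foo]).foreach { n =>
      println(s"$n -> valid Java identifier: ${isJavaIdentifier(n)}")
    }
  }
}
```

      A name such as "something_1" passes this check, while "1_something" does not because it starts with a digit.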

       

       

            People

              Assignee: Unassigned
              Reporter: Pedro Correia Luis
              Votes: 1
              Watchers: 1
