Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 2.4.4
- None
- None
Description
The Dataset code generation logic fails to handle case-class field names that are not valid Java identifiers (e.g. "1_something"). Scala has an escaping mechanism (using backquotes) that allows Java (and Scala) keywords, as well as other otherwise-illegal names, to be used as identifiers in programs, as in the example below:
case class Foo(`1_something`: String)
val test = Seq(Foo("HelloWorld!")).toDS()
However, this case class trips up the Dataset code generator. Processing a Dataset that contains instances of such a case class fails with the following error:
java.lang.RuntimeException: Error while encoding: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 316, Column 15: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 316, Column 15: Expression "funcResult_2 = value_19" is not a type
staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, unwrapoption(ObjectType(class java.lang.String), assertnotnull(assertnotnull(input[0, Foo, true])).1_something), true, false) AS 1_something#40
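The expression in the error output refers to the field as `.1_something`, which suggests the field name is emitted verbatim into the generated Java source, where it is not a legal identifier. As a rough illustration only, the sketch below checks field names up front using the JDK's `javax.lang.model.SourceVersion`. The object `FieldNameCheck` and its methods are hypothetical names introduced here; they are not part of Spark and do not represent how Spark itself validates names.

```scala
import javax.lang.model.SourceVersion

// Hypothetical helper (not a Spark API): flags field names that cannot be
// used as plain Java identifiers in generated source.
object FieldNameCheck {
  // SourceVersion.isIdentifier also accepts keywords, so keywords are
  // excluded explicitly with SourceVersion.isKeyword.
  def isValidJavaIdentifier(name: String): Boolean =
    SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name)

  // Returns the subset of names that would need renaming.
  def invalidFieldNames(names: Seq[String]): Seq[String] =
    names.filterNot(isValidJavaIdentifier)
}
```

Under these assumptions, `FieldNameCheck.isValidJavaIdentifier("1_something")` is false because the name starts with a digit, while a renamed field such as `something_1` passes the check.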
Issue Links
- duplicates: SPARK-31416 Check more strictly that a field name can be used as a valid Java identifier for codegen (Resolved)