[SPARK-22442] Schema generated by Product Encoder doesn't match case class field name when using non-standard characters - ASF JIRA

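The Product encoder generates wrong schema field names when a case class field name contains non-standard characters such as a dot or a space: the names in the schema do not match the names declared in the case class.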

For example, given:

case class MyType(`field.1`: String, `field 2`: String)

we get the following schema:

root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)

As a consequence, a DataFrame with the correct schema (columns named field.1 and field 2) cannot be converted to a Dataset using .as[MyType].
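
The names in the schema above look like the JVM-encoded form of the backquoted identifiers, in which characters that are not valid in Java identifiers are replaced by $uXXXX escapes. The following is a minimal sketch using only the Scala standard library (no Spark) that reproduces those strings and shows that decoding recovers the declared names. The object name FieldNameEncoding is hypothetical, and the sketch only illustrates the name encoding; it does not claim to show how the encoder or the linked pull request handles it.

```scala
import scala.reflect.NameTransformer

// Hypothetical helper object, for illustration only.
object FieldNameEncoding {
  // Field names as declared (with backquotes) in the case class from this report.
  val sourceNames: Seq[String] = Seq("field.1", "field 2")

  // Encoded form: characters that are not valid in JVM identifiers
  // are replaced by $uXXXX escapes.
  def encoded: Seq[String] = sourceNames.map(n => NameTransformer.encode(n))

  // Decoding the encoded names yields the names as written in source.
  def roundTripped: Seq[String] = encoded.map(n => NameTransformer.decode(n))

  def main(args: Array[String]): Unit = {
    sourceNames.zip(encoded).foreach { case (src, enc) =>
      println(src + " -> " + enc)
    }
  }
}
```

If the schema is built from the encoded names, a column lookup for field.1 or field 2 finds no match, which is consistent with the .as[MyType] failure described above.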

links to

[Github] Pull Request #19664 (viirya)