Description
Because `Encoder` is not thread-safe, a user cannot reuse a single `Encoder` instance across multiple `Dataset`s. However, creating an `Encoder` for a complicated class is slow because it relies on Scala reflection. To reduce the cost of `Encoder` creation, I currently use the private API `ExpressionEncoder.copy` as follows:
object FooEncoder {
  private lazy val _encoder: ExpressionEncoder[Foo] = ExpressionEncoder[Foo]()
  implicit def encoder: ExpressionEncoder[Foo] = _encoder.copy()
}
This PR proposes a new method `makeCopy` in `Encoder` so that the above code can be rewritten using only public APIs:
object FooEncoder {
  private lazy val _encoder: Encoder[Foo] = Encoders.product[Foo]
  implicit def encoder: Encoder[Foo] = _encoder.makeCopy
}
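The pattern behind both snippets (derive the encoder once, then hand each caller its own cheap copy) can be sketched without Spark. This is only an illustration under stated assumptions: `StatefulEncoder` is a hypothetical stand-in for a non-thread-safe encoder, not a Spark class, and the `derivations` counter exists only to show that the expensive step runs once.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a stateful, non-thread-safe encoder (not a Spark class).
final case class StatefulEncoder[T](fields: Vector[String]) {
  // Per-instance mutable scratch buffer: the reason one instance must not be shared.
  private val buffer = new StringBuilder

  def encode(value: T): String = {
    buffer.clear()
    buffer.append(fields.mkString(",")).append(":").append(value.toString)
    buffer.toString
  }
}

final case class Foo(id: Int, name: String)

object FooEncoder {
  // Counts how many times the (simulated) expensive derivation runs.
  val derivations = new AtomicInteger(0)

  // Built lazily and exactly once; stands in for the slow reflection-based creation.
  private lazy val prototype: StatefulEncoder[Foo] = {
    derivations.incrementAndGet()
    StatefulEncoder[Foo](Vector("id", "name"))
  }

  // Every call returns a fresh instance with its own buffer, so callers never share state.
  def encoder: StatefulEncoder[Foo] = prototype.copy()
}
```

Calling `FooEncoder.encoder` twice yields two distinct instances while the derivation runs only once, which is the property the proposed `makeCopy` is meant to provide through a public API.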
Issue Links
- relates to SPARK-29419 Seq.toDS / spark.createDataset(Seq) is not thread-safe (Resolved)