Description
When deserializing protobuf enum fields, the spark-protobuf library emits them as strings holding the enum value's name from the proto definition. For example, given:
message Person {
  enum Job {
    NOTHING = 0;
    ENGINEER = 1;
    DOCTOR = 2;
  }
  Job job = 1;
}
Suppose we have a message such as:
Person(job=ENGINEER)
Then the deserialized value will be:
{"job": "ENGINEER"}
However, it can be useful to deserialize the enum's integer value instead of its name; other major protobuf libraries already offer this option. The output would then be:
{"job": 1}
Examples in other libraries:
- protobuf-java-util JsonFormat: https://javadoc.io/doc/com.google.protobuf/protobuf-java-util/3.10.0/com/google/protobuf/util/JsonFormat.Printer.html#printingEnumsAsInts--
- golang/protobuf jsonpb Marshaler (the EnumsAsInts field): https://pkg.go.dev/github.com/golang/protobuf/jsonpb#Marshaler
I propose extending spark-protobuf to add this functionality.
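To make the requested semantics concrete, here is a minimal sketch in plain Python. It does not use spark-protobuf or any protobuf runtime; the `render_person` helper and the `enums_as_ints` flag are hypothetical names chosen for illustration, not an existing or proposed API.

```python
import json
from enum import IntEnum


class Job(IntEnum):
    # Mirrors the Job enum from the proto definition in this ticket.
    NOTHING = 0
    ENGINEER = 1
    DOCTOR = 2


def render_person(job: Job, enums_as_ints: bool = False) -> str:
    """Render a Person as JSON, with the enum as its name or its number.

    `enums_as_ints` is a hypothetical flag name used only in this sketch.
    """
    value = int(job) if enums_as_ints else job.name
    return json.dumps({"job": value})


# Current behavior: the enum is rendered by name.
print(render_person(Job.ENGINEER))
# Requested behavior: the enum is rendered by its integer value.
print(render_person(Job.ENGINEER, enums_as_ints=True))
```

Defaulting the flag to false would keep the existing name-based output unchanged for current users, which is also how the two libraries listed above treat their equivalent options.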