Details
- Task
- Status: Resolved
- Major
- Resolution: Fixed
- Apache Gobblin 170807, Apache Gobblin 170821, Apache Gobblin 170905
Description
Why
The current implementation of EnvelopeSchemaConverter has several flaws:
- It assumes that the payload schema field is at the top level of the envelope.
- The output record is the payload with its schema applied, but the output schema is a String.
To address these issues and improve envelope schema conversion, this task implements two types of EnvelopeSchemaConverter: EnvelopePayloadExtractor and EnvelopePayloadDeserializer.
EnvelopePayloadExtractor
This is a replacement for the deprecated `EnvelopeSchemaConverter`. Given an envelope record, the output schema is the latest payload schema fetched from a Kafka schema registry. The output record is the payload deserialized with that latest schema.
EnvelopePayloadDeserializer
Given an envelope record, the output schema is the input schema with the payload field changed to the latest schema fetched from a Kafka schema registry; all other fields are kept as they are. The output record is the input record with the payload replaced by the object deserialized with that latest schema; all other fields are copied unchanged.
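The difference between the two output shapes can be illustrated with a minimal sketch. It is not the Gobblin implementation: plain `Map` objects stand in for Avro records, the `decode` function stands in for deserialization against the latest registry schema, and the class, method, and field names (`EnvelopeShapesSketch`, `extractPayload`, `deserializePayloadInPlace`, `key`, `payload`) are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: Maps stand in for Avro records.
public class EnvelopeShapesSketch {

    // Extractor-style output: only the deserialized payload is emitted;
    // the rest of the envelope is dropped.
    static Object extractPayload(Map<String, Object> envelope,
                                 String payloadField,
                                 Function<byte[], Object> deserialize) {
        return deserialize.apply((byte[]) envelope.get(payloadField));
    }

    // Deserializer-style output: a copy of the envelope in which only the
    // payload field is replaced by its deserialized form.
    static Map<String, Object> deserializePayloadInPlace(Map<String, Object> envelope,
                                                         String payloadField,
                                                         Function<byte[], Object> deserialize) {
        Map<String, Object> out = new LinkedHashMap<>(envelope);
        out.put(payloadField, deserialize.apply((byte[]) envelope.get(payloadField)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("key", "k1");
        envelope.put("payload", "hello".getBytes(StandardCharsets.UTF_8));

        // Stand-in for "deserialize with the latest schema from the registry".
        Function<byte[], Object> decode = bytes -> new String(bytes, StandardCharsets.UTF_8);

        System.out.println(extractPayload(envelope, "payload", decode));            // hello
        System.out.println(deserializePayloadInPlace(envelope, "payload", decode)); // {key=k1, payload=hello}
    }
}
```

In this sketch the extractor-style call discards the `key` field, while the deserializer-style call keeps it and leaves the input envelope untouched.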
Configurations
One configuration property must be set for either converter to work. It has no default value.
// The topic to fetch the latest schema of the payload from a kafka registry
converter.envelopeSchemaConverter.payloadSchemaTopic=
The converters also support a schema ID field that is nested inside the envelope:
converter.envelopeSchemaConverter.schemaIdField="metadata.payloadSchemaId"
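One plausible way to resolve a dotted value such as `metadata.payloadSchemaId` is to walk the record one path segment at a time. The sketch below assumes the dots separate nested field names and uses plain `Map` objects in place of Avro records; the class and method names (`NestedFieldLookupSketch`, `getNestedField`) are hypothetical and not part of Gobblin.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only: Maps stand in for nested Avro records.
public class NestedFieldLookupSketch {

    // Follows a dot-separated path through nested maps.
    // Returns Optional.empty() if any segment is missing or not a map.
    static Optional<Object> getNestedField(Map<String, Object> record, String path) {
        Object current = record;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            current = ((Map<?, ?>) current).get(part);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("payloadSchemaId", "abc123");

        Map<String, Object> envelope = new HashMap<>();
        envelope.put("metadata", metadata);

        System.out.println(getNestedField(envelope, "metadata.payloadSchemaId")); // Optional[abc123]
        System.out.println(getNestedField(envelope, "metadata.missing"));         // Optional.empty
    }
}
```

A top-level field name with no dots is handled by the same loop, since it is simply a path with one segment.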
Issue Links
- is a clone of GOBBLIN-87 Gobblin runOnce not working correctly (Resolved)