Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Data flow in Avro is currently broken into two parts: Read and Write. These share many common patterns but almost no common code.
Additionally, the APIs for this are DatumReader and DatumWriter, which require implementations to know how to traverse Schemas and use the Resolver.
This is a proposal to overhaul the inner workings of Avro Java between the Decoder/Encoder APIs and DatumReader/DatumWriter. The goal is significantly more code reuse, and more room for new features that all share general optimizations and dynamic code generation.
The two primary concepts involved are:
- Functional Composition
- Symmetry
Functional Composition
All read and write operations can be broken into small functions and composed, rather than written as monolithic classes. This allows a "DatumWriter2" to be a graph of functions that pre-computes all state required from a schema, rather than traversing the schema for each write.
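As a rough illustration of the idea, the sketch below builds a writer for a toy record schema by composing one small function per field. The schema is traversed once, when the writer is built, and the resulting function never looks at the schema again. Every name here (ComposedWriterSketch, Field, Type, forType, forRecord) and the text output format are assumptions made for the example; none of them are Avro APIs, and this is not the design proposed in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: a writer built by composing per-field functions.
final class ComposedWriterSketch {
    // Toy stand-in for a schema: a record is an ordered list of typed fields.
    enum Type { INT, STRING }

    static final class Field {
        final String name;
        final Type type;
        Field(String name, Type type) { this.name = name; this.type = type; }
    }

    // Leaf functions: one small writer per primitive type.
    // The "encoding" is a made-up text format, not Avro binary.
    static BiConsumer<Object, StringBuilder> forType(Type type) {
        switch (type) {
            case INT:
                return (v, out) -> out.append("i:").append((Integer) v).append(';');
            case STRING:
                return (v, out) -> out.append("s:").append((String) v).append(';');
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    // Composite function: the field list is walked here, once, at build time.
    static BiConsumer<Object, StringBuilder> forRecord(List<Field> fields) {
        final List<String> names = new ArrayList<>();
        final List<BiConsumer<Object, StringBuilder>> steps = new ArrayList<>();
        for (Field f : fields) {
            names.add(f.name);
            steps.add(forType(f.type));
        }
        // The returned function only replays the pre-computed steps.
        return (v, out) -> {
            Map<?, ?> rec = (Map<?, ?>) v;
            for (int i = 0; i < steps.size(); i++) {
                steps.get(i).accept(rec.get(names.get(i)), out);
            }
        };
    }
}
```

Calling forRecord once and reusing the returned function for every datum is the point of the exercise: per-write work is reduced to running the composed steps.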
Symmetry
Avro's data flow can be made symmetric. Rather than thinking in terms of Read and Write, think in terms of:
- Source: Where data described by an Avro schema comes from – this may be a Decoder or an Object graph.
- Target: Where data described by an Avro schema is sent – this may be an Encoder or an Object graph.
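One way to picture the symmetry is a single transfer routine that moves data from any Source to any Target. Pairing an object Source with an encoded Target gives a "write"; pairing an encoded Source with an object Target gives a "read". The sketch below is a minimal illustration under assumptions: the Source and Target interfaces, their methods, the fixed {int, string} shape, and the comma-separated text encoding are all hypothetical and are not Avro's Decoder/Encoder APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: read and write as the same Source-to-Target transfer.
final class SymmetrySketch {
    interface Source { int readInt(); String readString(); }
    interface Target { void writeInt(int v); void writeString(String v); }

    // Object-graph side: values live in a plain list.
    static final class ObjectSource implements Source {
        private final Iterator<Object> it;
        ObjectSource(List<Object> values) { this.it = values.iterator(); }
        public int readInt() { return (Integer) it.next(); }
        public String readString() { return (String) it.next(); }
    }

    static final class ObjectTarget implements Target {
        final List<Object> values = new ArrayList<>();
        public void writeInt(int v) { values.add(v); }
        public void writeString(String v) { values.add(v); }
    }

    // Encoded side: a toy comma-separated text format (no escaping).
    static final class TextSource implements Source {
        private final Iterator<String> it;
        TextSource(String encoded) { this.it = Arrays.asList(encoded.split(",")).iterator(); }
        public int readInt() { return Integer.parseInt(it.next()); }
        public String readString() { return it.next(); }
    }

    static final class TextTarget implements Target {
        private final StringBuilder out = new StringBuilder();
        public void writeInt(int v) { separator(); out.append(v); }
        public void writeString(String v) { separator(); out.append(v); }
        private void separator() { if (out.length() > 0) out.append(','); }
        String encoded() { return out.toString(); }
    }

    // One routine for a toy {int, string} record, shared by both directions.
    static void transfer(Source from, Target to) {
        to.writeInt(from.readInt());
        to.writeString(from.readString());
    }
}
```

With these pieces, transfer(new ObjectSource(...), textTarget) plays the role of a write and transfer(new TextSource(...), objectTarget) plays the role of a read, and any optimization applied to transfer benefits both.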
(More detail in the comments)