  1. Apache Avro
  2. AVRO-859

Java: Data Flow Overhaul -- Composition and Symmetry

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

    Description

      Data flow in Avro is currently split into two parts: Read and Write. These share many common patterns but almost no common code.
      Additionally, the APIs for this are DatumReader and DatumWriter, which require implementations to know how to traverse Schemas and use the Resolver.

      This is a proposal to overhaul the inner workings of Avro Java that sit between the Decoder/Encoder APIs and DatumReader/DatumWriter. The goals are significantly more code reuse and much more room for new features that can all share the same general optimizations and dynamic code generation.

      The two primary concepts involved are:

      • Functional Composition
      • Symmetry

      Functional Composition

      All read and write operations can be broken into small functions and composed, rather than written as monolithic classes. This allows a "DatumWriter2" to be a graph of functions that pre-computes all state required from a schema once, rather than traversing the schema on every write.
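
The idea of pre-computing a graph of functions from a schema can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToySchema`, `WriteFn`, and `compile` are hypothetical names, not Avro APIs, and the "encoding" is a toy text form rather than Avro binary. The point is only that the schema is walked once in `compile`, and the returned function never inspects the schema again.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Avro.
final class CompositionSketch {

    /** Toy stand-in for a schema: an int, a string, or a record of named fields. */
    static final class ToySchema {
        enum Kind { INT, STRING, RECORD }

        static final ToySchema INT = new ToySchema(Kind.INT, null, null);
        static final ToySchema STRING = new ToySchema(Kind.STRING, null, null);

        final Kind kind;
        final List<String> names;      // field names, RECORD only
        final List<ToySchema> types;   // field schemas, RECORD only

        private ToySchema(Kind kind, List<String> names, List<ToySchema> types) {
            this.kind = kind;
            this.names = names;
            this.types = types;
        }

        static ToySchema record(List<String> names, List<ToySchema> types) {
            return new ToySchema(Kind.RECORD, names, types);
        }
    }

    /** One composable write step: append a datum to the output. */
    interface WriteFn {
        void write(Object datum, StringBuilder out);
    }

    /** Walks the schema once and returns a function graph that needs no schema at write time. */
    static WriteFn compile(ToySchema schema) {
        switch (schema.kind) {
            case INT:
                return (d, out) -> out.append("i:").append((Integer) d).append(';');
            case STRING:
                return (d, out) -> out.append("s:").append((String) d).append(';');
            case RECORD: {
                // Pre-compute field names and child writers up front.
                final int n = schema.names.size();
                final String[] names = schema.names.toArray(new String[0]);
                final WriteFn[] children = new WriteFn[n];
                for (int i = 0; i < n; i++) {
                    children[i] = compile(schema.types.get(i));
                }
                // The composed function only loops over the pre-computed arrays.
                return (d, out) -> {
                    Map<?, ?> rec = (Map<?, ?>) d;
                    for (int i = 0; i < n; i++) {
                        children[i].write(rec.get(names[i]), out);
                    }
                };
            }
            default:
                throw new IllegalArgumentException("unsupported kind: " + schema.kind);
        }
    }
}
```

In this sketch a record writer is just the composition of its field writers, so an optimization applied to one primitive function would be picked up by every record that uses it.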

      Symmetry

      Avro's data flow can be made symmetric. Rather than thinking in terms of Read and Write, think in terms of:

      • Source: Where data that is represented by an Avro schema comes from – this may be a Decoder, or an Object graph.
      • Target: Where data that is represented by an Avro schema is sent – this may be an Encoder or an Object graph.
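
The Source/Target symmetry can be sketched as follows. This is an illustrative assumption, not the proposed design: the `Source` and `Target` interfaces, the four implementing classes, and the comma-separated text form (standing in for Decoder/Encoder) are all hypothetical. The sketch shows a single `flow` function serving as both "read" and "write", depending only on which Source and Target it is given.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: Source, Target, and the classes below are not Avro APIs.
final class SymmetrySketch {

    /** Where schema-shaped data comes from (a decoder or an object graph). */
    interface Source {
        int readInt();
        String readString();
    }

    /** Where schema-shaped data is sent (an encoder or an object graph). */
    interface Target {
        void writeInt(int v);
        void writeString(String v);
    }

    /** One flow for a fixed toy shape (an int, then a string), independent of direction. */
    static void flow(Source from, Target to) {
        to.writeInt(from.readInt());
        to.writeString(from.readString());
    }

    /** Object-graph side: values held in memory. */
    static final class ObjectSource implements Source {
        private final Deque<Object> values;
        ObjectSource(List<Object> values) { this.values = new ArrayDeque<>(values); }
        public int readInt() { return (Integer) values.removeFirst(); }
        public String readString() { return (String) values.removeFirst(); }
    }

    static final class ObjectTarget implements Target {
        final List<Object> values = new ArrayList<>();
        public void writeInt(int v) { values.add(v); }
        public void writeString(String v) { values.add(v); }
    }

    /** Serialized side: toy comma-separated text (strings must not contain commas). */
    static final class TextTarget implements Target {
        final StringBuilder out = new StringBuilder();
        public void writeInt(int v) { out.append(v).append(','); }
        public void writeString(String v) { out.append(v).append(','); }
    }

    static final class TextSource implements Source {
        private final Deque<String> tokens;
        TextSource(String text) { tokens = new ArrayDeque<>(Arrays.asList(text.split(","))); }
        public int readInt() { return Integer.parseInt(tokens.removeFirst()); }
        public String readString() { return tokens.removeFirst(); }
    }

    /** "Write" is the flow from an object graph to the serialized form. */
    static String write(List<Object> datum) {
        TextTarget target = new TextTarget();
        flow(new ObjectSource(datum), target);
        return target.out.toString();
    }

    /** "Read" is the same flow from the serialized form to an object graph. */
    static List<Object> read(String text) {
        ObjectTarget target = new ObjectTarget();
        flow(new TextSource(text), target);
        return target.values;
    }
}
```

Because `flow` depends only on the two interfaces, the same function could also copy object graph to object graph or serialized form to serialized form without any new traversal code.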

      (More detail in the comments)


          People

            scott_carey Scott Carey