Apache Avro / AVRO-2247

Improve Java reading performance with a new reader

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.2
    • Component/s: java
    • Labels: None

    Description

      As a complement to AVRO-2090, I have been working on the decoding of Avro objects in Java and propose a new DatumReader implementation that improves read performance for both generic and specific records by approximately 20% (and by more for nested objects with defaults, a case I encounter often in practical use).

      The key concept is to create a detailed execution plan once per DatumReader. This plan contains all required default and lookup values, so they do not have to be looked up again during object traversal while reading.
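
      To illustrate the "plan once, read many" idea, here is a minimal sketch. It is not the proposed Avro code: the class PlanSketch, the Step interface, and the string-only "schemas" are hypothetical simplifications. It assumes a writer schema given as an ordered list of field names and a reader schema given as a map from field name to default value.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Avro code: resolve schemas once, then replay the plan.
final class PlanSketch {

    /** One precomputed step: consumes input and/or fills one output field. */
    interface Step {
        void apply(Iterator<String> in, Map<String, Object> out);
    }

    /** Built once per (writer, reader) pair; every lookup happens here. */
    static List<Step> buildPlan(List<String> writerFields,
                                Map<String, Object> readerDefaults) {
        List<Step> plan = new ArrayList<>();
        for (String name : writerFields) {
            if (readerDefaults.containsKey(name)) {
                plan.add((in, out) -> out.put(name, in.next())); // read and keep
            } else {
                plan.add((in, out) -> in.next());                // read and discard
            }
        }
        // Reader fields the writer never wrote: capture the default at plan time.
        for (Map.Entry<String, Object> e : readerDefaults.entrySet()) {
            if (!writerFields.contains(e.getKey())) {
                String name = e.getKey();
                Object dflt = e.getValue();
                plan.add((in, out) -> out.put(name, dflt));
            }
        }
        return plan;
    }

    /** Executed per record: no schema comparison or default lookup here. */
    static Map<String, Object> read(List<Step> plan, Iterator<String> in) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Step step : plan) {
            step.apply(in, out);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("id", "0");
        readerDefaults.put("name", "");
        readerDefaults.put("country", "unknown");

        List<Step> plan = buildPlan(List.of("id", "legacy", "name"), readerDefaults);
        Map<String, Object> record = read(plan, List.of("42", "x", "Ada").iterator());
        System.out.println(record); // prints {id=42, name=Ada, country=unknown}
    }
}
```

      In this sketch the per-record loop only replays precomputed steps; the comparison of writer and reader fields and the retrieval of defaults are paid once in buildPlan, which is where the savings for records with many defaulted fields would come from.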

      The new reader implementation can be enabled and disabled per GenericData instance. The default for new instances is taken from the Java system property "org.apache.avro.fastread" (which defaults to "false").
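
      The switch described above could look roughly like the following sketch. Only the property name "org.apache.avro.fastread" and its "false" default come from this description; the class FastReadToggleSketch and its method names are hypothetical stand-ins and not the GenericData API.

```java
// Hypothetical sketch, not Avro's GenericData: a per-instance flag whose
// initial value is read from the system property named in the description.
final class FastReadToggleSketch {

    static final String FAST_READ_PROP = "org.apache.avro.fastread";

    // Evaluated when the instance is created; "false" if the property is unset.
    private boolean fastReaderEnabled =
            Boolean.parseBoolean(System.getProperty(FAST_READ_PROP, "false"));

    boolean isFastReaderEnabled() {
        return fastReaderEnabled;
    }

    /** Per-instance override of the system-wide default (assumed name). */
    FastReadToggleSketch setFastReaderEnabled(boolean enabled) {
        this.fastReaderEnabled = enabled;
        return this;
    }

    /** Stand-in for choosing between the existing and the new reader. */
    String chooseReader() {
        return fastReaderEnabled ? "fast" : "classic";
    }

    public static void main(String[] args) {
        FastReadToggleSketch byDefault = new FastReadToggleSketch();
        FastReadToggleSketch forcedOn = new FastReadToggleSketch().setFastReaderEnabled(true);
        System.out.println("system default: " + byDefault.chooseReader());
        System.out.println("instance override: " + forcedOn.chooseReader());
    }
}
```

      With a layout like this, the system-wide default would be changed by starting the JVM with -Dorg.apache.avro.fastread=true, while individual instances could still opt in or out explicitly.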

      I have attached a performance comparison of the existing implementation with the proposed one. I will open a pull request with the respective code shortly (not yet including interoperability with the optimizations of AVRO-2090). Please let me know whether you think this is worth pursuing further.

       

      Attachments

        1. Perf-Comparison.md (2 kB, Martin Jubelgas)

            People

              Assignee: Martin Jubelgas (unchuckable)
              Reporter: Martin Jubelgas (unchuckable)
              Votes: 0
              Watchers: 6
