Apache Avro / AVRO-2999

Optimize Ruby union serialization

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10.0
    • Fix Version/s: 1.11.0, 1.10.2
    • Component/s: ruby
    • Labels: None

    Description

      Profiling Avro serialization with our union-heavy schema shows several memory and throughput bottlenecks:

      • Validation calls repeatedly allocate constant hashes
      • Validation calls repeatedly allocate constant strings
      • Validation calls are expensive and can be avoided when determining whether a datum matches a null union member type (a common pattern for "optional" fields)
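
The three points above can be illustrated with a minimal Ruby sketch. It is not the code from the Avro Ruby gem: the module `UnionSketch`, the constants, and `member_index` are hypothetical names chosen for illustration, and only three primitive types are handled. It shows constants allocated once and frozen rather than rebuilt on each validation call, plus a fast path that settles a null union member with a plain `nil?` check.

```ruby
# Illustrative sketch only; these names are hypothetical and not part of the Avro Ruby gem.
module UnionSketch
  # Built once and frozen, instead of allocating a new Hash or String per validation call.
  INT_RANGE = (-(2**31)..(2**31 - 1)).freeze
  NULL_TYPE = 'null'.freeze
  PRIMITIVE_CHECKS = {
    'null'   => ->(d) { d.nil? },
    'string' => ->(d) { d.is_a?(String) },
    'int'    => ->(d) { d.is_a?(Integer) && INT_RANGE.cover?(d) }
  }.freeze

  # Returns the index of the first union member that accepts the datum.
  def self.member_index(member_types, datum)
    member_types.each_with_index do |type, i|
      # Fast path: a null member matches only nil, so no validation call is needed.
      if type == NULL_TYPE
        return i if datum.nil?
        next
      end
      return i if PRIMITIVE_CHECKS.fetch(type).call(datum)
    end
    raise ArgumentError, 'datum does not match any union member'
  end
end

optional_string = %w[null string]
UnionSketch.member_index(optional_string, nil)     # resolves to the "null" member
UnionSketch.member_index(optional_string, 'avro')  # resolves to the "string" member
```

For an "optional" field such as `["null", "string"]`, the null member is decided without running any validator.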

      Optimizing these code paths reduces memory allocations by 78% and improves throughput by 1.9x in our encoding benchmarks. A GitHub PR is coming shortly.

      Note: Encoding unions is still expensive because the code must determine which member of the union a datum is targeting. Allowing clients to specify the member explicitly would speed up serialization even further, but that requires a larger API change.
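
To make the note concrete, here is a hedged sketch of what an explicit member hint could look like. Nothing here exists in the Avro Ruby API: `TaggedDatum` and `resolve_member` are hypothetical names, and the validators are stand-in lambdas. The point is only that a caller-supplied index lets the encoder skip probing each member.

```ruby
# Hypothetical sketch; TaggedDatum and resolve_member are illustrative, not Avro gem API.
TaggedDatum = Struct.new(:member_index, :value)

# Returns [member_index, value] for a datum written against a union.
def resolve_member(validators, datum)
  # The caller already named the member, so the search is skipped entirely.
  return [datum.member_index, datum.value] if datum.is_a?(TaggedDatum)

  # Otherwise probe each member in order until one accepts the datum.
  index = validators.index { |valid| valid.call(datum) }
  raise ArgumentError, 'datum does not match any union member' if index.nil?

  [index, datum]
end

# Stand-in validators for the union ["null", "string"].
validators = [
  ->(d) { d.nil? },
  ->(d) { d.is_a?(String) }
].freeze

resolve_member(validators, 'avro')                      # probes the members in order
resolve_member(validators, TaggedDatum.new(1, 'avro'))  # uses the caller's index directly
```

The tagged path does constant work regardless of how many members the union has, which is why it would help most on wide unions. It also changes what callers must pass in, which is the larger API change the note refers to.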

            People

              Assignee: Joel Turkel (joelturkel)
              Reporter: Joel Turkel (joelturkel)
              Votes: 0
              Watchers: 2
