Apache Avro / AVRO-2999

Optimize Ruby union serialization

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10.0
    • Fix Version/s: 1.11.0, 1.10.2
    • Component/s: ruby
    • Labels:
      None

      Description

      Profiling Avro serialization in our union-heavy schema shows several memory and throughput bottlenecks:

      • Validation calls repeatedly allocate constant hashes
      • Validation calls repeatedly allocate constant strings
      • Validation calls are expensive and can be avoided when determining whether a datum matches a null union member type (a common pattern for "optional" fields)
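
      The three points above can be illustrated with a minimal, self-contained sketch. It does not use the avro gem, and every name in it (UnionSketch, DEFAULT_OPTIONS, NULL_TYPE, VALIDATORS, valid?, member_index) is hypothetical. It only shows the general shape of the optimization: constants hoisted out of the hot path, and a nil check that stands in for full validation on the null member.

```ruby
# Hypothetical, simplified validator; not the avro gem's actual code.
module UnionSketch
  # Allocated once at load time instead of on every validation call.
  DEFAULT_OPTIONS = { recursive: true, encoded: false }.freeze
  NULL_TYPE = 'null'.freeze

  # One check per supported type name (assumed subset of Avro types).
  VALIDATORS = {
    'null'   => ->(datum) { datum.nil? },
    'string' => ->(datum) { datum.is_a?(String) },
    'long'   => ->(datum) { datum.is_a?(Integer) }
  }.freeze

  # Stand-in for an expensive validation call. `options` is unused here;
  # it only shows the shared frozen default replacing a per-call hash.
  def self.valid?(type, datum, options = DEFAULT_OPTIONS)
    VALIDATORS.fetch(type).call(datum)
  end

  # Returns the index of the first union member the datum matches, or nil.
  def self.member_index(member_types, datum)
    member_types.each_with_index do |type, index|
      if type == NULL_TYPE
        # Fast path: a nil check is enough, so validation is skipped.
        return index if datum.nil?
        next
      end
      return index if valid?(type, datum)
    end
    nil
  end
end

optional_string = %w[null string]
UnionSketch.member_index(optional_string, nil)     # null member
UnionSketch.member_index(optional_string, 'hello') # string member
```

      In an "optional" field such as the one above, the null member is usually listed first, so a nil datum is resolved without entering the validator at all.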

      Optimizing these code paths reduces memory allocations by 78% and improves throughput by 1.9x in our encoding benchmarks. A GitHub PR is coming shortly.
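
      For readers who want to observe this kind of allocation difference themselves, here is a rough sketch using Ruby's GC.stat(:total_allocated_objects) counter. It is not the benchmark behind the numbers above; the two validators are hypothetical stand-ins that differ only in whether the default options hash is rebuilt on each call.

```ruby
# Illustrative only: these validators are hypothetical, not avro gem code.
module AllocationSketch
  DEFAULT_OPTIONS = { recursive: true }.freeze

  # Builds a fresh options hash on every call that omits the argument.
  def self.validate_allocating(datum, options = { recursive: true })
    options[:recursive] && !datum.nil?
  end

  # Reuses one frozen hash, so the default costs no allocation.
  def self.validate_constant(datum, options = DEFAULT_OPTIONS)
    options[:recursive] && !datum.nil?
  end

  # Counts objects allocated while the block runs `iterations` times.
  def self.allocations(iterations)
    before = GC.stat(:total_allocated_objects)
    iterations.times { yield }
    GC.stat(:total_allocated_objects) - before
  end
end

allocating = AllocationSketch.allocations(10_000) { AllocationSketch.validate_allocating(1) }
constant   = AllocationSketch.allocations(10_000) { AllocationSketch.validate_constant(1) }
puts "allocating default: #{allocating} objects"
puts "constant default:   #{constant} objects"
```

      The exact counts depend on the Ruby version, so the sketch prints them rather than assuming specific values.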

      Note: Encoding unions is still expensive because the code must determine which member of the union a datum targets. Allowing clients to specify this explicitly would speed up serialization even further, but that requires a larger API change.
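
      To make the trade-off in this note concrete, the sketch below contrasts the implicit lookup (testing each member in turn) with a purely hypothetical explicit hint. Nothing here is an existing or proposed avro gem API; UnionHintSketch, Tagged, CHECKS, and resolve are invented for illustration.

```ruby
# Hypothetical sketch; none of these names exist in the avro gem.
module UnionHintSketch
  # Wrapper a caller could use to name the union member up front.
  Tagged = Struct.new(:member_index, :datum)

  # One check per supported type name (assumed subset of Avro types).
  CHECKS = {
    'null'   => ->(d) { d.nil? },
    'string' => ->(d) { d.is_a?(String) },
    'long'   => ->(d) { d.is_a?(Integer) }
  }.freeze

  # Returns [member_index, datum] for the given union member types.
  def self.resolve(member_types, value)
    if value.is_a?(Tagged)
      # Explicit path: trust the caller's index, no scan over members.
      unless value.member_index.between?(0, member_types.size - 1)
        raise ArgumentError, "no union member at index #{value.member_index}"
      end
      return [value.member_index, value.datum]
    end

    # Implicit path: test each member in order until one accepts the datum.
    index = member_types.find_index { |type| CHECKS.fetch(type).call(value) }
    raise ArgumentError, 'datum matches no union member' if index.nil?
    [index, value]
  end
end

union = %w[null string long]
UnionHintSketch.resolve(union, 42)                              # scans all three members
UnionHintSketch.resolve(union, UnionHintSketch::Tagged.new(2, 42)) # skips the scan
```

      The explicit path trades safety for speed: it skips per-member checks entirely, which is why exposing something like it would change the public API rather than just the internals.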

              People

              • Assignee: Joel Turkel (joelturkel)
              • Reporter: Joel Turkel (joelturkel)