Avro / AVRO-345

Optimization for ResolvingDecoder

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: java
    • Labels: None

      Description

      When the writer's and reader's schemas are both records, the order of fields is allowed to differ between them. To avoid buffering, the ResolvingDecoder returns the fields in the writer's order. The current implementation does this with FieldAdjust actions on the parser stack, one action symbol per reader field. This causes a performance problem: the number of calls to advance() is almost double the number of calls in ValidatingDecoder.

      This patch replaces the FieldAdjustAction symbols with FieldOrderAction symbols. There is one FieldOrderAction per record, so there should be far fewer of them than there are fields. Although this changes the ResolvingDecoder API slightly, there is no impact because we do not have any users yet.
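
      The difference between the two approaches can be sketched with a toy model. This is not Avro's code: the class FieldOrderSketch and all of its methods are hypothetical names, the symbol counts are a rough approximation of the parser's work, and the sketch assumes the writer and reader have the same set of fields, differing only in order.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not Avro's parser or decoder classes.
final class FieldOrderSketch {

    // For each field in the writer's order, the position of that field in the
    // reader's record. Computed once per record, which is the role a single
    // per-record action plays in this model.
    static int[] fieldOrder(List<String> writerFields, List<String> readerFields) {
        int[] order = new int[writerFields.size()];
        for (int i = 0; i < order.length; i++) {
            int pos = readerFields.indexOf(writerFields.get(i));
            if (pos < 0) {
                throw new IllegalArgumentException("Reader has no field " + writerFields.get(i));
            }
            order[i] = pos;
        }
        return order;
    }

    // Rough count of symbols handled per record when every field carries its
    // own adjust action: one action symbol plus one value symbol per field.
    static int symbolsWithPerFieldAction(int fieldCount) {
        return 2 * fieldCount;
    }

    // Rough count when a single action covers the whole record:
    // one action symbol plus one value symbol per field.
    static int symbolsWithPerRecordAction(int fieldCount) {
        return fieldCount + 1;
    }

    // Values arrive in the writer's order and are stored directly into the
    // reader's positions, so nothing has to be buffered.
    static Object[] readRecord(List<?> valuesInWriterOrder, int[] order) {
        Object[] record = new Object[order.length];
        for (int i = 0; i < order.length; i++) {
            record[order[i]] = valuesInWriterOrder.get(i);
        }
        return record;
    }

    public static void main(String[] args) {
        List<String> writer = Arrays.asList("b", "a", "c");
        List<String> reader = Arrays.asList("a", "b", "c");
        int[] order = fieldOrder(writer, reader);
        Object[] record = readRecord(Arrays.asList("B", "A", "C"), order);
        System.out.println(Arrays.toString(order));
        System.out.println(Arrays.toString(record));
        System.out.println(symbolsWithPerFieldAction(3) + " vs " + symbolsWithPerRecordAction(3));
    }
}
```

      In this model the per-field design handles about twice as many symbols per record as there are fields, while the per-record design handles one extra symbol in total, which is consistent with the "almost double" observation above.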

      I see a 10-15% improvement on my machine with Perf -S.

      1. AVRO-345.patch
        10 kB
        Thiruvalluvan M. G.
      2. AVRO-345-test.patch
        3 kB
        Thiruvalluvan M. G.

        Activity

        Thiruvalluvan M. G. created issue -
        Thiruvalluvan M. G. made changes -
        Field Original Value New Value
        Attachment AVRO-345.patch [ 12430502 ]
        Attachment AVRO-345-test.patch [ 12430503 ]
        Thiruvalluvan M. G. made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee Thiruvalluvan M. G. [ thiru_mg ]
        Doug Cutting added a comment -

        +1 This looks good to me.

        We talked about re-writing GenericDatumReader to use ResolvingDecoder as an optimization. Did that end up being faster? If so, have you yet filed an issue for that?

        Thiruvalluvan M. G. added a comment -

        Committed revision 900790. Thanks Doug for reviewing it.

        "We talked about re-writing GenericDatumReader to use ResolvingDecoder as an optimization. Did that end up being faster? If so, have you yet filed an issue for that?"

        I didn't see a measurable improvement with GenericDatumReader using ResolvingDecoder. I'll profile it, and if I get a result, I'll file a JIRA.

        Thiruvalluvan M. G. made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Doug Cutting made changes -
        Fix Version/s 1.3.0 [ 12314318 ]
        Doug Cutting made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee: Thiruvalluvan M. G.
          • Reporter: Thiruvalluvan M. G.
          • Votes: 0
          • Watchers: 1
