Avro
  1. Avro
  2. AVRO-895

JsonDecoder does not tolerate JSON records with different field order

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.4
    • Fix Version/s: 1.6.0
    • Component/s: java
    • Labels:

      Description

      Avro records are represented as JSON objects in the JSON encoding (according to the Avro spec http://avro.apache.org/docs/1.5.4/spec.html#schema_record). The JSON spec (http://www.json.org/) states that "A [JSON] object is an unordered set of name/value pairs". However, the Java JsonDecoder fails to read a JSON object as an Avro record if its fields are not in the same order as the schema.

      1. AVRO-895.patch
        1 kB
        Tom White
      2. AVRO-895.patch
        16 kB
        Thiruvalluvan M. G.
      3. AVRO-895-rev-2.patch
        17 kB
        Thiruvalluvan M. G.

        Activity

        Hide
        Thiruvalluvan M. G. added a comment -

        Committed revision 1178066.

        Thanks Tom for reporting the issue and reviewing the fix.

        Show
        Thiruvalluvan M. G. added a comment - Committed revision 1178066. Thanks Tom for reporting the issue and reviewing the fix.
        Hide
        Tom White added a comment -

        +1 looks good to me.

        Show
        Tom White added a comment - +1 looks good to me.
        Hide
        Thiruvalluvan M. G. added a comment -

        Cleaner fix. No new interface. Took care of Tom's suggestions.

        Show
        Thiruvalluvan M. G. added a comment - Cleaner fix. No new interface. Took care of Tom's suggestions.
        Hide
        Thiruvalluvan M. G. added a comment -

        JsonParser2 is actually a narrower version of Jackson's JsonParser. Actually there is no need for that interface. I use it just to avoid implementing all the methods of JsonParser for the saved tokens. Now looking back, it doesn't seem to be worth it. It forces us to have an adaptor to adapt JsonParser into JsonParser2, which appears wasteful.I'll get rid of that JsonParser2 altogether.

        I have no issues with removing the underscores in test names.

        Show
        Thiruvalluvan M. G. added a comment - JsonParser2 is actually a narrower version of Jackson's JsonParser. Actually there is no need for that interface. I use it just to avoid implementing all the methods of JsonParser for the saved tokens. Now looking back, it doesn't seem to be worth it. It forces us to have an adaptor to adapt JsonParser into JsonParser2, which appears wasteful.I'll get rid of that JsonParser2 altogether. I have no issues with removing the underscores in test names.
        Hide
        Tom White added a comment -

        Thanks Thiru - this patch fixes the problem for me.

        A couple of minor naming suggestions: Change JsonParser2 to ReorderingJsonParser or some such; also no need for the underscores in the test names in TestEncoders.

        Show
        Tom White added a comment - Thanks Thiru - this patch fixes the problem for me. A couple of minor naming suggestions: Change JsonParser2 to ReorderingJsonParser or some such; also no need for the underscores in the test names in TestEncoders.
        Hide
        Thiruvalluvan M. G. added a comment -

        This patch addresses the problem. A few new tests added that tests the fixes.

        The fix essentially saves the JSON fields that arrive out of order and uses the saved fields later. If the fields arrive in right order, it does not store them. So performance impact (both in terms of memory and time) due to this patch should be insignificant if the fields are in order.

        Show
        Thiruvalluvan M. G. added a comment - This patch addresses the problem. A few new tests added that tests the fixes. The fix essentially saves the JSON fields that arrive out of order and uses the saved fields later. If the fields arrive in right order, it does not store them. So performance impact (both in terms of memory and time) due to this patch should be insignificant if the fields are in order.
        Hide
        Doug Cutting added a comment -

        This is has been true since JSON input was first added in AVRO-50. Thiru mentions it there in a comment (http://s.apache.org/E2N). Thiru, is it possible to fix this in the resolver or parser? The field name is known before it is parsed.

        Show
        Doug Cutting added a comment - This is has been true since JSON input was first added in AVRO-50 . Thiru mentions it there in a comment ( http://s.apache.org/E2N ). Thiru, is it possible to fix this in the resolver or parser? The field name is known before it is parsed.
        Hide
        Tom White added a comment -

        Here's a failing test which demonstrates the problem.

        Show
        Tom White added a comment - Here's a failing test which demonstrates the problem.

          People

          • Assignee:
            Thiruvalluvan M. G.
            Reporter:
            Tom White
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development