Avro
  1. Avro
  2. AVRO-1201

GenericRecord.toString can produce invalid JSON

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2
    • Fix Version/s: 1.7.3
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The GenericData.toString method can produce invalid JSON. The problem is that enums are printed without quotes, as in

      {"emum_field":a}

      instead of

      {"enum_field":"a"}
      1. vcs-diff4833874908267370412.patch
        17 kB
        Sharmarke Aden
      2. avro-1201.patch
        2 kB
        Sharmarke Aden
      3. avro-1201.patch
        0.7 kB
        Sharmarke Aden
      4. Avro-1201-TestCase.tar.gz
        2 kB
        Sharmarke Aden

        Issue Links

          Activity

          Hide
          Sharmarke Aden added a comment -

          GenericData toString() method doesn't check to see if "datum instanceof Enum" and appropriately handle enums. This results in an invalid json being produced similar to the one seen in AVRO-713.

          Show
          Sharmarke Aden added a comment - GenericData toString() method doesn't check to see if "datum instanceof Enum" and appropriately handle enums. This results in an invalid json being produced similar to the one seen in AVRO-713 .
          Hide
          Sharmarke Aden added a comment -

          Here is a test case that demonstrates the issue.

          Show
          Sharmarke Aden added a comment - Here is a test case that demonstrates the issue.
          Hide
          Sharmarke Aden added a comment -

          Here is a patch for GenericData.java class that resolves the issue. I am not sure how you test patch without using the maven plugin and generating beans.

          Show
          Sharmarke Aden added a comment - Here is a patch for GenericData.java class that resolves the issue. I am not sure how you test patch without using the maven plugin and generating beans.
          Hide
          Doug Cutting added a comment -

          I wonder if instead we should change this test to:

           isString(datum) || isEnum(datum)
          

          Similarly, we should probably also change this method to use isFixed(), isMap(), isArray(), isBytes(), etc.

          Show
          Doug Cutting added a comment - I wonder if instead we should change this test to: isString(datum) || isEnum(datum) Similarly, we should probably also change this method to use isFixed(), isMap(), isArray(), isBytes(), etc.
          Hide
          Sharmarke Aden added a comment -

          That works even better.

          Show
          Sharmarke Aden added a comment - That works even better.
          Hide
          Sharmarke Aden added a comment -

          Here's an updated patch that contains changes suggested by Doug.

          Show
          Sharmarke Aden added a comment - Here's an updated patch that contains changes suggested by Doug.
          Hide
          Sharmarke Aden added a comment -

          I am wondering if this patch is acceptable and will get committed.

          Show
          Sharmarke Aden added a comment - I am wondering if this patch is acceptable and will get committed.
          Hide
          Doug Cutting added a comment -

          Sorry, I lost track of this one.

          Some comments on the patch:

          • The toString() method should probably use isX() methods througout, so isMap(), isFixed(), isArray(), isString(), and isEnum().
          • I don't think we should change the definitions of isEnum() in GenericData and SpecificData. If you're calling toString() on data that might include java.lang.Enum instances then you should use SpecificData or ReflectData, not GenericData.
          • We should add a test for the correct toString() of enums.
          Show
          Doug Cutting added a comment - Sorry, I lost track of this one. Some comments on the patch: The toString() method should probably use isX() methods througout, so isMap(), isFixed(), isArray(), isString(), and isEnum(). I don't think we should change the definitions of isEnum() in GenericData and SpecificData. If you're calling toString() on data that might include java.lang.Enum instances then you should use SpecificData or ReflectData, not GenericData. We should add a test for the correct toString() of enums.
          Hide
          Sharmarke Aden added a comment -

          Here is an updated patch that contains Doug's recommendations. I also took the liberty to change StringBuffer reference to StringBuiler and add Overridea annotations

          Show
          Sharmarke Aden added a comment - Here is an updated patch that contains Doug's recommendations. I also took the liberty to change StringBuffer reference to StringBuiler and add Overridea annotations
          Hide
          Doug Cutting added a comment -

          I committed this.

          Show
          Doug Cutting added a comment - I committed this.

            People

            • Assignee:
              Sharmarke Aden
              Reporter:
              Sharmarke Aden
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development