Apache Avro / AVRO-2438

SpecificData.deepCopy() cannot be used with URI fields

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.0, 1.8.2
    • Fix Version/s: 1.10.0
    • Component/s: java
    • Labels: None

    Description

      A field with a schema fragment like this:

      {
      "name": "ownerId",
      "type": [
        "null",
        {
          "type": "string",
          "java-class": "java.net.URI"
        }
      ],
      "default": null
      }

      can be deserialized correctly into a generated POJO containing

      @org.apache.avro.specific.AvroGenerated
      public class MyAvroDataObject extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
      ...
      @Deprecated public java.net.URI ownerId;

      because GenericDatumReader.readString(Object, Schema, Decoder) looks up the schema in the stringClassCache, which contains the entry

      {"type":"string","java-class":"java.net.URI"}=class java.net.URI

      and then uses the URI class itself to rehydrate the value via newInstanceFromString.
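The read path described above can be pictured with a small standalone sketch. It does not use Avro: the class `StringClassRehydration` and its method `rehydrate` are hypothetical names, and building the instance through a single-String constructor is an assumption about how such a rehydration step can work, not a copy of GenericDatumReader.

```java
import java.lang.reflect.Constructor;

class StringClassRehydration {
    // Hypothetical stand-in for the reader's lookup: resolve the class named by
    // the "java-class" property and build an instance from the decoded string
    // through that class's single-String constructor.
    static Object rehydrate(String javaClassName, String decoded) {
        try {
            Class<?> target = Class.forName(javaClassName);
            Constructor<?> ctor = target.getDeclaredConstructor(String.class);
            return ctor.newInstance(decoded);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot build " + javaClassName, e);
        }
    }

    public static void main(String[] args) {
        Object value = rehydrate("java.net.URI", "urn:example:owner-42");
        // prints "java.net.URI urn:example:owner-42"
        System.out.println(value.getClass().getName() + " " + value);
    }
}
```

Because the target class comes from the schema property, the same step works for any class that offers a String constructor, not only java.net.URI.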

       

      deepCopy, on the other hand, only considers the schema type of the field. In org.apache.avro.generic.GenericData.deepCopy(Schema, T), the STRING case turns the URI value into an org.apache.avro.util.Utf8, which then causes a ClassCastException when the copy is written to the URI-typed field:

      java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.net.URI
        at com.example.MyAvroDataObject.put(MyAvroDataObject.java:104)
        at org.apache.avro.generic.GenericData.setField(GenericData.java:660)
        at org.apache.avro.generic.GenericData.setField(GenericData.java:677)
        at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1082)
        at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1102)
        at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1080)
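The mechanism behind this stack trace can be modeled without Avro. In the sketch below every name is hypothetical: `Record` stands in for the generated class, `copyAsString` for a copy routine that looks only at the schema type, and `StringBuilder` for Utf8. It is a simplified model of the failure, not the library code.

```java
import java.net.URI;

class DeepCopyCastFailure {
    // Hypothetical stand-in for a generated record with a URI-typed field.
    static class Record {
        URI ownerId;

        void put(Object value) {
            ownerId = (URI) value; // same kind of cast as in a generated put()
        }
    }

    // Simplified model of a copy that only knows the schema type is "string":
    // anything that is not already a String is rebuilt from toString().
    static Object copyAsString(Object value) {
        if (value instanceof String) {
            return value;
        }
        return new StringBuilder(value.toString()); // stands in for Utf8
    }

    // Returns true when writing the copied value into the record fails.
    static boolean copyFails(URI original) {
        Record copy = new Record();
        try {
            copy.put(copyAsString(original));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(copyFails(URI.create("urn:example:owner-42")));
    }
}
```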

       

      The following dirty hack seems to avoid the issue, but it does not consult the stringClassCache, which it should:

      case STRING:
        // Strings are immutable
        if (value instanceof String) {
          return (T)value;
        }
      
      
        // Dirty Harry 9 3/4 start
        // URIs are immutable and are probably modeled as a URI itself
        // TODO: Check with stringClassCache & the schema
        else if ((value instanceof URI)
          && URI.class.getName().equals(schema.getProp("java-class"))
          ) {
          return (T)value;
        }
        // Dirty Harry 9 3/4 end
      
      
        // Some CharSequence subclasses are mutable, so we still need to make
        // a copy
        else if (value instanceof Utf8) {
          // Utf8 copy constructor is more efficient than converting
          // to string and then back to Utf8
          return (T)new Utf8((Utf8)value);
        }
        return (T)new Utf8(value.toString());
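A less URI-specific variant of the hack could honor whatever class the schema names. The sketch below is standalone and makes several assumptions: `JavaClassAwareCopy` and `copyString` are hypothetical names, the `javaClass` parameter models the schema's "java-class" property, and `StringBuilder` again stands in for Utf8. It is one possible direction and not necessarily what the eventual fix does.

```java
class JavaClassAwareCopy {
    // Hypothetical helper: 'javaClass' models schema.getProp("java-class")
    // and is null when the schema does not carry that property.
    static Object copyString(Object value, String javaClass) {
        if (value instanceof String) {
            return value; // Strings are immutable
        }
        if (javaClass != null) {
            try {
                Class<?> target = Class.forName(javaClass);
                // Rebuild from the string form so the copy has the declared
                // type and shares no state with the original.
                return target.getDeclaredConstructor(String.class)
                             .newInstance(value.toString());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot copy as " + javaClass, e);
            }
        }
        // Fallback for plain CharSequence values (stands in for the Utf8 copy).
        return new StringBuilder(value.toString());
    }
}
```

Rebuilding through the String constructor avoids assuming that the custom class is immutable, at the cost of one reflective call per copied value; a real implementation would probably cache the constructor.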
      

       

      I also tried Avro 1.10-SNAPSHOT of 2019-06-20 (commit 2d3b1fe7efd865639663ba785877182e7e038c45) because of https://github.com/apache/avro/pull/329, but the issue remains.

            People

              Assignee: Zezeng Wang (zeshuai007)
              Reporter: Sebastian J. (zeeman)