Avro
  1. Avro
  2. AVRO-247

Broken test: org.apache.avro.TestDataFile$InteropTest.testGeneratedReflect()

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The current trunk has a broken java-interop test.

      1. AVRO-247.patch
        10 kB
        Doug Cutting

        Issue Links

          Activity

          Hide
          Philip Zeyliger added a comment -

          "ant test-interop-java" triggers it; here's the relevant exception:

          $cat build/test/TEST*Interop*
          Testsuite: org.apache.avro.TestDataFile$InteropTest
          Tests run: 3, Failures: 0, Errors: 1, Time elapsed: 0.461 sec
          ------------- Standard Output ---------------
          Reading test.java.avro
          Reading test.py.avro
          Reading test.java.avro
          Reading test.py.avro
          Reading test.java.avro
          ------------- ---------------- ---------------
          
          Testcase: testGeneratedGeneric took 0.274 sec
          Testcase: testGeneratedSpecific took 0.128 sec
          Testcase: testGeneratedReflect took 0.03 sec
              Caused an ERROR
          Can not set org.apache.avro.util.Utf8 field org.apache.avro.Interop.stringField to java.lang.String
          java.lang.IllegalArgumentException: Can not set org.apache.avro.util.Utf8 field org.apache.avro.Interop.stringField to java.lang.String
              at java.lang.reflect.Field.set(Field.java:657)
              at org.apache.avro.reflect.ReflectDatumReader.addField(ReflectDatumReader.java:53)
              at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:154)
              at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:72)
              at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:61)
              at org.apache.avro.file.DataFileReader.next(DataFileReader.java:143)
              at org.apache.avro.TestDataFile.readFile(TestDataFile.java:133)
              at org.apache.avro.TestDataFile$InteropTest.readFiles(TestDataFile.java:171)
              at org.apache.avro.TestDataFile$InteropTest.testGeneratedReflect(TestDataFile.java:165)
          
          Show
          Philip Zeyliger added a comment - "ant test-interop-java" triggers it; here's the relevant exception: $cat build/test/TEST*Interop* Testsuite: org.apache.avro.TestDataFile$InteropTest Tests run: 3, Failures: 0, Errors: 1, Time elapsed: 0.461 sec ------------- Standard Output --------------- Reading test.java.avro Reading test.py.avro Reading test.java.avro Reading test.py.avro Reading test.java.avro ------------- ---------------- --------------- Testcase: testGeneratedGeneric took 0.274 sec Testcase: testGeneratedSpecific took 0.128 sec Testcase: testGeneratedReflect took 0.03 sec Caused an ERROR Can not set org.apache.avro.util.Utf8 field org.apache.avro.Interop.stringField to java.lang.String java.lang.IllegalArgumentException: Can not set org.apache.avro.util.Utf8 field org.apache.avro.Interop.stringField to java.lang.String at java.lang.reflect.Field.set(Field.java:657) at org.apache.avro.reflect.ReflectDatumReader.addField(ReflectDatumReader.java:53) at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:154) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:72) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:61) at org.apache.avro.file.DataFileReader.next(DataFileReader.java:143) at org.apache.avro.TestDataFile.readFile(TestDataFile.java:133) at org.apache.avro.TestDataFile$InteropTest.readFiles(TestDataFile.java:171) at org.apache.avro.TestDataFile$InteropTest.testGeneratedReflect(TestDataFile.java:165)
          Hide
          Doug Cutting added a comment -

          This test is using a schema generated via reflection to process data written with a different, incompatible schema. The reflected schema includes a record named org.apache.avro.util.Utf8 that contains a byte[] field where the schema in the file contains simply "string". Perhaps to fix this we should add an @Stringable annotation used by reflection that indicates that a class should be represented by an Avro string. The toString() method can be used to write instances, and its one-String-argument constructor can be used when reading. Then the specific generator can add this annotation, so specific data can be correctly reflected. Does this make sense?

          Another alternative would be to abandon Utf8 in specific and reflect, but that would compromise performance, since not only can Java strings not be reused, but they force UTF-8 processing on creation rather than permitting it to be delayed until a Java String is needed.

          Show
          Doug Cutting added a comment - This test is using a schema generated via reflection to process data written with a different, incompatible schema. The reflected schema includes a record named org.apache.avro.util.Utf8 that contains a byte[] field where the schema in the file contains simply "string". Perhaps to fix this we should add an @Stringable annotation used by reflection that indicates that a class should be represented by an Avro string. The toString() method can be used to write instances, and its one-String-argument constructor can be used when reading. Then the specific generator can add this annotation, so specific data can be correctly reflected. Does this make sense? Another alternative would be to abandon Utf8 in specific and reflect, but that would compromise performance, since not only can Java strings not be reused, but they force UTF-8 processing on creation rather than permitting it to be delayed until a Java String is needed.
          Hide
          Doug Cutting added a comment -

          Note that reflection would need AVRO-242 to annotate the schema with the class that implements the string, so that it knows what to create.

          Show
          Doug Cutting added a comment - Note that reflection would need AVRO-242 to annotate the schema with the class that implements the string, so that it knows what to create.
          Hide
          Philip Zeyliger added a comment -

          It seems that the same way that we special case reflection to stop at java.lang.*, we should perhaps special case it to stop at org.apache.avro.util.Utf8? Alternately, wouldn't the @Stringable annotation go on the UTF8 class?

          It does seem desirable that generated code is able to work with the reflection API.

          Show
          Philip Zeyliger added a comment - It seems that the same way that we special case reflection to stop at java.lang.*, we should perhaps special case it to stop at org.apache.avro.util.Utf8? Alternately, wouldn't the @Stringable annotation go on the UTF8 class? It does seem desirable that generated code is able to work with the reflection API.
          Hide
          Doug Cutting added a comment -

          > wouldn't the @Stringable annotation go on the UTF8 class?

          Oops. You're right. The specific code generator would not need to be changed.

          > we should perhaps special case it to stop at org.apache.avro.util.Utf8

          The problem with having it stop is that you may not read or write critical fields. For exceptions in reflected protocols this may be okay, since they have null constructors and mostly don't contain critical data. But for, e.g., org.apache.hadoop.io.Path, we currently skip its nested java.lang.URI, which is a bug. Rather we should annotate Path with @Stringable.

          Show
          Doug Cutting added a comment - > wouldn't the @Stringable annotation go on the UTF8 class? Oops. You're right. The specific code generator would not need to be changed. > we should perhaps special case it to stop at org.apache.avro.util.Utf8 The problem with having it stop is that you may not read or write critical fields. For exceptions in reflected protocols this may be okay, since they have null constructors and mostly don't contain critical data. But for, e.g., org.apache.hadoop.io.Path, we currently skip its nested java.lang.URI, which is a bug. Rather we should annotate Path with @Stringable.
          Hide
          Philip Zeyliger added a comment -

          > Rather we should annotate Path with @Stringable.

          Agreed. The most general case will be an annotation like @AvroConverter(SomeConverter.class). Besides String, using something for Fixed(4) is also going to crop up. (Checksums are fixed64, whereas a lot of other things are varint.)

          Show
          Philip Zeyliger added a comment - > Rather we should annotate Path with @Stringable. Agreed. The most general case will be an annotation like @AvroConverter(SomeConverter.class). Besides String, using something for Fixed(4) is also going to crop up. (Checksums are fixed64, whereas a lot of other things are varint.)
          Hide
          Doug Cutting added a comment -

          Here's a patch that defines a Stringable annotation and adds a test for it.

          It also fixes the interop test, but unfortunately not by using the Stringable annotation, but rather by commenting out the failing test case. This bug was created when specific and reflect diverged in their representation of Avro strings. Specific uses Utf8 while reflect uses java.lang.String. The generated specific code has a field whose type is Utf8 that we cannot assign a java.lang.String to. We could define a version of Interop.java for reflect, but then we'd need to load it with a different class loader. I don't think this is worthwhile. Testing interoperability of specific and generic with python, c, etc. is sufficient. Thoughts?

          Show
          Doug Cutting added a comment - Here's a patch that defines a Stringable annotation and adds a test for it. It also fixes the interop test, but unfortunately not by using the Stringable annotation, but rather by commenting out the failing test case. This bug was created when specific and reflect diverged in their representation of Avro strings. Specific uses Utf8 while reflect uses java.lang.String. The generated specific code has a field whose type is Utf8 that we cannot assign a java.lang.String to. We could define a version of Interop.java for reflect, but then we'd need to load it with a different class loader. I don't think this is worthwhile. Testing interoperability of specific and generic with python, c, etc. is sufficient. Thoughts?
          Hide
          Philip Zeyliger added a comment -

          +1.

          In testR10, it would be reasonable to check 'assertEquals("...R10", r10schema.getProperty(CLASS_PROP))'.

          c.isAnnotationPresent(Stringable.class)

          For fail-fast, it might be good to check that c has a single string constructor. Obviously, the class reading might be totally different, but seems like this would catch errors in the common case. Up to you.

          Interop.java

          I'm ok with having generated code not work with the reflection interface. You could, again, use annotations to make that work. (@Utf8able on the Utf8 class, so that Avro reflection understands both String and Utf8 might work.) But I'm ok with it just not working.

          Show
          Philip Zeyliger added a comment - +1. In testR10, it would be reasonable to check 'assertEquals("...R10", r10schema.getProperty(CLASS_PROP))'. c.isAnnotationPresent(Stringable.class) For fail-fast, it might be good to check that c has a single string constructor. Obviously, the class reading might be totally different, but seems like this would catch errors in the common case. Up to you. Interop.java I'm ok with having generated code not work with the reflection interface. You could, again, use annotations to make that work. (@Utf8able on the Utf8 class, so that Avro reflection understands both String and Utf8 might work.) But I'm ok with it just not working.
          Hide
          Doug Cutting added a comment -

          I added the test Philip suggested and committed this.

          Show
          Doug Cutting added a comment - I added the test Philip suggested and committed this.

            People

            • Assignee:
              Doug Cutting
              Reporter:
              Philip Zeyliger
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development