Avro
  1. Avro
  2. AVRO-1240

SpecificDataumReader.next() can return a GenericData.Record resulting in a ClassCastException

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.3
    • Fix Version/s: 1.7.4
    • Component/s: java
    • Labels:
      None

      Description

      The following reasonable code:

          DatumReader<Test> reader = new SpecificDatumReader<Test>(Test.class); 
          DataFileReader<Test> dataFileReader = new DataFileReader<Test>(file, reader); 
          Test datum = null;
          while(dataFileReader.hasNext()) { 
            datum = dataFileReader.next(datum); 
            System.out.println(datum); 
          }
      

      Can result in a runtime exception:

      $ java -cp avro-1.7.3.jar:$(hadoop classpath) org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file 
      Exception in thread "main" java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to Test
      	at Sample.main(Sample.java:27)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
      

      when the jar containing the specific class (Test in my example) is loaded from a different classloader than the avro jar itself. This occurs when run via the hadoop jar command but could occur in web containers or other locations as well.

      1. AVRO-1240.patch
        3 kB
        Doug Cutting

        Issue Links

          Activity

          Hide
          Brock Noland added a comment -

          I have found three work arounds for this issue.

          Change

              Test datum = null;
          

          to

              Test datum = new Test();
          

          Change

              DatumReader<Test> reader = new SpecificDatumReader<Test>(Test.class); 
          

          to:

              DatumReader<Test> reader = new SpecificDatumReader<Test>(Test.SCHEMA$, Test.SCHEMA$, new SpecificData(Test.class.getClassLoader())); 
          

          Put the jar containing the specific class in the classpath of the main JVM
          That is change:

            $ java -cp avro-1.7.3.jar:$(hadoop classpath) org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file 
          

          to

             $ java -cp avro-1.7.3.jar:$(hadoop classpath):target/avro-example-1.0-jar-with-dependencies.jar org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file 
          

          or do the equivalent with the HADOOP_CLASSPATH variable.

          Show
          Brock Noland added a comment - I have found three work arounds for this issue. Change Test datum = null; to Test datum = new Test(); Change DatumReader<Test> reader = new SpecificDatumReader<Test>(Test.class); to: DatumReader<Test> reader = new SpecificDatumReader<Test>(Test.SCHEMA$, Test.SCHEMA$, new SpecificData(Test.class.getClassLoader())); Put the jar containing the specific class in the classpath of the main JVM That is change: $ java -cp avro-1.7.3.jar:$(hadoop classpath) org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file to $ java -cp avro-1.7.3.jar:$(hadoop classpath):target/avro-example-1.0-jar-with-dependencies.jar org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file or do the equivalent with the HADOOP_CLASSPATH variable.
          Hide
          Doug Cutting added a comment -

          Here's a patch that fixes the SpecificDatumReader(Class) constructor to use that class's ClassLoader.

          Show
          Doug Cutting added a comment - Here's a patch that fixes the SpecificDatumReader(Class) constructor to use that class's ClassLoader.
          Hide
          Brock Noland added a comment -

          Doug,

          Yes that the same approach I was considering and fixes the issue locally (below). I do have one additional item to discuss. The Specific* readers are generics and as such, the user expects that the next() method will never return anything but the Specific class in question. Furthermore, if using the Specific* it's likely a ClassCastException is going to be returned by returning anything but the Specific class in question.

          As such, inline with the fail fast and loudly principle, I wonder if instead of punting up to the Generic*, a runtime exception with a good error message should be thrown when the class for a schema cannot be found. I am not overly familiar with the Avro source, so it's possible this would break something badly. If the change is safe, then at least in this case, it would have made for a much better user experience.

          $ java -cp avro-1.7.4-SNAPSHOT.jar:$(hadoop classpath) org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file
          {"left": "l", "right": "r"}
          
          Show
          Brock Noland added a comment - Doug, Yes that the same approach I was considering and fixes the issue locally (below). I do have one additional item to discuss. The Specific* readers are generics and as such, the user expects that the next() method will never return anything but the Specific class in question. Furthermore, if using the Specific* it's likely a ClassCastException is going to be returned by returning anything but the Specific class in question. As such, inline with the fail fast and loudly principle, I wonder if instead of punting up to the Generic*, a runtime exception with a good error message should be thrown when the class for a schema cannot be found. I am not overly familiar with the Avro source, so it's possible this would break something badly. If the change is safe, then at least in this case, it would have made for a much better user experience. $ java -cp avro-1.7.4-SNAPSHOT.jar:$(hadoop classpath) org.apache.hadoop.util.RunJar target/avro-example-1.0-jar-with-dependencies.jar Sample target/test-file {"left": "l", "right": "r"}
          Hide
          Doug Cutting added a comment -

          There are cases where folks nest generic within specific & reflect & vice-versa. One example is Pair<K,V> which can contain reflect, specific or generic instances. There's also a test case in TestReflect called testReflectWithinGeneric. So we can't change that without breaking compatibility.

          Show
          Doug Cutting added a comment - There are cases where folks nest generic within specific & reflect & vice-versa. One example is Pair<K,V> which can contain reflect, specific or generic instances. There's also a test case in TestReflect called testReflectWithinGeneric. So we can't change that without breaking compatibility.
          Hide
          Brock Noland added a comment -

          OK cool. The patch is a +1 from my perspective.

          Show
          Brock Noland added a comment - OK cool. The patch is a +1 from my perspective.
          Hide
          Brock Noland added a comment -

          Assigning to Doug as he wrote the patch attached to this JIRA.

          Show
          Brock Noland added a comment - Assigning to Doug as he wrote the patch attached to this JIRA.
          Hide
          Doug Cutting added a comment -

          I committed this.

          Show
          Doug Cutting added a comment - I committed this.
          Hide
          Hudson added a comment -

          Integrated in AvroJava #343 (See https://builds.apache.org/job/AvroJava/343/)
          AVRO-1240. Java: Fix SpecificDatumReader(Class) constructor to use correct ClassLoader. (Revision 1442612)

          Result = SUCCESS
          cutting :
          Files :

          • /avro/trunk/CHANGES.txt
          • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java
          • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectDatumReader.java
          • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java
          Show
          Hudson added a comment - Integrated in AvroJava #343 (See https://builds.apache.org/job/AvroJava/343/ ) AVRO-1240 . Java: Fix SpecificDatumReader(Class) constructor to use correct ClassLoader. (Revision 1442612) Result = SUCCESS cutting : Files : /avro/trunk/CHANGES.txt /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectDatumReader.java /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java

            People

            • Assignee:
              Doug Cutting
              Reporter:
              Brock Noland
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development