AVRO-669: Avro Mapreduce Doesn't Work With Reflect Schemas


    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels:


      I'm trying to get the Avro trunk code (from Subversion) to work with a simple example of a reflection-defined schema, using a class I created. I use a ReflectDatumWriter to write a set of records to a file, e.g.,

      DatumWriter<Record> writer = new ReflectDatumWriter<Record>(Record.class);
      DataFileWriter<Record> file = new DataFileWriter<Record>(writer);

      However, when I try to read that data in using an AvroMapper it fails with an exception as shown below. It turns out that the mapreduce implementation hard-codes a dependence on SpecificDatum readers and writers.

      I've tested switching to use ReflectDatum instead in five places to try to get it to work for an end-to-end reflect data example:
      AvroSerialization (getDeserializer and getSerializer)

      However, switching to use reflection for AvroKeyComparator doesn't work:
      at org.apache.avro.reflect.ReflectData.compare(ReflectData.java:427)
      at org.apache.avro.mapred.AvroKeyComparator.compare(AvroKeyComparator.java:46)

      It should be possible to implement compare on reflect data, just like GenericData's implementation but using the field name instead (or, better yet, a cached java.lang.reflect.Field).
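A minimal sketch of that idea, under stated assumptions: it is not the patch attached to this issue and it uses no Avro classes. The class `ReflectCompareSketch`, its `compare` method, and the `Point` record are hypothetical names, and the `fieldOrder` argument stands in for the field order a real implementation would take from the schema. Each `java.lang.reflect.Field` is looked up once per class and then reused from a cache.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of Avro): orders plain objects field by field. */
final class ReflectCompareSketch {

  // Per-class cache of Field handles, so the reflective lookup happens only once.
  private static final Map<Class<?>, Map<String, Field>> CACHE = new ConcurrentHashMap<>();

  private static Field field(Class<?> c, String name) {
    Map<String, Field> byName = CACHE.computeIfAbsent(c, k -> new ConcurrentHashMap<>());
    return byName.computeIfAbsent(name, n -> {
      try {
        Field f = c.getDeclaredField(n);
        f.setAccessible(true); // allow access to non-public fields
        return f;
      } catch (NoSuchFieldException e) {
        throw new IllegalArgumentException("No field " + n + " on " + c, e);
      }
    });
  }

  /**
   * Compares a and b on each named field in turn and returns the first
   * non-zero result. Assumes both objects have the same class and that
   * every named field holds a non-null Comparable value.
   */
  @SuppressWarnings({"unchecked", "rawtypes"})
  static int compare(Object a, Object b, String... fieldOrder) {
    for (String name : fieldOrder) {
      Field f = field(a.getClass(), name);
      try {
        int c = ((Comparable) f.get(a)).compareTo(f.get(b));
        if (c != 0) {
          return c;
        }
      } catch (IllegalAccessException e) {
        throw new IllegalStateException(e);
      }
    }
    return 0;
  }

  /** Example record class, standing in for a user's reflect-defined type. */
  static final class Point {
    int x;
    String label;

    Point(int x, String label) {
      this.x = x;
      this.label = label;
    }
  }

  public static void main(String[] args) {
    Point p = new Point(1, "b");
    Point q = new Point(1, "c");
    // x ties, so the ordering is decided by label.
    System.out.println(compare(p, q, "x", "label") < 0); // prints "true"
  }
}
```

A real implementation would also need to handle nested records, nulls, unions, and fields whose schema marks them as ignored for ordering, none of which this sketch attempts.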

      Original exception:
      java.lang.ClassCastException: tba.mr.sample.avro.Record cannot be cast to org.apache.avro.generic.IndexedRecord
      at org.apache.avro.generic.GenericDatumReader.setField(GenericDatumReader.java:152)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:142)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:114)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:105)
      at org.apache.avro.file.DataFileStream.next(DataFileStream.java:198)
      at org.apache.avro.mapred.AvroRecordReader.next(AvroRecordReader.java:63)
      at org.apache.avro.mapred.AvroRecordReader.next(AvroRecordReader.java:33)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)

      Attachments:
      1. AVRO-669.patch (20 kB, Doug Cutting)
      2. AVRO-669.patch (19 kB, Doug Cutting)
      3. AVRO-669.patch.2 (5 kB, Ron Bodkin)
      4. AVRO-669.patch (15 kB, Doug Cutting)



          Author             Field        Original Value             New Value
          Jeff Hammerbacher  Link                                    This issue is related to AVRO-638 [ AVRO-638 ]
          Doug Cutting       Status       Resolved [ 5 ]             Closed [ 6 ]
          Scott Carey        Component/s                             java [ 12312780 ]
          Doug Cutting       Status       Patch Available [ 10002 ]  Resolved [ 5 ]
          Doug Cutting       Resolution                              Fixed [ 1 ]
          Doug Cutting       Attachment                              AVRO-669.patch [ 12466402 ]
          Doug Cutting       Status       Open [ 1 ]                 Patch Available [ 10002 ]
          Doug Cutting       Attachment                              AVRO-669.patch [ 12466268 ]
          Ron Bodkin         Attachment                              AVRO-669.patch.2 [ 12454921 ]
          Doug Cutting       Assignee                                Doug Cutting [ cutting ]
          Doug Cutting       Attachment                              AVRO-669.patch [ 12454892 ]
          Ron Bodkin created issue -


            • Assignee:
              Doug Cutting
            • Reporter:
              Ron Bodkin
            • Votes:
              1
            • Watchers:
              3


              • Created: