AVRO-780: union handling in ReflectDatumWriter

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.1
    • Component/s: java
    • Labels:
      None
    • Environment:

      Mac with VMWare running Linux training-vm 2.6.28-19-server #61-Ubuntu

    • Hadoop Flags:
      Reviewed

      Description

      Our avdl schema definition for the record DeviceRow has a field:

      union {array<DynamicColumn4Games>, null} Games__;

      When we migrated our MR jobs from 1.4.0 to 1.5.0, we got the following messages:

      ===================================================================================================
      11/03/10 11:31:02 INFO mapred.TaskInProgress: Error from attempt_20110310113041953_0001_m_000000_0: java.lang.NullPointerException: in com.ngmoco.hbase.DeviceRow in union null of union in field Games__ of com.ngmoco.hbase.DeviceRow
      at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
      at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
      at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:131)
      at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:114)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
      at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
      at org.apache.avro.mapred.HadoopMapper$MapCollector.collect(HadoopMapper.java:69)
      at com.ngmoco.ngpipes.sourcing.NgActivityGatheringMapper.map(NgActivityGatheringMapper.java:91)
      at com.ngmoco.ngpipes.sourcing.NgActivityGatheringMapper.map(NgActivityGatheringMapper.java:1)
      at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:80)
      at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:34)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)
      Caused by: java.lang.NullPointerException: in union null of union in field Games__ of com.ngmoco.hbase.DeviceRow
      at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:92)
      at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:86)
      at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:102)
      ... 14 more
      ===================================================================================================

      After we changed the definition of the field Games__ to:

      union {null, array<DynamicColumn4Games>} Games__;

      the system stopped complaining.

      1. AVRO-780.patch
        2 kB
        Doug Cutting
      2. AVRO-780.patch
        1 kB
        Doug Cutting
      3. AVRO-780.patch
        0.7 kB
        Doug Cutting

        Activity

        Doug Cutting added a comment -

        I committed this.

        Doug Cutting added a comment -

        Here's a version of the patch that includes a test.

        Doug Cutting added a comment -

        Re-opening. We generally don't resolve an issue as fixed until the patch has been committed. If this patch fixes the problem for you, please just add a comment saying that. I'd still like to add a unit test that triggers this problem before committing the patch.

        Doug Cutting added a comment -

        Thanks! That stack trace shows the problem. Here's a patch that should fix it.

        ey-chih chow added a comment -

        I ran it again with the patch. The stack trace is as follows:

        =========================================================================================
        11/03/11 13:51:28 INFO mapred.TaskInProgress: Error from attempt_20110311135107907_0001_m_000000_0: java.lang.NullPointerException: in com.ngmoco.hbase.DeviceRow in union null of union in field Games__ of com.ngmoco.hbase.DeviceRow
        at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:105)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
        at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:131)
        at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:114)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.avro.mapred.HadoopMapper$MapCollector.collect(HadoopMapper.java:69)
        at com.ngmoco.ngpipes.sourcing.NgActivityGatheringMapper.map(NgActivityGatheringMapper.java:91)
        at com.ngmoco.ngpipes.sourcing.NgActivityGatheringMapper.map(NgActivityGatheringMapper.java:1)
        at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:80)
        at org.apache.avro.mapred.HadoopMapper.map(HadoopMapper.java:34)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
        Caused by: java.lang.NullPointerException
        at org.apache.avro.reflect.ReflectData.isArray(ReflectData.java:109)
        at org.apache.avro.generic.GenericData.instanceOf(GenericData.java:497)
        at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:478)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
        at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:102)
        at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
        at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:102)
        ... 14 more

        =============================================================================================

        Ey-Chih
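
The trace above fails in ReflectData.isArray, reached from GenericData.resolveUnion, which would explain why the order of the union branches matters. The following is a minimal, self-contained sketch of that behaviour, not Avro's actual source: the class UnionResolutionSketch, the Branch enum, and every method in it are hypothetical, and it assumes that branches are tested in declared order and that the array check dereferences the datum without a null guard.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class UnionResolutionSketch {
    // Hypothetical stand-in for the branch types of a union; not Avro's Schema.Type.
    enum Branch { NULL, ARRAY }

    // Assumed pre-fix behaviour: dereferences datum, so a null datum throws NullPointerException.
    static boolean isArrayUnguarded(Object datum) {
        return datum instanceof Collection || datum.getClass().isArray();
    }

    // Guarded variant: a null datum is simply not an array.
    static boolean isArrayGuarded(Object datum) {
        if (datum == null) {
            return false;
        }
        return datum instanceof Collection || datum.getClass().isArray();
    }

    // Returns the index of the first branch, in declared order, that matches the datum.
    static int resolveUnion(List<Branch> union, Object datum, boolean guarded) {
        for (int i = 0; i < union.size(); i++) {
            Branch branch = union.get(i);
            if (branch == Branch.NULL) {
                if (datum == null) {
                    return i;
                }
            } else if (branch == Branch.ARRAY) {
                boolean match = guarded ? isArrayGuarded(datum) : isArrayUnguarded(datum);
                if (match) {
                    return i;
                }
            }
        }
        throw new IllegalArgumentException("Not in union: " + datum);
    }

    public static void main(String[] args) {
        List<Branch> arrayFirst = Arrays.asList(Branch.ARRAY, Branch.NULL);
        List<Branch> nullFirst = Arrays.asList(Branch.NULL, Branch.ARRAY);

        // Null branch declared first: the array check is never reached for a null datum.
        System.out.println("null first, unguarded: " + resolveUnion(nullFirst, null, false));

        // Array branch declared first: the unguarded check is reached with a null datum.
        try {
            resolveUnion(arrayFirst, null, false);
        } catch (NullPointerException e) {
            System.out.println("array first, unguarded: NullPointerException");
        }

        // With the guard, the declared order no longer matters for a null datum.
        System.out.println("array first, guarded: " + resolveUnion(arrayFirst, null, true));
    }
}
```

Under these assumptions, moving null to the front of the union avoids the failure only because the null branch matches before the array check runs; a null guard in the array check removes the dependence on branch order.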

        Doug Cutting added a comment -

        It's hard to diagnose this, as the stack trace unfortunately does not show where the original NullPointerException occurred. Can you please try again with the attached patch, which should provide a more informative stack trace? Thanks!


          People

          • Assignee:
            Doug Cutting
            Reporter:
            ey-chih chow


              Time Tracking

              Estimated: 72h
              Remaining: 72h
              Logged: Not Specified
