AVRO-793

Strange problem when reading an Avro record with a subset of the schema

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.1
    • Component/s: java
    • Environment:

      Avro 1.5, Windows XP / Ubuntu 10.04

    • Hadoop Flags:
      Reviewed

      Description

      Hi, all. I hit a strange problem when reading an Avro file with a subset of the schema it was written with (I do not need all the details).
      1. I write data using this schema:
      {
        "name": "relation",
        "type": "record",
        "fields": [
          { "name": "timestamp", "type": "long" },
          {
            "name": "type",
            "type": {
              "type": "map",
              "values": {
                "type": "array",
                "items": {
                  "type": "record",
                  "name": "sdf",
                  "fields": [
                    { "name": "device", "type": "string" },
                    {
                      "name": "children",
                      "type": { "type": "array", "items": "string" }
                    }
                  ]
                }
              }
            }
          }
        ]
      }

      2. Here is a JSON object that conforms to that schema:
      {
        "timestamp": 1234567890,
        "type": {
          "WMA": [
            { "device": "WMA1", "children": ["WMB1", "WMB2"] },
            { "device": "WMA2", "children": ["WMB1", "WMB2"] }
          ]
        }
      }
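
      For reference, here is a minimal sketch of how one "sdf" item from the object above is laid out on the wire, assuming the binary encoding described in the Avro specification (zig-zag varint lengths, fields written in writer-schema order, arrays written as counted blocks ending with a zero count). The class and method names are hypothetical and this is not Avro's own encoder; it only illustrates that "device" is serialized before "children".

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Avro): hand-encodes one "sdf" record. */
final class SdfEncodingSketch {

    // Avro int/long: zig-zag encode, then emit 7-bit groups, low-order group first.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string: byte length as a long, followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Avro array: one block (item count, then the items), then a zero count as terminator.
    static void writeStringArray(ByteArrayOutputStream out, List<String> items) {
        if (!items.isEmpty()) {
            writeLong(out, items.size());
            for (String item : items) {
                writeString(out, item);
            }
        }
        writeLong(out, 0);
    }

    // Record fields are written back to back in schema order: device, then children.
    static byte[] encodeSdf(String device, List<String> children) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, device);
        writeStringArray(out, children);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeSdf("WMA1", Arrays.asList("WMB1", "WMB2"));
        System.out.println(bytes.length + " bytes: " + Arrays.toString(bytes));
    }
}
```

      Because nothing in the byte stream marks field boundaries, a reader that wants only one of the two fields still has to walk over the other one using the writer schema.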

      3. I write that record successfully, and reading works if I use this schema:
      {
        "name": "relation",
        "type": "record",
        "fields": [
          { "name": "timestamp", "type": "long" },
          {
            "name": "type",
            "type": {
              "type": "map",
              "values": {
                "type": "array",
                "items": {
                  "type": "record",
                  "name": "sdf",
                  "fields": [
                    {
                      "name": "children",
                      "type": { "type": "array", "items": "string" }
                    }
                  ]
                }
              }
            }
          }
        ]
      }

      The result is:
      {
        "timestamp": 1234567890,
        "type": {
          "WMA": [
            { "children": ["WMB1", "WMB2"] },
            { "children": ["WMB1", "WMB2"] }
          ]
        }
      }

      4. But if I want to ignore the "children" part instead of "device", I use this schema for reading:
      {
        "name": "relation",
        "type": "record",
        "fields": [
          { "name": "timestamp", "type": "long" },
          {
            "name": "type",
            "type": {
              "type": "map",
              "values": {
                "type": "array",
                "items": {
                  "type": "record",
                  "name": "sdf",
                  "fields": [
                    { "name": "device", "type": "string" }
                  ]
                }
              }
            }
          }
        ]
      }

      Unfortunately, I get an exception:

      java.lang.ArrayIndexOutOfBoundsException: -8
      cause:java.lang.ArrayIndexOutOfBoundsException
      at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
      at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
      at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
      at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
      at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
      at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
      at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
      at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
      at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
      at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
      at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
      at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
      at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
      at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
      at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
      at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
      at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
      at AvroUtilTest.read(AvroUtilTest.java:77)
      at AvroUtilTest.main(AvroUtilTest.java:61)
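
      The trace above fails on the skip path (skipArray, then skipString, then readInt), which is the path taken when the reader schema omits "children". As a rough illustration of what that path has to do, here is a stand-alone sketch that reads "device" and skips the "children" array, assuming the binary layout from the Avro specification. The class and method names are hypothetical; this is not the code in BinaryDecoder or SkipParser, and it does not reproduce or explain the bug itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Avro): reads "device" and skips "children". */
final class SdfSkipSketch {

    // Two "sdf" records back to back: {WMA1, [WMB1, WMB2]} and {WMA2, []}.
    static final byte[] SAMPLE = {
        8, 'W', 'M', 'A', '1', 4, 8, 'W', 'M', 'B', '1', 8, 'W', 'M', 'B', '2', 0,
        8, 'W', 'M', 'A', '2', 0
    };

    // Avro int/long: 7-bit groups, low-order first, then undo the zig-zag mapping.
    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Skipping a string still requires decoding its length prefix.
    static void skipString(ByteBuffer in) {
        int len = (int) readLong(in);
        in.position(in.position() + len);
    }

    // Arrays are a series of blocks; a zero count ends the array.
    static void skipStringArray(ByteBuffer in) {
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                // Negative count: the block's byte size follows, so jump over it.
                long byteSize = readLong(in);
                in.position(in.position() + (int) byteSize);
            } else {
                for (long i = 0; i < count; i++) {
                    skipString(in);
                }
            }
        }
    }

    // Projection to { device }: read the first field, skip the second.
    static String readDeviceOnly(ByteBuffer in) {
        String device = readString(in);
        skipStringArray(in);
        return device;
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.wrap(SAMPLE);
        System.out.println(readDeviceOnly(in)); // prints WMA1
        System.out.println(readDeviceOnly(in)); // prints WMA2
    }
}
```

      If the skip step consumes too few or too many bytes, the next length prefix is read from the wrong offset, so every later value in the stream is misinterpreted.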

      I tried what Scott Carey suggested (quoted below) and it worked. How can this bug be fixed?
      Scott Carey:
      2: If you change the schema you write with by reversing the order of the fields of "sdf" (array, then string), are the results the same?

      1. AVRO-793.patch
        0.5 kB
        Thiruvalluvan M. G.
      2. AVRO-793-test.patch
        1 kB
        Thiruvalluvan M. G.

          People

          • Assignee:
            Thiruvalluvan M. G.
            Reporter:
            Yingzhong Xu

              Time Tracking

              Estimated: 24h
              Remaining: 24h
              Logged: Not Specified
