  1. Hive
  2. HIVE-5850

Multiple table join error for avro

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0

    Description

      Steps to reproduce:

      -- Create table Part.
      CREATE EXTERNAL TABLE part
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
      STORED AS
      INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
      LOCATION 'hdfs://<hostname>/user/hadoop/tpc-h/data/part'
      TBLPROPERTIES ('avro.schema.url'='hdfs://<hostname>/user/hadoop/tpc-h/schema/part.avsc');
      
      -- Create table Part Supplier.
      CREATE EXTERNAL TABLE partsupp
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
      STORED AS
      INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
      LOCATION 'hdfs://<hostname>/user/hadoop/tpc-h/data/partsupp'
      TBLPROPERTIES ('avro.schema.url'='hdfs://<hostname>/user/hadoop/tpc-h/schema/partsupp.avsc');
      
      -- Query
      SELECT * FROM partsupp ps JOIN part p ON ps.ps_partkey = p.p_partkey WHERE p.p_partkey = 1;
      
      The query fails with the following error:
      Error: java.io.IOException: java.io.IOException: org.apache.avro.AvroTypeException: Found {
        "type" : "record",
        "name" : "partsupp",
        "namespace" : "com.gs.sdst.pl.avro.tpch",
        "fields" : [ {
          "name" : "ps_partkey",
          "type" : "long"
        }, {
          "name" : "ps_suppkey",
          "type" : "long"
        }, {
          "name" : "ps_availqty",
          "type" : "long"
        }, {
          "name" : "ps_supplycost",
          "type" : "double"
        }, {
          "name" : "ps_comment",
          "type" : "string"
        }, {
          "name" : "systimestamp",
          "type" : "long"
        } ]
      }, expecting {
        "type" : "record",
        "name" : "part",
        "namespace" : "com.gs.sdst.pl.avro.tpch",
        "fields" : [ {
          "name" : "p_partkey",
          "type" : "long"
        }, {
          "name" : "p_name",
          "type" : "string"
        }, {
          "name" : "p_mfgr",
          "type" : "string"
        }, {
          "name" : "p_brand",
          "type" : "string"
        }, {
          "name" : "p_type",
          "type" : "string"
        }, {
          "name" : "p_size",
          "type" : "int"
        }, {
          "name" : "p_container",
          "type" : "string"
        }, {
          "name" : "p_retailprice",
          "type" : "double"
        }, {
          "name" : "p_comment",
          "type" : "string"
        }, {
          "name" : "systimestamp",
          "type" : "long"
        } ]
      }
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
              at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
              at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
              at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
              at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
              at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
              at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
              at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
              at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
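
      A note on reading the message, offered as a sketch rather than a diagnosis: in an Avro "Found ..., expecting ..." error, the first schema is the writer schema stored in the data file and the second is the reader schema handed to the decoder. Avro's schema resolution only matches two record schemas when their unqualified names are equal, so a partsupp file decoded with the part schema is rejected. The helpers below (`unqualified_name`, `records_match`) are hypothetical, written for illustration only, and are not part of Hive or Avro; the schemas are abbreviated from the error above.

```python
# Hypothetical helpers (not Hive or Avro APIs): they mimic the Avro
# schema-resolution rule that two record schemas match only when their
# unqualified names are equal.

def unqualified_name(schema: dict) -> str:
    """Return the record name without any dotted namespace prefix."""
    return schema["name"].rsplit(".", 1)[-1]


def records_match(writer: dict, reader: dict) -> bool:
    """True when both schemas are records with the same unqualified name."""
    return (
        writer.get("type") == "record"
        and reader.get("type") == "record"
        and unqualified_name(writer) == unqualified_name(reader)
    )


# Schemas abbreviated from the error message (field lists trimmed).
found = {
    "type": "record",
    "name": "partsupp",
    "namespace": "com.gs.sdst.pl.avro.tpch",
    "fields": [{"name": "ps_partkey", "type": "long"}],
}
expecting = {
    "type": "record",
    "name": "part",
    "namespace": "com.gs.sdst.pl.avro.tpch",
    "fields": [{"name": "p_partkey", "type": "long"}],
}

if __name__ == "__main__":
    # The writer schema (partsupp) does not resolve against the reader
    # schema (part), which is the situation the exception reports.
    print("partsupp vs part resolve:", records_match(found, expecting))
    print("partsupp vs partsupp resolve:", records_match(found, found))
```

      Read this way, the trace suggests that a split belonging to partsupp was decoded with the schema of part. Why the reader picked that schema is not established by this report.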
      

      Attachments

        1. schema.tar.gz
          0.4 kB
          Shengjun Xin
        2. partsupp.tar.gz
          5.02 MB
          Shengjun Xin
        3. part.tar.gz
          4.12 MB
          Shengjun Xin

          People

            Assignee: Unassigned
            Reporter: Shengjun Xin (xinshengjun)
            Votes: 2
            Watchers: 7