Hive / HIVE-21185

INSERT OVERWRITE DIRECTORY ... STORED AS a non-text file format raises an exception when file merging is enabled

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.1.1, 2.3.0
    • Fix Version/s: 3.0.0
    • Component/s: Query Planning
    • Labels: None

    Description

      Steps to reproduce:

       

      -- Create a table backed by many small files (one file per INSERT)
      create table multiple_small_files (id int);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      insert into multiple_small_files values (1);
      
      -- Enable small-file merging for map-only and map-reduce jobs
      set hive.merge.mapfiles=true;
      set hive.merge.mapredfiles=true;
      
      insert overwrite directory '/path/to/hdfs' stored as avro
      select * from multiple_small_files;
      

      This produces an exception like the following:

      Messages for this Task:
      Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$����N���e(��� �$$����N���e(���
          at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
          at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
          at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
          at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
          at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$����N���e(��� �$$����N���e(���
          at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
          at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
          ... 8 more
      Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Expecting a AvroGenericRecordWritable
          at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:139)
          at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:216)
          at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
          at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
          at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488)
          ... 9 more
      FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
      

       

      This issue affects not only the Avro file format but all non-text storage formats. The root cause is that Hive picks the wrong input format in the file merge stage.
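
      Since the failure is reported only with file merging enabled, one possible workaround on affected versions is to switch the two merge settings off for the session before running the statement. This is a sketch under that assumption, not a verified fix: it reuses the table and placeholder path from the reproduction, and the output directory will then contain the unmerged small files.

      ```sql
      -- Assumption: with merging off, the extra file merge stage is not planned,
      -- so the wrong input format is never used to re-read the Avro output.
      set hive.merge.mapfiles=false;
      set hive.merge.mapredfiles=false;

      -- Same statement as in the reproduction; '/path/to/hdfs' is a placeholder.
      insert overwrite directory '/path/to/hdfs' stored as avro
      select * from multiple_small_files;
      ```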

          People

            Assignee: Unassigned
            Reporter: chengkun jia (lfyzjck)
            Votes: 0
            Watchers: 2
