HIVE-10980: Merge of dynamic partitions loads all data to default partition

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 2.0.0
    • Component/s: Hive
    • Labels: None
    • Environment: HDP 2.2.4 (also reproduced on Apache Hive built from trunk)

    Description

      Conditions that lead to the issue:
      1. Execution engine set to MapReduce
      2. Partition columns have different types
      3. Both static and dynamic partitions are used in the query
      4. Dynamically generated partitions require merge

      Result: All data is loaded into the "__HIVE_DEFAULT_PARTITION__" partition instead of the partitions named by the dynamic column values.

      Steps to reproduce:
      set hive.exec.dynamic.partition=true;
      set hive.exec.dynamic.partition.mode=strict;
      set hive.optimize.sort.dynamic.partition=false;
      set hive.merge.mapfiles=true;
      set hive.merge.mapredfiles=true;
      set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
      set hive.execution.engine=mr;

      create external table sdp (
      dataint bigint,
      hour int,
      req string,
      cid string,
      caid string
      )
      row format delimited
      fields terminated by ',';

      load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
      load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
      ...
      load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;

      create table tdp (cid string, caid string)
      partitioned by (dataint bigint, hour int, req string);

      insert overwrite table tdp partition (dataint=20150316, hour=16, req)
      select cid, caid, req from sdp where dataint=20150316 and hour=16;

      select * from tdp order by caid;
      show partitions tdp;

      Example of the input file:
      20150316,16,reqA,clusterIdA,cacheId1
      20150316,16,reqB,clusterIdB,cacheId2
      20150316,16,reqA,clusterIdC,cacheId3
      20150316,16,reqD,clusterIdD,cacheId4
      20150316,16,reqA,clusterIdA,cacheId5

      Actual result:
      clusterIdA cacheId1 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdA cacheId1 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdB cacheId2 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdC cacheId3 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdD cacheId4 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdA cacheId5 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdD cacheId8 20150316 16 __HIVE_DEFAULT_PARTITION__
      clusterIdB cacheId9 20150316 16 __HIVE_DEFAULT_PARTITION__
      dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__
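
      One way to check whether condition 4 (the merge step) is what triggers the problem is to rerun the insert with file merging turned off. The following is a minimal sketch that reuses the tables and settings from the steps above. It is an assumption, not something confirmed in this report, that disabling the merge avoids the default partition. If it does, show partitions should list one partition per distinct req value rather than a single default partition.

      ```sql
      -- Assumption: turning off both merge settings removes condition 4,
      -- so no merge job runs after the insert.
      set hive.merge.mapfiles=false;
      set hive.merge.mapredfiles=false;

      -- Same insert as in the steps to reproduce.
      insert overwrite table tdp partition (dataint=20150316, hour=16, req)
      select cid, caid, req from sdp where dataint=20150316 and hour=16;

      -- Count rows per dynamic partition value; a single default-partition
      -- row here would mean the problem persists.
      select req, count(*) from tdp group by req;
      show partitions tdp;
      ```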

      Attachments

        1. HIVE-10980.patch
          28 kB
          Illya Yalovyy

          People

            Assignee: Illya Yalovyy (yalovyyi)
            Reporter: Illya Yalovyy (yalovyyi)