Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-22088

Dynamic partition insert problem on table with "=" in location path spec

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • 2.3.4, 4.0.0
    • None
    • None
    • None
    • Hive 2.6.0.10-2 Executing on Tez.

      OS: Ubuntu 16.04.4 LTS

      Config settings used:

      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;

    Description

      If external table location spec has a '=' sign (coincidentally partition specifier) in it, then dynamic partition loading fails.

      Use cases:

      Quite often the same data is used in different contexts by creating different external tables on top of the data. Many times the tables have different partition depths depending on how data is organized.

      Like in below example, there are individual customer specific tables and queries/jobs to insert data partitioned by type. And there is another table to give the consolidated data view of all the customers, thus have two level partition customer and type.

      The job to insert customer specific data into customer specific table fails if we use dynamic partitioning. Static partition insert on same table works fine though.

      Replication:

      To replicate following simple setup could be done. Below execution is on 'Tez'.

      Source table-

      CREATE EXTERNAL TABLE temp_dummy_table
      (id STRING, type STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE
      LOCATION '/home/source/';

       

      Destination Table-

      CREATE EXTERNAL TABLE temp_dummy_dest_table
      (id STRING)
      PARTITIONED BY (type string)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE
      LOCATION '/home/destination/customer=abc/';

       

      Insert into destination-

      insert overwrite table temp_dummy_dest_table partition (type)
      select i.id as id, i.type as type
      from temp_dummy_table i
      where i.type in ('type1','type2');

       

      Log and Error Msgs on CLI-

      Loading data to table temp_dummy_dest_table partition (type=null)

      Failed with exception Partition spec {type=type1, customer=abc} contains non-partition columns

      FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

       

      Possible resolution:

      The dynamic partitioning should consider only those partition specs which are under the defined table root/base path. If the path itself has partition style format (customer=abc in above example) then that should not be considered as partition as it is outside the scope of the table.

      Attachments

        1. HIVE-22088.patch.2
          8 kB
          Hui An
        2. HIVE-22088.patch.1
          5 kB
          Hui An

        Activity

          People

            Bone An Hui An
            puneet_k Puneet Khatod
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: