Hive / HIVE-27951

HCatalog dynamic partitioning fails with a "partition already exists" error when the parent partition path already exists


Details

    Description

      If a table has multiple partition keys and a partition (part1=x1, part2=y1) already exists, then inserting into a new partition (part1=x1, part2=y2) makes HCatalog's FileOutputCommitterContainer throw a "data already exists" error for the parent path part1=x1.

       

      Steps to reproduce:

      create table source(id int, part1 string, part2 string);

      create table target(id int) partitioned by (part1 string, part2 string);

      insert into table source values (1, "x1", "y1"), (2, "x1", "y2");

       

      pig -useHCatalog

      A = load 'source' using org.apache.hive.hcatalog.pig.HCatLoader();
      B = filter A by (part2 == 'y1');

      -- The following store succeeds.
      store B into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();

      -- The following store fails with a duplicate publish error.

      C = filter A by (part2 == 'y2');
      store C into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();

       

       

      ```
      Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target_data/part1=x1, duplicate publish not possible.
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:243)
      at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286)
       
      Caused by: org.apache.hive.hcatalog.common.HCatException : 2002 : Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target_data/part1=x1, duplicate publish not possible.
      at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.moveTaskOutputs(FileOutputCommitterContainer.java:564)
      at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:949)
      at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:273)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:241)
      ```
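
The error message names the parent directory (`part1=x1`), not the new leaf partition (`part1=x1/part2=y2`). One possible reading, offered here only as an assumption and not confirmed against the HCatalog source, is that the existence check is applied at the first partition level. The sketch below is a standalone illustration of that difference on a local temporary directory; the function names are hypothetical and it does not call any Hive or HCatalog API.

```python
import tempfile
from pathlib import Path


def partition_dir(table_root: Path, spec: dict) -> Path:
    """Build a Hive-style partition directory, e.g. <root>/part1=x1/part2=y2."""
    path = table_root
    for key, value in spec.items():
        path = path / f"{key}={value}"
    return path


def conflicts_parent_level(table_root: Path, spec: dict) -> bool:
    """Hypothetical check that looks only at the first-level directory."""
    first_key, first_value = next(iter(spec.items()))
    return (table_root / f"{first_key}={first_value}").exists()


def conflicts_leaf_level(table_root: Path, spec: dict) -> bool:
    """Check against the full partition path instead."""
    return partition_dir(table_root, spec).exists()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "target"
        # Existing partition written by the first store (part2 == 'y1').
        partition_dir(root, {"part1": "x1", "part2": "y1"}).mkdir(parents=True)
        # New partition written by the second store (part2 == 'y2').
        new_spec = {"part1": "x1", "part2": "y2"}
        return (
            conflicts_parent_level(root, new_spec),
            conflicts_leaf_level(root, new_spec),
        )


parent_hit, leaf_hit = demo()
print("parent-level check reports conflict:", parent_hit)
print("leaf-level check reports conflict:", leaf_hit)
```

Under this assumption, a parent-level check flags the second store as a duplicate because `part1=x1` already exists, while a leaf-level check does not, since `part1=x1/part2=y2` is new.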

          People

            Assignee: Yi Zhang (yigress)
            Reporter: Yi Zhang (yigress)