Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-3149

Dynamically generated paritions deleted by Block level merge

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • None
    • 0.10.0
    • Query Processor
    • None
    • Reviewed

    Description

      When creating partitions in a table using dynamic partitions and a Block level merge is executed at the end of the query, some partitions may be lost. Specifically if the values of two or more dynamic partition keys end in the same sequence of numbers, all but the largest will be dropped.

      I was not able to confirm it, but I suspect that if a map reduce job is speculated as part of the merge, the duplicate data will not be deleted either.

      E.g.
      insert overwrite table merge_dynamic_part partition (ds = '2008-04-08', hr)
      select key, value, if(key % 2 == 0, 'a1', 'b1') as hr from srcpart_merge_dp_rc where ds = '2008-04-08';

      In this query, if a Block level merge is executed at the end, only one of the partitions ds=2008-04-08/hr=a1 and ds=2008-04-08/hr=b1 will appear in the final table.

      Attachments

        1. HIVE-3149.1.patch.txt
          24 kB
          Kevin Wilfong

        Activity

          People

            kevinwilfong Kevin Wilfong
            kevinwilfong Kevin Wilfong
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: