Hive / HIVE-1655

Adding consistency check at jobClose() when committing dynamic partitions

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In a dynamic partition insert, FileSinkOperator creates a directory for each new partition and writes files into it under names beginning with '_tmp'. When a task succeeds, each file is renamed to drop the "_tmp" prefix, which in effect implements "commit" semantics. Many failures (a killed process, a machine dying, etc.) can leave _tmp files behind in the dynamic partition (DP) directory. These _tmp files should be deleted ("rolled back") when jobClose() completes successfully. After the deletion, any directories left empty should also be deleted.
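
      The cleanup described above can be sketched roughly as follows. This is not the attached patch: it is a minimal illustration that uses java.nio.file on a local file system rather than the Hadoop FileSystem API that Hive works against, and the class name DynamicPartitionCleanup and its methods isTempFile and removeTempOrEmpty are hypothetical names chosen for the example. It assumes an uncommitted file is any regular file whose name starts with "_tmp".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: illustrates the "roll back" step only.
final class DynamicPartitionCleanup {

    // Assumption: uncommitted files are exactly those whose name starts with "_tmp".
    static boolean isTempFile(Path p) {
        return p.getFileName().toString().startsWith("_tmp");
    }

    /**
     * Deletes every "_tmp*" file under dir (recursively), then deletes dir
     * itself if nothing is left in it. Returns true if dir was deleted.
     */
    static boolean removeTempOrEmpty(Path dir) throws IOException {
        // Snapshot the children first so we do not delete while iterating the stream.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }

        boolean empty = true;
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                // A subdirectory that survives cleanup keeps its parent non-empty.
                if (!removeTempOrEmpty(child)) {
                    empty = false;
                }
            } else if (isTempFile(child)) {
                Files.delete(child); // roll back the uncommitted file
            } else {
                empty = false;       // committed file: keep it and its directory
            }
        }

        if (empty) {
            Files.delete(dir);
        }
        return empty;
    }
}
```

      Calling removeTempOrEmpty on the root of the dynamic partition output would remove leftover _tmp files at every level and prune partition directories that end up with no committed files. Note that this sketch also deletes the root itself if it ends up empty; a caller that wants to keep the root would apply the method to each child directory instead.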

      Attachments

        1. HIVE-1655.patch
          1 kB
          Ning Zhang

          People

            Assignee: Ning Zhang (nzhang)
            Reporter: Ning Zhang (nzhang)

            Dates

              Created:
              Updated:
              Resolved: