HIVE-17963

Fix for HIVE-17113 can be improved for non-blobstore filesystems


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      HIVE-17113/HIVE-17813 fix the duplicate-file issue by moving files one at a time. On non-blobstore filesystems this results in many more filesystem/namenode operations than the previous Utilities.mvFileToFinalPath() behavior, which deduplicated the files in the source directory and then renamed the source directory to the final directory.
      For non-blobstore filesystems, a better solution is the following:

      1) Move the temp directory to a new directory name, to prevent additional files from being added by any runaway processes.
      2) Run removeTempOrDuplicateFiles() on this renamed temp directory.
      3) Run renameOrMoveFiles() to move the renamed temp directory to the final location.

      This results in only one additional file operation on non-blobstore filesystems compared to the original Utilities.mvFileToFinalPath() behavior.
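
      The three steps above can be sketched as follows. This is a minimal illustration on a local filesystem using java.nio.file, not the Hive implementation: the class name, the ".moved" suffix, and the simplified deduplication rule (keep the largest file per task id, assuming files are named <taskId>_<attemptId>) are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names do not correspond to Hive classes.
final class MoveToFinalPathSketch {

    // Simplified stand-in for removeTempOrDuplicateFiles(): for files named
    // <taskId>_<attemptId>, keep only the largest file per taskId.
    static void removeDuplicateAttempts(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(dir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, Path> kept = new HashMap<>();
        for (Path f : files) {
            String name = f.getFileName().toString();
            int sep = name.indexOf('_');
            String taskId = sep < 0 ? name : name.substring(0, sep);
            Path prev = kept.get(taskId);
            if (prev == null) {
                kept.put(taskId, f);
            } else if (Files.size(f) > Files.size(prev)) {
                Files.delete(prev);   // the new file wins, drop the earlier one
                kept.put(taskId, f);
            } else {
                Files.delete(f);      // the earlier file wins, drop this one
            }
        }
    }

    static void moveToFinalPath(Path tmpDir, Path finalDir) throws IOException {
        // Step 1: rename the temp directory so runaway writers that still
        // hold the old path can no longer add files to the directory we process.
        Path renamed = tmpDir.resolveSibling(tmpDir.getFileName() + ".moved");
        Files.move(tmpDir, renamed, StandardCopyOption.ATOMIC_MOVE);

        // Step 2: deduplicate inside the renamed directory.
        removeDuplicateAttempts(renamed);

        // Step 3: a single directory rename to the final location
        // (finalDir must not exist yet in this sketch).
        Files.move(renamed, finalDir);
    }
}
```

      Calling MoveToFinalPathSketch.moveToFinalPath(tmpDir, finalDir) performs two directory renames plus the deduplication pass, which is the one extra rename relative to deduplicating in place and renaming once.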

      The proposal is to remove the config setting hive.exec.move.files.from.source.dir and always use behavior that handles the duplicate-file issue described in HIVE-17113. For non-blobstore filesystems we will perform steps 1-3 described above. For blobstore filesystems we will keep the solution from HIVE-17113/HIVE-17813, which copies file by file. This should require the same number of file operations as a directory rename on a blobstore, since such a rename is effectively carried out as file moves on a file-by-file basis.
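
      The proposed dispatch between the two behaviors could look roughly like the sketch below. It is illustrative only: the class, the enum, and the hard-coded scheme list are assumptions for the example, and the actual patch may detect blobstore filesystems differently.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not taken from the HIVE-17963 patch.
final class MoveStrategySketch {

    enum Strategy {
        RENAME_DEDUP_RENAME, // steps 1-3 for non-blobstore filesystems
        FILE_BY_FILE         // HIVE-17113/HIVE-17813 behavior for blobstores
    }

    // Assumed set of blobstore URI schemes, chosen for illustration.
    static final Set<String> BLOBSTORE_SCHEMES =
            new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

    // Picks the strategy from the scheme of the final path; no config flag involved.
    static Strategy choose(URI finalPath) {
        String scheme = finalPath.getScheme();
        boolean isBlobstore = scheme != null
                && BLOBSTORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        return isBlobstore ? Strategy.FILE_BY_FILE : Strategy.RENAME_DEDUP_RENAME;
    }
}
```

      With this shape the choice depends only on the target filesystem, which is what allows the config setting to be dropped.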

      Attachments

        1. HIVE-17963.1.patch
          7 kB
          Jason Dere
        2. HIVE-17963.2.patch
          7 kB
          Jason Dere

          People

            Assignee: Jason Dere
            Reporter: Jason Dere
            Votes: 0
            Watchers: 5
