Hadoop Common / HADOOP-17742

DistCp: distcp fails when renaming within an FTP filesystem


Details

    Description

      When I copy a file from one FTP filesystem to another with DistCp, I hit two problems. First, the temporary target path cannot be found, as shown in the exception below. My testing shows this is caused by the leading dot ('.') in the temporary file name. Second, when the target path URI includes the scheme, host, and port, the rename fails with an IOException.

      // exception information
      java.io.FileNotFoundException: Source path ftp://hadoop336.photo.163.org:21/ndi-0111/56598489-3de3-4585-8426-c03435640481/.distcp.tmp.attempt_local344388648_0001_m_000000_0.1622634094600 does not exist
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:668)
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:626)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:155)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:115)
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)

      When I change the temporary file name in org.apache.hadoop.tools.mapred.RetriableFileCopyCommand#getTempFile from '.distcp.tmp.XXX' to 'distcp.tmp.XXX', the first problem goes away.
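To make the naming change concrete, here is a minimal sketch of how such a temporary name could be built with and without the leading dot. It is not the actual body of getTempFile: the class TempNameSketch, the method tempFileName, and the leadingDot flag are hypothetical names used only for illustration, and the name layout (prefix, task attempt ID, timestamp) is inferred from the path in the exception above.

```java
public class TempNameSketch {

    // Hypothetical helper: builds "<prefix><taskAttemptId>.<timestamp>".
    // The layout is inferred from the path in the stack trace and is an
    // assumption, not a copy of RetriableFileCopyCommand#getTempFile.
    static String tempFileName(String taskAttemptId, long timestampMillis, boolean leadingDot) {
        String prefix = leadingDot ? ".distcp.tmp." : "distcp.tmp.";
        return prefix + taskAttemptId + "." + timestampMillis;
    }

    public static void main(String[] args) {
        String attempt = "attempt_local344388648_0001_m_000000_0";
        long ts = 1622634094600L;
        // Name with the leading dot, as seen in the exception.
        System.out.println(tempFileName(attempt, ts, true));
        // Name without the leading dot, as in the workaround.
        System.out.println(tempFileName(attempt, ts, false));
    }
}
```

The only difference between the two variants is the first character, which matches the observation that dropping the dot is enough to avoid the first problem.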

      The second problem goes away when the targetPath used for the rename in org.apache.hadoop.tools.mapred.RetriableFileCopyCommand#doCopy contains only the path component of the URI, without the scheme, host, and port.
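The path-only idea can be sketched as follows. This example uses plain java.net.URI so it runs without Hadoop on the classpath; the class PathOnlySketch and the method pathOnly are hypothetical names, and the sketch only shows the stripping step, not the actual change made in doCopy. In Hadoop code the comparable step would presumably go through org.apache.hadoop.fs.Path and its toUri() method, but that is an assumption about the fix rather than something stated in this report.

```java
import java.net.URI;

public class PathOnlySketch {

    // Hypothetical helper: drops scheme, host, and port from a target
    // location and keeps only the path component.
    static String pathOnly(String target) {
        return URI.create(target).getPath();
    }

    public static void main(String[] args) {
        String full = "ftp://hadoop336.photo.163.org:21/ndi-0111/"
                + "56598489-3de3-4585-8426-c03435640481/"
                + ".distcp.tmp.attempt_local344388648_0001_m_000000_0.1622634094600";
        // Prints the location without the "ftp://host:port" part.
        System.out.println(pathOnly(full));
    }
}
```

A location that is already path-only passes through unchanged, so the helper can be applied without first checking whether a scheme is present.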


          People

            jerryzhao1423 jerry

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 5h 10m