
No need to clean tmp files in distcp direct mode


Details

    • Reviewed

    Description

      It is not necessary to run `cleanupTempFiles` when the DistCp job commits in direct mode, because there are no temp files in direct mode.

      This cleanup increases the task execution time, because it lists the files in the target path. When the target path contains a very large number of files, the listing is very slow.
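      The cost argument can be illustrated with a minimal sketch. This is a toy model in Python, not the Hadoop code: `FakeTargetFs`, `cleanup_temp_files`, `commit_job`, and the `.distcp.tmp.` prefix are hypothetical names chosen for illustration. It only shows that guarding the cleanup with a direct-write flag avoids listing the target path at all.

```python
from dataclasses import dataclass


@dataclass
class FakeTargetFs:
    """Hypothetical stand-in for the target filesystem; counts listing calls."""
    paths: list
    list_calls: int = 0

    def list_files(self):
        # Models the expensive step: enumerating everything under the target path.
        self.list_calls += 1
        return list(self.paths)


def cleanup_temp_files(fs, tmp_prefix=".distcp.tmp."):
    """List the target and remove leftover temp files (prefix is an assumption)."""
    leftovers = [
        p for p in fs.list_files()
        if p.rsplit("/", 1)[-1].startswith(tmp_prefix)
    ]
    for p in leftovers:
        fs.paths.remove(p)
    return leftovers


def commit_job(fs, direct_write):
    """Skip the cleanup listing when files were written directly to the target."""
    if direct_write:
        return []  # direct mode creates no temp files, so there is nothing to clean
    return cleanup_temp_files(fs)


# Non-direct mode pays for one listing; direct mode pays for none.
staged = FakeTargetFs(["/dst/a", "/dst/.distcp.tmp.attempt_1"])
commit_job(staged, direct_write=False)

direct = FakeTargetFs(["/dst/a", "/dst/b"])
commit_job(direct, direct_write=True)
```

      In this model the number of listing calls stands in for commit latency: the larger the target path, the more each avoided listing saves.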

      Note: two patches need to be cherry-picked when picking this up, the original patch and a follow-up; both have HADOOP-18582 in the title.

      3b7b79b37ae HADOOP-18582. skip unnecessary cleanup logic in distcp (#5251)
      e8a6b2c2c4e HADOOP-18582. Addendum: Skip unnecessary cleanup logic in DistCp.
      


          People

            kevin10000 10000kang
            kevin10000 10000kang
