  1. Hadoop Common
  2. HADOOP-15281

DistCp to add no-rename copy option


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.3.0, 3.2.1, 3.1.3
    • Component/s: tools/distcp
    • Labels:
      None

      Description

      Currently DistCp uploads a file using one of two strategies:

      1. append parts
      2. copy to temp then rename

      Option 2 executes the following sequence in promoteTmpToTarget:

          if ((fs.exists(target) && !fs.delete(target, false))
              || (!fs.exists(target.getParent()) && !fs.mkdirs(target.getParent()))
              || !fs.rename(tmpTarget, target)) {
            throw new IOException("Failed to promote tmp-file:" + tmpTarget
                                    + " to: " + target);
          }
      

      For any object store, that is a lot of HTTP requests. On S3A it means 12+ requests plus an O(data) copy call, because rename there is implemented as a copy followed by a delete.
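
      To make the difference between the two approaches concrete, here is a minimal sketch in plain Java. CountingStore and CopyStrategies are hypothetical names, not Hadoop classes. The store is an in-memory map that only counts filesystem-level calls, so it does not reproduce the HTTP request count quoted above, only the shape of the sequence compared with a single direct write.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for a FileSystem; it only counts calls. */
class CountingStore {
    private final Map<String, byte[]> files = new HashMap<>();
    private final Set<String> dirs = new HashSet<>();
    int calls = 0;

    boolean exists(String path) {
        calls++;
        return files.containsKey(path) || dirs.contains(path);
    }

    boolean delete(String path) {
        calls++;
        return files.remove(path) != null;
    }

    boolean mkdirs(String path) {
        calls++;
        dirs.add(path);
        return true;
    }

    boolean rename(String src, String dst) {
        calls++;
        byte[] data = files.remove(src);
        if (data == null) {
            return false;
        }
        files.put(dst, data);
        return true;
    }

    void write(String path, byte[] data) {
        calls++;
        files.put(path, data);
    }

    /** Inspection helper for the example; deliberately not counted. */
    byte[] read(String path) {
        return files.get(path);
    }
}

class CopyStrategies {
    private static String parentOf(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    /** Write to a temp path, then promote it with the same checks as above. */
    static void copyViaTemp(CountingStore fs, String tmpTarget, String target, byte[] data) {
        fs.write(tmpTarget, data);
        String parent = parentOf(target);
        if ((fs.exists(target) && !fs.delete(target))
                || (!fs.exists(parent) && !fs.mkdirs(parent))
                || !fs.rename(tmpTarget, target)) {
            throw new IllegalStateException(
                "Failed to promote tmp-file:" + tmpTarget + " to: " + target);
        }
    }

    /** Write straight to the destination: no existence probes, no rename. */
    static void copyDirect(CountingStore fs, String target, byte[] data) {
        fs.write(target, data);
    }
}
```

      Comparing the calls field of a store after copyViaTemp with one after copyDirect shows how many extra calls the promotion step adds. On a real object store each of those calls may map to one or more HTTP requests.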

      This is not a good upload strategy for any store which manifests its output atomically at the end of the write, that is, when the output stream is closed.

      Proposed: add a switch to write directly to the destination path. It can be supplied either as a configuration option (distcp.direct.write = true) or as a CLI option (-direct).
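
      A minimal sketch of how such a switch might be resolved, assuming a plain key/value configuration and a raw argument array. DirectWriteOption and its methods are hypothetical and are not the DistCp option-parsing code; only the key distcp.direct.write and the flag -direct come from the proposal.

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical helper; not part of DistCp. */
class DirectWriteOption {
    static final String CONF_KEY = "distcp.direct.write"; // name from the proposal
    static final String CLI_FLAG = "-direct";             // name from the proposal

    /** Direct write is on if either the conf option or the CLI flag asks for it. */
    static boolean isDirectWrite(Map<String, String> conf, String[] args) {
        boolean fromConf = Boolean.parseBoolean(conf.getOrDefault(CONF_KEY, "false"));
        boolean fromCli = Arrays.asList(args).contains(CLI_FLAG);
        return fromConf || fromCli;
    }

    /** Pick the path the copy should write to. */
    static String writePath(boolean directWrite, String target, String tmpTarget) {
        return directWrite ? target : tmpTarget;
    }
}
```

      With the switch off, the copy still goes to the temp path and is promoted afterwards, so the existing behaviour stays the default in this sketch.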

        Attachments

        1. HADOOP-15281-001.patch
          21 kB
          Andrew Olson
        2. HADOOP-15281-002.patch
          22 kB
          Andrew Olson
        3. HADOOP-15281-003.patch
          22 kB
          Andrew Olson
        4. HADOOP-15281-004.patch
          22 kB
          Andrew Olson
        5. HADOOP-15281-branch-2-001.patch
          26 kB
          Steve Loughran
        6. HADOOP-15281-branch-2-002.patch
          26 kB
          Andrew Olson

              People

              • Assignee:
                noslowerdna Andrew Olson
                Reporter:
                stevel@apache.org Steve Loughran
              • Votes:
                1
                Watchers:
                14