Hadoop Common / HADOOP-14407

DistCp - Introduce a configurable copy buffer size

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: tools/distcp
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      The DistCp copy buffer size can now be configured via the new parameter <copybuffersize>. By default, <copybuffersize> is set to 8KB.

      Description

      Currently, RetriableFileCopyCommand uses a fixed copy buffer size of just 8KB. In our performance tests, larger buffer sizes gave up to a ~3x performance boost. This change therefore makes the copy buffer size configurable via the new parameter <copybuffersize>.
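
      As a rough illustration of the idea, the following minimal sketch shows a stream copy loop whose buffer size is supplied by the caller and falls back to 8KB when no positive value is given. It is not the code from the attached patches: the class name CopyBufferSketch and the methods resolveBufferSize, copy, and copyBytes are hypothetical names chosen for this example, and the fallback for non-positive values is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not the actual RetriableFileCopyCommand code.
public class CopyBufferSketch {
    // 8KB default, matching the default stated in the release note.
    static final int DEFAULT_COPY_BUFFER_SIZE = 8 * 1024;

    // Assumption: a non-positive requested size falls back to the default.
    static int resolveBufferSize(int requested) {
        return requested > 0 ? requested : DEFAULT_COPY_BUFFER_SIZE;
    }

    // Copies all bytes from in to out using a buffer of the resolved size
    // and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int requestedBufferSize)
            throws IOException {
        byte[] buf = new byte[resolveBufferSize(requestedBufferSize)];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    // In-memory convenience wrapper so the loop can be exercised without files.
    static byte[] copyBytes(byte[] src, int requestedBufferSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(src), out, requestedBufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000];
        // Copy with a 64KB buffer instead of the 8KB default.
        byte[] copied = copyBytes(data, 64 * 1024);
        System.out.println("copied " + copied.length + " bytes");
    }
}
```

      A larger buffer means fewer read/write calls per file, which is one plausible reason for the speedup reported above. The best value depends on the file systems involved and is worth measuring rather than assuming.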

        Attachments

        1. HADOOP-14407.001.patch
          11 kB
          Omkar Aradhya K S
        2. HADOOP-14407.002.patch
          11 kB
          Yongjun Zhang
        3. HADOOP-14407.002.patch
          11 kB
          Omkar Aradhya K S
        4. HADOOP-14407.003.patch
          12 kB
          Omkar Aradhya K S
        5. HADOOP-14407.004.branch2.patch
          12 kB
          Omkar Aradhya K S
        6. HADOOP-14407.004.patch
          12 kB
          Omkar Aradhya K S
        7. HADOOP-14407.004.patch
          12 kB
          Omkar Aradhya K S
        8. HADOOP-14407.branch2.002.patch
          13 kB
          Yongjun Zhang
        9. TotalTime-vs-CopyBufferSize.jpg
          40 kB
          Omkar Aradhya K S


            People

            • Assignee:
              omkarksa Omkar Aradhya K S
              Reporter:
              omkarksa Omkar Aradhya K S
            • Votes:
              0
              Watchers:
              9
