Hadoop Common / HADOOP-13169

Randomize file list in SimpleCopyListing

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0
    • Component/s: tools/distcp
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When copying files to S3, the order of the file listing can send some mappers into S3 partition hotspots. This is most visible when data is copied from a Hive warehouse with many partitions (e.g., date partitions). In such cases, some tasks tend to run much slower than others. It would be good to randomize the file paths written out in SimpleCopyListing to avoid this issue.
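      The idea can be illustrated with a minimal sketch. This is not the attached patch and does not use the SimpleCopyListing API: the class name, method name, seed parameter, and sample paths below are all hypothetical, chosen only to show how shuffling a sorted listing spreads neighbouring partition paths apart before they are handed to mappers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the code from the HADOOP-13169 patches.
public class ShuffledListingSketch {

  // Returns a shuffled copy of the listing; the input list is left untouched.
  // Passing a seed is an assumption made here so that runs are repeatable.
  static List<String> shuffled(List<String> paths, long seed) {
    List<String> copy = new ArrayList<>(paths);
    Collections.shuffle(copy, new Random(seed));
    return copy;
  }

  public static void main(String[] args) {
    // Build a sorted listing that mimics date-partitioned warehouse paths
    // (made-up paths, for illustration).
    List<String> listing = new ArrayList<>();
    for (int day = 1; day <= 3; day++) {
      for (int part = 0; part < 2; part++) {
        listing.add(String.format("/warehouse/t/dt=2016-05-%02d/part-%05d", day, part));
      }
    }

    // After shuffling, consecutive entries no longer tend to share a prefix.
    for (String path : shuffled(listing, 42L)) {
      System.out.println(path);
    }
  }
}
```

      With a sorted listing, consecutive entries share a key prefix, so a mapper given a contiguous slice writes to a narrow key range. Shuffling first means each slice is more likely to touch many prefixes.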

        Attachments

        1. HADOOP-13169-branch-2-001.patch
          7 kB
          Rajesh Balamohan
        2. HADOOP-13169-branch-2-002.patch
          6 kB
          Rajesh Balamohan
        3. HADOOP-13169-branch-2-003.patch
          8 kB
          Rajesh Balamohan
        4. HADOOP-13169-branch-2-004.patch
          8 kB
          Rajesh Balamohan
        5. HADOOP-13169-branch-2-005.patch
          12 kB
          Rajesh Balamohan
        6. HADOOP-13169-branch-2-006.patch
          14 kB
          Rajesh Balamohan
        7. HADOOP-13169-branch-2-007.patch
          13 kB
          Rajesh Balamohan
        8. HADOOP-13169-branch-2-008.patch
          14 kB
          Rajesh Balamohan
        9. HADOOP-13169-branch-2-009.patch
          15 kB
          Rajesh Balamohan
        10. HADOOP-13169-branch-2-010.patch
          15 kB
          Rajesh Balamohan

              People

              • Assignee: Rajesh Balamohan
              • Reporter: Rajesh Balamohan
              • Votes: 0
              • Watchers: 4
