Hadoop Common / HADOOP-13169

Randomize file list in SimpleCopyListing


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0
    • Component/s: tools/distcp
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When copying files to S3, some mappers can hit S3 partition hotspots, depending on the order of the file listing. This is more visible when data is copied from a Hive warehouse with many partitions (e.g., date partitions). In such cases, some tasks tend to be much slower than others. It would be good to randomize the file paths written out by SimpleCopyListing to avoid this issue.
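
      The idea can be illustrated with a minimal sketch. This is not the code from the attached patches; the class name `ShuffledListingSketch`, the `shuffle` helper, the seed parameter, and the example paths are all hypothetical. It only shows the general technique: shuffle the collected paths before they are written out, so that paths sharing a key prefix (for example, one date partition) are less likely to land next to each other in the listing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the SimpleCopyListing implementation.
class ShuffledListingSketch {

    // Returns a shuffled copy of the listing. A fixed seed makes the
    // order reproducible, which is convenient in tests (an assumption of
    // this sketch, not something the issue description specifies).
    static List<String> shuffle(List<String> paths, long seed) {
        List<String> copy = new ArrayList<>(paths);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Build a listing that is sorted by date partition, as a
        // directory walk over a partitioned table would typically produce.
        List<String> listing = new ArrayList<>();
        for (int day = 1; day <= 3; day++) {
            for (int part = 0; part < 2; part++) {
                listing.add(String.format(
                    "/warehouse/t/dt=2016-05-%02d/part-%05d", day, part));
            }
        }
        // Print the paths in randomized order.
        for (String path : shuffle(listing, 42L)) {
            System.out.println(path);
        }
    }
}
```

      Shuffling a copy leaves the original listing untouched, so any logic that depends on the sorted order can still use it.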

      Attachments

        1. HADOOP-13169-branch-2-001.patch
          7 kB
          Rajesh Balamohan
        2. HADOOP-13169-branch-2-002.patch
          6 kB
          Rajesh Balamohan
        3. HADOOP-13169-branch-2-003.patch
          8 kB
          Rajesh Balamohan
        4. HADOOP-13169-branch-2-004.patch
          8 kB
          Rajesh Balamohan
        5. HADOOP-13169-branch-2-005.patch
          12 kB
          Rajesh Balamohan
        6. HADOOP-13169-branch-2-006.patch
          14 kB
          Rajesh Balamohan
        7. HADOOP-13169-branch-2-007.patch
          13 kB
          Rajesh Balamohan
        8. HADOOP-13169-branch-2-008.patch
          14 kB
          Rajesh Balamohan
        9. HADOOP-13169-branch-2-009.patch
          15 kB
          Rajesh Balamohan
        10. HADOOP-13169-branch-2-010.patch
          15 kB
          Rajesh Balamohan


          People

            Assignee: Rajesh Balamohan (rajesh.balamohan)
            Reporter: Rajesh Balamohan (rajesh.balamohan)
            Votes: 0
            Watchers: 4

