Hadoop YARN / YARN-8394

Improve data locality documentation for Capacity Scheduler


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1, 3.0.4
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      YARN-6344 introduced a new parameter, yarn.scheduler.capacity.rack-locality-additional-delay, in capacity-scheduler.xml. We need to document it in CapacityScheduler.md accordingly.

      Moreover, more and more clusters separate storage from computation, so the file system is always remote. For such cases, the documentation should explain how to relax data locality in the Capacity Scheduler (CS); otherwise MapReduce (MR) jobs suffer.
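
To illustrate the kind of configuration this issue asks to document, a capacity-scheduler.xml fragment for a cluster with remote storage might look like the sketch below. It relies on two assumptions that are not stated in this description: that the companion property yarn.scheduler.capacity.node-locality-delay accepts -1 to make the scheduler ignore locality constraints, and that -1 is the default for the parameter added by YARN-6344. Both should be checked against CapacityScheduler.md for the Hadoop version in use.

```xml
<!-- capacity-scheduler.xml (sketch; values are assumptions, not recommendations) -->
<configuration>
  <!-- Assumed: -1 disables delay scheduling, so locality constraints in requests are ignored. -->
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>-1</value>
  </property>
  <!-- Parameter introduced by YARN-6344; -1 is assumed here to be its default. -->
  <property>
    <name>yarn.scheduler.capacity.rack-locality-additional-delay</name>
    <value>-1</value>
  </property>
</configuration>
```

On clusters where storage is co-located with compute, non-negative values would presumably be kept so that node-local and rack-local placement is still attempted first.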

      Attachments

        1. YARN-8394.002.patch (3 kB, Weiwei Yang)
        2. YARN-8394.001.patch (2 kB, Weiwei Yang)


          People

            Assignee: Weiwei Yang (cheersyang)
            Reporter: Weiwei Yang (cheersyang)
            Votes: 0
            Watchers: 5
