Hadoop YARN / YARN-201

CapacityScheduler can take a very long time to schedule containers if requests are off cluster

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.3, 2.0.1-alpha
    • Fix Version/s: 2.0.3-alpha, 0.23.5
    • Component/s: capacityscheduler
    • Labels: None

    Description

      When a user runs a job whose input includes a large file stored on another cluster, the job can create many splits located on nodes that the current cluster cannot use for computation. The off-switch delay logic in LeafQueue can then cause the ResourceManager to allocate containers for the job very slowly. In one case the job received only one container every 23 seconds, even though the queue had plenty of spare capacity.
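
      To illustrate how an off-switch delay can stall allocation, here is a minimal sketch. It is a simplified, hypothetical model and not the actual LeafQueue code: the class name, method names, and the formula are assumptions made for illustration. It assumes the delay threshold scales with the number of distinct requested locations relative to the cluster size, so requests naming many off-cluster hosts can push the threshold far beyond what the job can reach quickly. The `capAtOne` flag shows one possible mitigation (bounding the factor at 1), which may or may not match the attached patch.

```java
// Simplified, hypothetical model of an off-switch delay check.
// Names and formula are assumptions for illustration, not Hadoop's real API.
public class OffSwitchDelaySketch {

    // Assumed wait factor: distinct requested locations relative to cluster size.
    // Hosts on another cluster inflate requestedLocations, so the ratio can exceed 1
    // unless it is capped.
    static float waitFactor(int requestedLocations, int clusterNodes, boolean capAtOne) {
        float ratio = (float) requestedLocations / Math.max(clusterNodes, 1);
        return capAtOne ? Math.min(ratio, 1.0f) : ratio;
    }

    // An off-switch assignment is allowed only once the number of missed
    // scheduling opportunities exceeds requiredContainers * waitFactor.
    static boolean canAssignOffSwitch(long missedOpportunities,
                                      int requiredContainers,
                                      float waitFactor) {
        return requiredContainers * waitFactor < missedOpportunities;
    }

    public static void main(String[] args) {
        int clusterNodes = 100;         // nodes in the local cluster
        int requestedLocations = 2000;  // mostly hosts on another cluster
        int requiredContainers = 500;   // outstanding off-switch containers
        long missedOpportunities = 600; // scheduling opportunities skipped so far

        float uncapped = waitFactor(requestedLocations, clusterNodes, false);
        float capped = waitFactor(requestedLocations, clusterNodes, true);

        System.out.println("uncapped factor allows assignment: "
                + canAssignOffSwitch(missedOpportunities, requiredContainers, uncapped));
        System.out.println("capped factor allows assignment:   "
                + canAssignOffSwitch(missedOpportunities, requiredContainers, capped));
    }
}
```

      With the sample numbers above, the uncapped factor requires thousands of missed opportunities before a single off-switch container is granted, while the capped factor lets the assignment through as soon as the missed opportunities exceed the number of outstanding containers.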

      Attachments

        1. YARN-201.patch
          4 kB
          Jason Darrell Lowe
        2. YARN-201.patch
          4 kB
          Jason Darrell Lowe

          People

            Assignee: jlowe Jason Darrell Lowe
            Reporter: jlowe Jason Darrell Lowe
            Votes: 0
            Watchers: 9
