Hadoop YARN / YARN-8566

Add diagnostic message for unschedulable containers

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0
    • Component/s: resourcemanager
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If a queue is configured with maxResources set to 0 for a resource, and an application that requests that resource is submitted to that queue, the application remains pending until it is removed or moved to a different queue. This can also happen without extended resources, but it is unlikely that a user would create a queue that allows 0 memory or CPU. As the number of resource types in the system grows, this scenario will become more common and harder to recognize. Therefore, the scheduler should state in the application's diagnostic string when the application was not scheduled because of a 0 maxResources setting.
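The check described above can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches and does not use any YARN classes: the class `ZeroMaxResourceCheck`, its methods, the plain `Map<String, Long>` representation of resources, and the message wording are all hypothetical, chosen only for illustration. It also assumes that a resource missing from the queue's maximum is uncapped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the YARN code base. */
class ZeroMaxResourceCheck {

  /**
   * Returns the names of resources that the request asks for (value > 0)
   * but that the queue caps at exactly 0.
   * Assumption: a resource absent from queueMax is treated as uncapped.
   */
  static List<String> findBlockedResources(Map<String, Long> requested,
                                           Map<String, Long> queueMax) {
    List<String> blocked = new ArrayList<>();
    for (Map.Entry<String, Long> e : requested.entrySet()) {
      Long max = queueMax.get(e.getKey());
      if (e.getValue() > 0 && max != null && max == 0L) {
        blocked.add(e.getKey());
      }
    }
    return blocked;
  }

  /** Builds a diagnostic string, or an empty string if nothing is blocked. */
  static String buildDiagnostic(String queueName, List<String> blocked) {
    if (blocked.isEmpty()) {
      return "";
    }
    // Hypothetical wording; the real scheduler message may differ.
    return "Cannot schedule request in queue '" + queueName
        + "': maxResources is 0 for requested resource(s): "
        + String.join(", ", blocked);
  }

  public static void main(String[] args) {
    // Queue limits mirroring the example configuration in this issue.
    Map<String, Long> queueMax = new LinkedHashMap<>();
    queueMax.put("memory-mb", 90000L);
    queueMax.put("vcores", 4L);
    queueMax.put("gpu", 0L);

    // A container request that asks for one GPU.
    Map<String, Long> requested = new LinkedHashMap<>();
    requested.put("memory-mb", 1024L);
    requested.put("vcores", 1L);
    requested.put("gpu", 1L);

    System.out.println(
        buildDiagnostic("sample_queue", findBlockedResources(requested, queueMax)));
  }
}
```

The sketch only flags limits that are exactly 0, because that is the case this issue targets: a request that exceeds a non-zero limit may still be satisfiable later, whereas a 0 limit can never be satisfied in that queue.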

      Example configuration (fair-scheduler.xml):

      <allocations>
        <queueMaxAppsDefault>100000</queueMaxAppsDefault>
        <queue name="sample_queue">
          <minResources>10000 mb,2vcores</minResources>
          <maxResources>90000 mb,4vcores, 0gpu</maxResources>
          <maxRunningApps>50</maxRunningApps>
          <maxAMShare>-1.0f</maxAMShare>
          <weight>2.0</weight>
          <schedulingPolicy>fair</schedulingPolicy>
        </queue>
      </allocations>
      
      

      Command:

      yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000;
      

      The job hangs and the application diagnostic info is empty.
      Given that an exception is thrown before any mapper/reducer container is created, the diagnostic message of the AM should be updated.

        Attachments

        1. YARN-8566.001.patch
          10 kB
          Szilard Nemeth
        2. YARN-8566.002.patch
          10 kB
          Szilard Nemeth
        3. YARN-8566.003.patch
          11 kB
          Szilard Nemeth
        4. YARN-8566.004.patch
          11 kB
          Szilard Nemeth
        5. YARN-8566.005.patch
          14 kB
          Szilard Nemeth
        6. YARN-8566.006.patch
          66 kB
          Szilard Nemeth
        7. YARN-8566.007.patch
          66 kB
          Szilard Nemeth

            People

            • Assignee:
              snemeth Szilard Nemeth
            • Reporter:
              snemeth Szilard Nemeth
            • Votes:
              0
            • Watchers:
              4
