  1. Hadoop YARN
  2. YARN-8464

Async scheduling thread could be interrupted when there are no NodeManagers in cluster

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: capacity scheduler
    • Labels:
      None
    • Target Version/s:

      Description

      Test scenario:
      1. Make the directories listed in either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs read-only.
      2. Restart the NMs via Ambari. As expected, none of them show up in the RM UI.
      3. Revert the read-only change and restart the NMs.
      4. Add a non-existent directory to either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs (one good existing dir plus one non-existent dir).
      5. Restart the NMs via Ambari. As expected, all NMs show as RUNNING with a Health Report message.
      6. Submit a MapReduce sleep job. The job goes into the ACCEPTED state.
      7. The job stays in the ACCEPTED state forever, even though all NMs are running and have available memory.
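
The description does not state the root cause, so the following is only a sketch of one mechanism consistent with the issue title. It assumes the async scheduling pass picks a random start index among the registered nodes. With zero registered NodeManagers (step 2), `Random.nextInt(0)` throws `IllegalArgumentException`; if that exception goes uncaught, the scheduling thread terminates, and applications submitted later would stay in ACCEPTED even after the NMs return. The class and method names below are hypothetical and are not taken from the YARN code base or from the attached patches.

```java
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; not the actual CapacityScheduler code.
public class AsyncSchedulingSketch {
    private static final Random RANDOM = new Random();

    // Unguarded variant: Random.nextInt(0) throws IllegalArgumentException
    // when the node list is empty.
    static int pickStartUnguarded(List<String> nodes) {
        return RANDOM.nextInt(nodes.size());
    }

    // Guarded variant: returns -1 to signal "nothing to schedule"
    // when no nodes are registered.
    static int pickStartGuarded(List<String> nodes) {
        int nodeSize = nodes.size();
        if (nodeSize == 0) {
            return -1;
        }
        return RANDOM.nextInt(nodeSize);
    }

    // Runs one scheduling pass against an empty cluster on its own thread and
    // reports whether that thread terminated with an uncaught exception.
    static boolean threadDiesOnEmptyCluster(boolean guarded) {
        List<String> noNodes = Collections.emptyList();
        final boolean[] died = {false};
        Thread t = new Thread(() -> {
            if (guarded) {
                pickStartGuarded(noNodes);
            } else {
                pickStartUnguarded(noNodes);
            }
        }, "async-scheduling-sketch");
        t.setUncaughtExceptionHandler((thread, err) -> died[0] = true);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return died[0];
    }

    public static void main(String[] args) {
        System.out.println("unguarded thread died: " + threadDiesOnEmptyCluster(false));
        System.out.println("guarded thread died:   " + threadDiesOnEmptyCluster(true));
    }
}
```

Under this assumption, an early return when the node count is zero keeps the thread alive so that it can resume scheduling once NodeManagers register. Whether the attached patches take exactly this approach is not stated in the description.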

       

      Credits to Charan Hebri who found this issue.

        Attachments

        1. YARN-8464.002.patch
          3 kB
          Sunil Govindan
        2. YARN-8464.001.patch
          2 kB
          Sunil Govindan


            People

            • Assignee:
              sunilg Sunil Govindan
              Reporter:
              charanh Charan Hebri
            • Votes:
              0
              Watchers:
              4
