Hadoop YARN / YARN-8464

Async scheduling thread could be interrupted when there are no NodeManagers in cluster


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: capacity scheduler
    • Labels: None

    Description

      Test scenario:
      1. Make the directories in either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs read-only.
      2. Restart the NMs via Ambari. As expected, none of them show up in the RM UI.
      3. Revert the read-only dirs and restart the NMs.
      4. Include a non-existent dir in either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs (1 existing good dir + 1 non-existent dir).
      5. Restart the NMs via Ambari. As expected, all NMs show as RUNNING with a Health Report message.
      6. Submit a MapReduce sleep job. The job goes into the ACCEPTED state.
      7. The job stays in the ACCEPTED state forever, even though all NMs are running and have available memory.
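
      As an illustration of step 4, the resulting yarn-site.xml entry might look like the fragment below. This is only a sketch: both paths are hypothetical, and in this scenario the value would be set through Ambari rather than edited by hand. The same comma-separated form applies to yarn.nodemanager.log-dirs.

      ```xml
      <!-- yarn-site.xml fragment for step 4; both paths are hypothetical -->
      <property>
        <name>yarn.nodemanager.local-dirs</name>
        <!-- first entry: an existing, writable dir; second entry: a dir that does not exist -->
        <value>/hadoop/yarn/local,/hadoop/yarn/local-missing</value>
      </property>
      ```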

      Credit to Charan Hebri, who found this issue.

      Attachments

        1. YARN-8464.001.patch (2 kB, Sunil G)
        2. YARN-8464.002.patch (3 kB, Sunil G)

          People

            Assignee: Sunil G (sunilg)
            Reporter: Charan Hebri (charanh)
            Votes: 0
            Watchers: 5
