Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-8193

YARN RM hangs abruptly (stops allocating resources) when running successive applications.

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1, 2.10.1
    • Component/s: yarn
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When running massive queries successively, at some point RM just hangs and stops allocating resources. At the point RM get hangs, YARN throw NullPointerException  at RegularContainerAllocator.getLocalityWaitFactor.

      There's sufficient space given to yarn.nodemanager.local-dirs (not a node health issue, RM didn't report any node being unhealthy). There is no fixed trigger for this (query or operation).

      This problem goes away on restarting ResourceManager. No NM restart is required. 

       

       

        Attachments

        1. YARN-8193-branch-2.10-001.patch
          5 kB
          Jonathan Hung
        2. YARN-8193-branch-2-001.patch
          5 kB
          Jason Darrell Lowe
        3. YARN-8193-branch-2.9.0-001.patch
          5 kB
          Juanjuan Tian
        4. YARN-8193.002.patch
          5 kB
          Zian Chen
        5. YARN-8193.001.patch
          5 kB
          Zian Chen

          Issue Links

            Activity

              People

              • Assignee:
                Zian Chen Zian Chen
                Reporter:
                Zian Chen Zian Chen
              • Votes:
                0 Vote for this issue
                Watchers:
                17 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: