Hadoop YARN / YARN-4331

Restarting NodeManager leaves orphaned containers


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: nodemanager, yarn
    • Labels: None

    Description

      We are seeing many orphaned containers running in our production clusters.
      I reproduced the issue locally by killing the NodeManager.
      I'm running YARN 2.7.1 with ResourceManager state stored in ZooKeeper, deploying Samza jobs.
      Steps:

      1. Deploy a job.
      2. Send kill -9 to the NodeManager process.
      3. The AM and its container keep running without the NodeManager.
      4. The AM should then die, but the container keeps running.
      5. Restarting the NodeManager brings up a new AM and a new container, but leaves the orphaned container running in the background.

      As a result, the same data is effectively processed twice.
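
To confirm step 5 on a node, one option is to scan the process table for container IDs that the current AM attempt does not know about. The following is a minimal sketch, not part of the original report: the helper names `parse_ps` and `find_orphans` are hypothetical, and it assumes container processes carry their container ID (for example `container_1410901177871_0001_01_000005`, optionally with an epoch prefix such as `container_e17_...`) somewhere in their command line.

```python
import re

# YARN container IDs: container_<clusterTimestamp>_<appId>_<attemptId>_<containerId>,
# optionally with an epoch prefix (container_e17_...).
CONTAINER_ID = re.compile(r"container_(?:e\d+_)?\d+_\d+_\d+_\d+")


def parse_ps(output):
    """Parse the text output of `ps -eo pid,args` into (pid, command_line) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            rows.append((int(parts[0]), parts[1]))
    return rows


def find_orphans(processes, known_container_ids):
    """Return (pid, container_id) pairs for processes that name an unknown container.

    processes: iterable of (pid, command_line) tuples.
    known_container_ids: IDs the current AM attempt is expected to be running
        (an assumed input; how it is obtained is outside this sketch).
    Only the first container ID found in each command line is considered.
    """
    orphans = []
    for pid, cmdline in processes:
        match = CONTAINER_ID.search(cmdline)
        if match and match.group(0) not in known_container_ids:
            orphans.append((pid, match.group(0)))
    return orphans
```

A possible use is to feed `parse_ps` the captured output of `ps -eo pid,args` and pass the container IDs of the new AM attempt as `known_container_ids`; any pair returned by `find_orphans` is a candidate leftover from before the NodeManager was killed.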

          People

            Assignee: Unassigned
            Reporter: Joseph Francis (josephfrancis)
            Votes: 0
            Watchers: 7
