  1. Hadoop YARN
  2. YARN-7086

Release all containers asynchronously



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels:
    • Target Version/s:


      We have noticed two situations in production that can cause deadlocks and bring scheduling of new containers to a halt, especially for applications that have a lot of live containers:

      1. When these applications release their containers in bulk.
      2. When these applications terminate abruptly due to some failure, the scheduler releases all its live containers in a loop.

      To handle the issues mentioned above, we have a patch in production that makes ALL container releases happen asynchronously, and it has served us well.

      Opening this JIRA to gather feedback on whether this is a good idea in general (cc Wangda Tan, Jason Darrell Lowe, Carlo Curino, Karthik Kambatla, Subramaniam Krishnan, Roni Burd)

      BTW, in YARN-6251 we already have an asyncReleaseContainer() in the AbstractYarnScheduler and a corresponding scheduler event. It is currently used only for the container-update code paths (where the scheduler releases the temp containers that it creates for the update).
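
      For discussion, here is a minimal, YARN-independent sketch of the general pattern: callers enqueue a release event and return immediately, and a single dispatcher thread takes the scheduler lock once per container rather than holding it for the whole bulk release. Every name in it (AsyncReleaseSketch, asyncRelease, schedulerLock, and so on) is hypothetical and chosen for illustration only. It is not the patch attached to this issue and not the actual AbstractYarnScheduler code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: none of these names are real YARN classes or methods. */
final class AsyncReleaseSketch {
  // Sentinel compared by identity; tells the dispatcher to stop.
  private static final String POISON = new String("stop");

  // Hypothetical stand-in for the scheduler's event queue.
  private final BlockingQueue<String> events = new LinkedBlockingQueue<>();
  // Hypothetical stand-in for the scheduler-wide lock.
  private final Object schedulerLock = new Object();
  private final AtomicInteger released = new AtomicInteger();
  private final Thread dispatcher;

  AsyncReleaseSketch() {
    dispatcher = new Thread(this::drain, "release-dispatcher");
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  /** Caller side: enqueue the release and return without taking the lock. */
  void asyncRelease(String containerId) {
    events.add(containerId);
  }

  /** Dispatcher side: process one release event at a time. */
  private void drain() {
    try {
      while (true) {
        String id = events.take();
        if (id == POISON) {
          return;
        }
        // The lock is held per container, so other scheduler work can
        // interleave between releases instead of waiting for the whole batch.
        synchronized (schedulerLock) {
          released.incrementAndGet(); // placeholder for the real release work
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Stops the dispatcher once every previously queued release is processed. */
  void shutdownAndWait() {
    events.add(POISON);
    try {
      dispatcher.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  int releasedCount() {
    return released.get();
  }

  public static void main(String[] args) {
    AsyncReleaseSketch scheduler = new AsyncReleaseSketch();
    // Simulates an application releasing many containers in bulk.
    for (int i = 0; i < 10_000; i++) {
      scheduler.asyncRelease("container_" + i);
    }
    scheduler.shutdownAndWait();
    System.out.println("released " + scheduler.releasedCount());
  }
}
```

      The trade-off in a design like this is that a release is no longer complete when the caller returns, so anything that depends on the freed resources has to tolerate a short delay.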


        1. YARN-7086.Perf-test-case.patch
          7 kB
          Manikandan R
        2. YARN-7086.002.patch
          38 kB
          Manikandan R
        3. YARN-7086.001.patch
          22 kB
          Manikandan R



            • Assignee:
              manirajv06@gmail.com Manikandan R
              asuresh Arun Suresh
            • Votes:
              0
            • Watchers:
              12


              • Created: