  1. Hadoop YARN
  2. YARN-5241

FairScheduler repeat container completed

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.5.0, 2.6.1, 2.8.0, 2.7.2
    • Fix Version/s: None
    • Component/s: fairscheduler
    • Labels: Patch, Important

    Description

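      A NodeManager heartbeat (the NODE_UPDATE event) and an ApplicationMaster allocate operation can both report the same container as completed, so the container-completed handling may run more than once for one container. This can lead to incorrect scheduler behavior.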

      The node's releaseContainer already prevents a repeated release operation, for example:

      public synchronized void releaseContainer(Container container) {
        if (!isValidContainer(container.getId())) {
          LOG.error("Invalid container released " + container);
          return;
        }
        ...
      }

      FSAppAttempt's containerCompleted has no such check, so it does not prevent a repeated container-completed operation.

      Detailed logs are in the attached file.
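
      The kind of guard described above can be illustrated with a minimal, self-contained sketch. It is not the attached patch and not the real FSAppAttempt code: the class CompletionGuardSketch, its methods, and the string container ids are all hypothetical names chosen for illustration. The idea is only that a completion for a container that is no longer tracked as live gets logged and ignored, so resource accounting is adjusted at most once per container.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an application attempt; NOT the real FSAppAttempt. */
class CompletionGuardSketch {
    // Live containers keyed by id, mapped to the memory (in MB) each one holds.
    private final Map<String, Integer> liveContainers = new HashMap<>();
    private int usedMemoryMb = 0;

    /** Tracks a newly allocated container; a duplicate id is ignored. */
    public synchronized void containerAllocated(String containerId, int memoryMb) {
        if (liveContainers.putIfAbsent(containerId, memoryMb) == null) {
            usedMemoryMb += memoryMb;
        }
    }

    /**
     * Marks a container as completed.
     * Returns true if this call performed the completion, false if the
     * container was unknown or had already been completed.
     */
    public synchronized boolean containerCompleted(String containerId) {
        Integer memoryMb = liveContainers.remove(containerId);
        if (memoryMb == null) {
            // Guard: a second completion (e.g. heartbeat after allocate) is a no-op.
            System.err.println("Ignoring repeated or unknown completion of " + containerId);
            return false;
        }
        usedMemoryMb -= memoryMb;
        return true;
    }

    public synchronized int getUsedMemoryMb() {
        return usedMemoryMb;
    }

    public static void main(String[] args) {
        CompletionGuardSketch app = new CompletionGuardSketch();
        app.containerAllocated("container_01", 1024);
        System.out.println(app.containerCompleted("container_01")); // first completion
        System.out.println(app.containerCompleted("container_01")); // repeated completion
        System.out.println(app.getUsedMemoryMb());
    }
}
```

      Without the null check, the second call would subtract the container's memory again and leave the usage counter negative, which is the sort of inconsistency a repeated completion can cause.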

      Attachments

        1. repeatContainerCompleted.log
          2 kB
          ChenFolin
        2. YARN-5241.004.patch
          3 kB
          Daniel Templeton
        3. YARN-5241-001.patch
          3 kB
          ChenFolin
        4. YARN-5241-002.patch
          3 kB
          ChenFolin
        5. YARN-5241-003.patch
          3 kB
          ChenFolin

          People

            Assignee: Unassigned
            Reporter: ChenFolin
            Votes: 0
            Watchers: 5