  1. Hadoop YARN
  2. YARN-8346

Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.0.2
    • Fix Version/s: 3.1.0, 2.10.0, 3.2.0, 2.9.2, 3.0.3
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      During a rolling upgrade from 2.8.4 to 3.1, all running containers are killed and a second attempt is launched for the application. The container diagnostics message is "Opportunistic container queue is full", which is the reported reason for the kill.

      In the NM log, I see the following message after the container is recovered.

      2018-05-23 17:18:50,655 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: Opportunistic container [container_e06_1527075664705_0001_01_000001] will not be queued at the NMsince max queue length [0] has been reached
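
      The log line above suggests that the recovered container was handled as an opportunistic container and then rejected because the queue limit was 0. The following is a minimal sketch of that kind of check, not the actual ContainerScheduler code: the class OpportunisticQueueSketch, its admit method, and the local ExecutionType enum are hypothetical names used only for illustration, and the assumption is that an opportunistic container is rejected once the queue size reaches the configured maximum.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of a queue-length check; not the real NodeManager code. */
public class OpportunisticQueueSketch {
    /** Local stand-in for the container execution type (illustration only). */
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    private final int maxQueueLength;
    private final Queue<String> queued = new ArrayDeque<>();

    OpportunisticQueueSketch(int maxQueueLength) {
        this.maxQueueLength = maxQueueLength;
    }

    /** Returns true if the container is accepted, false if it is rejected. */
    boolean admit(String containerId, ExecutionType type) {
        if (type == ExecutionType.GUARANTEED) {
            // Assumption: guaranteed containers are not limited by the opportunistic queue.
            return true;
        }
        if (queued.size() >= maxQueueLength) {
            // With a limit of 0 this branch is always taken for opportunistic containers.
            return false;
        }
        queued.add(containerId);
        return true;
    }

    public static void main(String[] args) {
        OpportunisticQueueSketch nm = new OpportunisticQueueSketch(0);
        String id = "container_e06_1527075664705_0001_01_000001";
        System.out.println(nm.admit(id, ExecutionType.OPPORTUNISTIC)); // prints "false"
        System.out.println(nm.admit(id, ExecutionType.GUARANTEED));    // prints "true"
    }
}
```

      Under these assumptions, a limit of 0 rejects every container that is treated as opportunistic, which would match the "max queue length [0]" value in the log.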
      

      The following steps were executed for the rolling upgrade:

      1. Install a 2.8.4 cluster and launch an MR job with distributed cache enabled.
      2. Stop the 2.8.4 RM. Start the 3.1.0 RM with the same configuration.
      3. Stop the 2.8.4 NMs batch by batch. Start the 3.1.0 NMs batch by batch.

              People

              • Assignee:
                Jason Lowe (jlowe)
              • Reporter:
                Rohith Sharma K S (rohithsharma)