Hadoop YARN / YARN-8671

Container launch fails with "TaskAttempt killed because it ran on unusable node, Container released on a *lost* node"

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:
      None

      Description

      Pre-requisites:

      1. Install an HA cluster.
      2. Set yarn.nodemanager.opportunistic-containers-max-queue-length to a positive integer value in the NodeManager's yarn-site.xml.
      3. Set yarn.resourcemanager.opportunistic-container-allocation.enabled=true in the ResourceManager's yarn-site.xml.
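
      As a sketch of the two settings above, the yarn-site.xml entries might look like the following. The queue length of 10 is an assumed illustrative value; the report only asks for a positive integer.

      ```xml
      <!-- NodeManager yarn-site.xml -->
      <property>
        <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
        <!-- assumed value for illustration; any positive integer -->
        <value>10</value>
      </property>

      <!-- ResourceManager yarn-site.xml -->
      <property>
        <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
        <value>true</value>
      </property>
      ```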
      

       

      Steps to reproduce:

      1. Keep all NodeManagers up.
      2. Stop 2 NodeManagers and immediately proceed to step 3.
      3. Submit a job with -Dmapreduce.job.num-opportunistic-maps-percent="100" and run it with 50 mappers.
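
      The report sets the opportunistic-maps percentage with -D on the command line. A possibly equivalent way to set it, assuming the submitting client reads its own mapred-site.xml, is sketched below; unlike -D, this would apply to every job submitted from that client.

      ```xml
      <!-- client-side mapred-site.xml (assumed alternative to passing -D at submission) -->
      <property>
        <name>mapreduce.job.num-opportunistic-maps-percent</name>
        <!-- request all map tasks as opportunistic containers -->
        <value>100</value>
      </property>
      ```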
      

      Expected Result:

      The job should complete successfully.

      Actual Result:

      The job completes successfully, but some containers fail with the message:

      "TaskAttempt killed because it ran on unusable node, Container released on a *lost* node"
      

       

      Log Details:

      TaskAttempt killed because it ran on unusable node
      Container released on a *lost* node
      Container launch failed for container_1534149133116_0019_01_000006 : java.net.ConnectException: Call From hostname/IP to hostname:portNumber failed on connection exception: java.net.ConnectException:
      

            People

            • Assignee:
              Unassigned
            • Reporter:
              akki261001 Akshay Agarwal
            • Votes:
              0
            • Watchers:
              3
