Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.3, 2.0.2-alpha
    • Fix Version/s: 2.0.3-alpha, 0.23.5
    • Component/s: None
    • Labels:
      None

      Description

      We found some jobs stuck in KILL_WAIT for days on end. The RM shows them as RUNNING. The AM shows the job in the KILL_WAIT state with a few maps still running. All of these maps were scheduled on nodes that are now in the RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state.
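
      The hang described above can be pictured with a toy state model. This is only a sketch, not Hadoop's actual code: the class KillWaitSketch, its enums, and both methods are hypothetical. It assumes (the report does not confirm this) that a job leaves KILL_WAIT only once every attempt is terminal, and that an attempt leaves FAIL_CONTAINER_CLEANUP only when a cleanup confirmation arrives. Under those assumptions, a confirmation that never comes from a lost node keeps the job in KILL_WAIT indefinitely.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class KillWaitSketch {
    // Hypothetical, simplified attempt states; the names are borrowed from the report.
    enum AttemptState { RUNNING, FAIL_CONTAINER_CLEANUP, FAILED, KILLED }

    // Hypothetical, simplified job states.
    enum JobState { RUNNING, KILL_WAIT, KILLED }

    // States in which an attempt needs no further events.
    static final Set<AttemptState> TERMINAL =
            EnumSet.of(AttemptState.FAILED, AttemptState.KILLED);

    // Assumption: a job being killed leaves KILL_WAIT only when all attempts are terminal.
    static JobState afterKill(List<AttemptState> attempts) {
        boolean allDone = attempts.stream().allMatch(TERMINAL::contains);
        return allDone ? JobState.KILLED : JobState.KILL_WAIT;
    }

    // Assumption: only a cleanup confirmation moves an attempt out of FAIL_CONTAINER_CLEANUP.
    static AttemptState onCleanupConfirmed(AttemptState state) {
        return state == AttemptState.FAIL_CONTAINER_CLEANUP ? AttemptState.FAILED : state;
    }

    public static void main(String[] args) {
        // One attempt already killed, one waiting on cleanup from a lost node.
        List<AttemptState> attempts =
                Arrays.asList(AttemptState.KILLED, AttemptState.FAIL_CONTAINER_CLEANUP);
        System.out.println(afterKill(attempts)); // prints KILL_WAIT

        // If the confirmation did arrive, the job could finish being killed.
        attempts.set(1, onCleanupConfirmed(attempts.get(1)));
        System.out.println(afterKill(attempts)); // prints KILLED
    }
}
```

      In this model the only way out of the stuck state is an event that a lost node cannot send, which is why a timeout or a node-lost notification would be needed to unblock the job.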

      1. TaskAttemptStateGraph.jpg
        417 kB
        Ravi Prakash
      2. MAPREDUCE-4751-20121108.txt
        12 kB
        Vinod Kumar Vavilapalli
      3. MAPREDUCE-4751-20121109.txt
        22 kB
        Vinod Kumar Vavilapalli
      4. MR-4751-branch-0.23.txt
        22 kB
        Robert Joseph Evans

        Issue Links

          Activity

          Ravi Prakash created issue -
          Ravi Prakash made changes -
          Field Original Value New Value
          Description
            Original Value: We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a few maps running. All these maps were scheduled on nodes which are now in the RM's Lost nodes list.
            New Value: We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a few maps running. All these maps were scheduled on nodes which are now in the RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state
          Ravi Prakash made changes -
          Attachment TaskAttemptStateGraph.jpg [ 12550521 ]
          Vinod Kumar Vavilapalli made changes -
          Assignee Vinod Kumar Vavilapalli [ vinodkv ]
          Ravi Prakash made changes -
          Summary
            Original Value: AM stuck in KILL_WAIT for days when node is lost in the middle
            New Value: AM stuck in KILL_WAIT for days
          Vinod Kumar Vavilapalli made changes -
          Project Hadoop YARN [ 12313722 ] Hadoop Map/Reduce [ 12310941 ]
          Key YARN-167 MAPREDUCE-4751
          Affects Version/s 2.0.2-alpha [ 12322471 ]
          Affects Version/s 0.23.3 [ 12320060 ]
          Affects Version/s 0.23.3 [ 12322841 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue is blocked by MAPREDUCE-4748 [ MAPREDUCE-4748 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue duplicates MAPREDUCE-4744 [ MAPREDUCE-4744 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue duplicates MAPREDUCE-4745 [ MAPREDUCE-4745 ]
          Vinod Kumar Vavilapalli made changes -
          Attachment MAPREDUCE-4751-20121108.txt [ 12552750 ]
          Vinod Kumar Vavilapalli made changes -
          Attachment MAPREDUCE-4751-20121109.txt [ 12552951 ]
          Vinod Kumar Vavilapalli made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Robert Joseph Evans made changes -
          Attachment MR-4751-branch-0.23.txt [ 12553129 ]
          Robert Joseph Evans made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 3.0.0 [ 12320355 ]
          Fix Version/s 2.0.3-alpha [ 12323275 ]
          Fix Version/s 0.23.5 [ 12323312 ]
          Resolution Fixed [ 1 ]
          Thomas Graves made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Allen Wittenauer made changes -
          Fix Version/s 3.0.0 [ 12320355 ]

            People

            • Assignee:
              Vinod Kumar Vavilapalli
              Reporter:
              Ravi Prakash
            • Votes:
              0
              Watchers:
              9

              Dates

              • Created:
                Updated:
                Resolved:
