  1. Hadoop Map/Reduce
  2. MAPREDUCE-3339

Job hangs indefinitely if the child processes are killed on the NM. The KILL_CONTAINER event type is continuously sent to containers that do not exist


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.1
    • Component/s: mrv2
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Fixed MR AM to stop considering node blacklisting after the number of nodes blacklisted crosses a threshold.
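
    The fix summarized above (ignore the node blacklist once too many nodes are on it) can be illustrated with a minimal sketch. This is not the actual MR AM code: the class BlacklistGate, its methods, the host names, and the 33 percent threshold are all hypothetical names and values chosen for illustration. It assumes the threshold is expressed as a percentage of the cluster's nodes. On a single-NM cluster like the one in this report, blacklisting that one node would presumably leave nowhere to schedule, which is the situation the threshold is meant to avoid.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not the real MR AM implementation. */
public class BlacklistGate {
    private final Set<String> blacklisted = new HashSet<>();
    private final int clusterNodeCount;
    // Assumed tunable: percentage of blacklisted nodes at which the blacklist is ignored.
    private final int ignoreThresholdPercent;

    public BlacklistGate(int clusterNodeCount, int ignoreThresholdPercent) {
        this.clusterNodeCount = clusterNodeCount;
        this.ignoreThresholdPercent = ignoreThresholdPercent;
    }

    /** Records a node as blacklisted, e.g. after repeated task failures on it. */
    public void blacklist(String host) {
        blacklisted.add(host);
    }

    /** True once the blacklisted share of the cluster reaches the threshold. */
    public boolean ignoringBlacklist() {
        if (clusterNodeCount == 0) {
            return false;
        }
        return blacklisted.size() * 100 / clusterNodeCount >= ignoreThresholdPercent;
    }

    /** A host is usable if it is not blacklisted, or if the blacklist is being ignored. */
    public boolean canSchedule(String host) {
        return ignoringBlacklist() || !blacklisted.contains(host);
    }

    public static void main(String[] args) {
        // Single-node cluster: the only node gets blacklisted.
        BlacklistGate gate = new BlacklistGate(1, 33);
        gate.blacklist("nm-host-1");
        System.out.println("single node still usable: " + gate.canSchedule("nm-host-1"));
    }
}
```

    In this sketch a single blacklisted node out of one is 100 percent of the cluster, so the blacklist is ignored and the node stays schedulable. On a larger cluster the blacklist would keep applying until the blacklisted share reaches the threshold.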

    Description

      I have only one NM running.
      I submitted a job, and all the child processes on the NM were killed continuously. This caused the job to hang indefinitely.

      The NM logs show this WARN message: org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1320301910500_0004_01_001359

      Attachments

        1. MAPREDUCE-3339-20111220.txt
          28 kB
          Vinod Kumar Vavilapalli
        2. MR3339_v1.txt
          18 kB
          Siddharth Seth
        3. MR3339_v2.txt
          27 kB
          Siddharth Seth

          People

            Assignee: Siddharth Seth (sseth)
            Reporter: Ramgopal N (ramgopalnaali)
            Votes: 0
            Watchers: 2
