Hadoop YARN / YARN-73

nodemanager should cleanup running containers when it starts

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.23.3
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels:
      None

      Description

      Currently the nodemanager doesn't clean up running containers when it gets restarted. This can cause containers to get lost and stick around forever. We've seen this happen multiple times when the RM is restarted. When the RM is brought back up, it doesn't know what was running on the cluster, so it tells the NMs to reboot, and when an NM reboots it loses track of what it had running. If any containers are behaving badly, no one is left that knows about them to kill them.

      We should kill any running containers when the nodemanager is being started. Note that when the NM is being brought up it needs to somehow figure out what containers were running and be sure it doesn't kill anything it shouldn't.
      Note, we should also try to kill any running containers when the node manager is shutting down (jira 4213 was filed for this).
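
      One way the NM could "figure out what containers were running" is to persist a small record per container at launch and read the leftovers back at startup. The following is a minimal sketch in Python, not NodeManager code; the state directory layout and the names `record_launch`, `record_exit`, and `leftover_containers` are hypothetical and only illustrate the idea.

```python
import json
import os
from pathlib import Path


def record_launch(state_dir: Path, container_id: str, pid: int) -> None:
    """Persist one small file per container at launch time (hypothetical layout)."""
    state_dir.mkdir(parents=True, exist_ok=True)
    tmp = state_dir / f"{container_id}.tmp"
    tmp.write_text(json.dumps({"container_id": container_id, "pid": pid}))
    # Rename into place so a crash never leaves a half-written record behind.
    os.replace(tmp, state_dir / f"{container_id}.json")


def record_exit(state_dir: Path, container_id: str) -> None:
    """Forget a container once it has exited normally."""
    (state_dir / f"{container_id}.json").unlink(missing_ok=True)


def leftover_containers(state_dir: Path) -> dict:
    """At startup: any record still on disk was running when the NM last stopped."""
    if not state_dir.is_dir():
        return {}
    found = {}
    for path in sorted(state_dir.glob("*.json")):
        entry = json.loads(path.read_text())
        found[entry["container_id"]] = entry["pid"]
    return found
```

      A recorded PID alone is not proof that the process is still the same container, so any kill decision would need further checks before acting on these records.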

      This might change a bit when RM restart is implemented if tasks can actually survive across RM/NM being rebooted, but that can be addressed at that point.

          Activity

          Transition: Open to Resolved
          Time In Source Status: 375d 10h 30m
          Execution Times: 1
          Last Executer: Vinod Kumar Vavilapalli
          Last Execution Date: 12/May/13 01:10
          Vinod Kumar Vavilapalli made changes -
          Link This issue duplicates YARN-438 [ YARN-438 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue is duplicated by YARN-438 [ YARN-438 ]
          Vinod Kumar Vavilapalli made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue is duplicated by YARN-495 [ YARN-495 ]
          Vinod Kumar Vavilapalli added a comment -

          With YARN-495 in, we changed NM reboot behaviour to be a simple resync - kill all containers and re-register with RM.

          So in sum, YARN-72 cleans up containers on shutdown, YARN-495 does so on resync.

          There is still a case where an operator issues a shutdown but NM_SLEEP_DELAY_BEFORE_SIGKILL_MS + NM_PROCESS_KILL_WAIT_MS + SHUTDOWN_CLEANUP_SLOP_MS is not enough time to clean up all containers. We can make the latter configurable, or mandate that operators kill containers explicitly in that case.
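
          To make the timing concern concrete, here is a minimal sketch in Python (not the NodeManager's Java code) of a shutdown wait bounded by the sum of the three values. The millisecond numbers are illustrative placeholders, not confirmed defaults, and `shutdown_budget_ms` and `wait_for_cleanup` are hypothetical helpers.

```python
import time

# Illustrative placeholder values in milliseconds (assumptions, not confirmed defaults).
NM_SLEEP_DELAY_BEFORE_SIGKILL_MS = 250
NM_PROCESS_KILL_WAIT_MS = 2000
SHUTDOWN_CLEANUP_SLOP_MS = 1000


def shutdown_budget_ms(sigkill_delay_ms: int, kill_wait_ms: int, slop_ms: int) -> int:
    """Total time the shutdown path is willing to wait for containers to exit."""
    return sigkill_delay_ms + kill_wait_ms + slop_ms


def wait_for_cleanup(still_running, budget_ms, poll_ms=50,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until no containers remain or the budget expires.

    still_running: callable returning the set of container ids still alive.
    Returns the leftover set; it is empty if cleanup finished within the budget.
    """
    deadline = clock() + budget_ms / 1000.0
    remaining = still_running()
    while remaining and clock() < deadline:
        sleep(poll_ms / 1000.0)
        remaining = still_running()
    return remaining
```

          Anything returned by `wait_for_cleanup` is a container that outlived the budget, which is the case where either a larger configurable slop or an explicit operator kill would be needed.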

          Closing this as a duplicate.

          Hitesh Shah made changes -
          Link This issue is related to YARN-71 [ YARN-71 ]
          Hitesh Shah made changes -
          Link This issue is duplicated by YARN-438 [ YARN-438 ]
          Hitesh Shah added a comment -

          Clarification for previous comment: YARN-72 only covers killing of containers on shutdown.

          Ahmed Radwan added a comment -

          Is the objective here to cover only the cleanup during startup? If so, this issue should be a subtask of YARN-72, as that issue covers cleaning up containers before shutdown and also during startup.

          Karthik Kambatla (Inactive) made changes -
          Link This issue is related to YARN-72 [ YARN-72 ]
          Robert Joseph Evans added a comment -

          Kihwal Lee also mentioned to me that we could do a best effort in a JVM shutdown hook, and then have this as a backup for anything that we were not able to kill, which seems very reasonable.

          Robert Joseph Evans added a comment -

          I don't think there is a built-in way to do this on Linux/Unix, but I could be wrong. You also have to take into account the fact that the original process (PID) may spawn off other processes that need to be killed as well (like Streaming). The best way I can think of to do this is to save the PID, the process group ID, and the user that launched that PID. If the PID is still running, is part of the same process group, and is owned by the same user, shoot it.

          We should also make sure that it is pluggable. This is because once we switch over to using Linux Containers, VMs, or some other form of isolation, we can more easily tell which processes are part of that group and shoot them.
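
          The check described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not NodeManager code: `ContainerRecord`, `linux_lookup`, `should_kill`, and `reap` are hypothetical names, and the default lookup assumes a Linux `/proc` filesystem. The lookup and the kill action are passed in as parameters, which is one simple way to keep the mechanism pluggable.

```python
import os
import signal
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ContainerRecord:
    """What would be saved at launch time (hypothetical layout)."""
    pid: int
    pgid: int
    uid: int


# A lookup returns (pgid, uid) for a live PID, or None if the PID is gone.
ProcessLookup = Callable[[int], Optional[Tuple[int, int]]]


def linux_lookup(pid: int) -> Optional[Tuple[int, int]]:
    """Default lookup: process group via getpgid, owner via /proc (Linux-only)."""
    try:
        return os.getpgid(pid), os.stat(f"/proc/{pid}").st_uid
    except OSError:  # covers a vanished PID and a missing /proc entry
        return None


def should_kill(record: ContainerRecord, lookup: ProcessLookup = linux_lookup) -> bool:
    """True only if the PID is alive and still matches the saved group and owner."""
    live = lookup(record.pid)
    return live is not None and live == (record.pgid, record.uid)


def reap(record: ContainerRecord, lookup: ProcessLookup = linux_lookup,
         kill_group: Callable[[int, int], None] = os.killpg) -> bool:
    """Signal the whole process group so spawned children are killed too."""
    if not should_kill(record, lookup):
        return False
    kill_group(record.pgid, signal.SIGKILL)
    return True
```

          A different isolation mechanism could supply its own `lookup` and `kill_group` without changing the decision logic.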

          Bikas Saha added a comment -

          PIDs might get reused, so we need to guard against that.
          Is there any generic Linux OS mechanism that would make the running containers die when the NM dies? For example, on Windows the NM can use Job Objects to make this happen.
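
          One common guard against PID reuse on Linux is to save the process start time alongside the PID and compare it before killing. The sketch below is an assumption-laden illustration in Python, not something proposed in this thread: `parse_start_time`, `read_start_time`, and `is_same_process` are hypothetical names, and the code relies on field 22 (`starttime`) of `/proc/<pid>/stat`.

```python
from typing import Callable, Optional


def parse_start_time(stat_line: str) -> int:
    """Return field 22 (starttime, in clock ticks since boot) of a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so cut at the LAST ')' before tokenizing the remaining fields.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return int(rest[19])  # rest[0] is field 3, so field 22 is rest[19]


def read_start_time(pid: int) -> Optional[int]:
    """Read the live start time for a PID, or None if the process is gone (Linux-only)."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return parse_start_time(f.read())
    except OSError:
        return None


def is_same_process(pid: int, saved_start_time: int,
                    reader: Callable[[int], Optional[int]] = read_start_time) -> bool:
    """True only if the PID is alive and was started at the recorded time."""
    live = reader(pid)
    return live is not None and live == saved_start_time
```

          A reused PID would belong to a process with a different start time, so the comparison fails and the unrelated process is left alone.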

          Vinod Kumar Vavilapalli made changes -
          Field Original Value New Value
          Project Hadoop Map/Reduce [ 12310941 ] Hadoop YARN [ 12313722 ]
          Key MAPREDUCE-4214 YARN-73
          Affects Version/s 0.23.3 [ 12322841 ]
          Affects Version/s 0.23.3 [ 12320060 ]
          Target Version/s 0.23.3 [ 12320060 ]
          Component/s nodemanager [ 12319323 ]
          Component/s mrv2 [ 12314301 ]
          Component/s nodemanager [ 12315341 ]
          Thomas Graves created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Thomas Graves
            • Votes:
              0
              Watchers:
              14
