  1. Hadoop Common
  2. HADOOP-12441

Fix kill command behavior under some Linux distributions.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      After HADOOP-12317, the kill command fails on Ubuntu 12. After the NM restarts, it cannot use a container's pid to determine whether the process is still alive, and it cannot kill the process correctly when the RM/AM tells the NM to kill a container.
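
      For illustration only, here is a minimal sketch of how a liveness check or kill command for a container pid could be built so that it does not depend on how the external kill binary parses "--". It is not the attached patch and not Hadoop's Shell API: the class KillCommandSketch and the method signalCommand are hypothetical names, and the sketch assumes that bash is installed and that its builtin kill accepts "--" as the end-of-options marker. Signal 0 delivers nothing and only reports whether the target exists; a negative pid addresses the whole process group.

```java
import java.util.Arrays;

public class KillCommandSketch {

    /**
     * Builds a command that sends {@code signal} to {@code pid}.
     * Hypothetical helper for illustration; not part of Hadoop.
     *
     * @param signal       signal number; 0 only checks that the target exists
     * @param pid          decimal process id of the container process
     * @param processGroup true to address the whole process group (-pid)
     */
    public static String[] signalCommand(int signal, String pid, boolean processGroup) {
        if (signal < 0) {
            throw new IllegalArgumentException("signal must be >= 0: " + signal);
        }
        // The pid ends up inside a shell command line, so accept digits only.
        if (pid == null || !pid.matches("\\d+")) {
            throw new IllegalArgumentException("pid must be numeric: " + pid);
        }
        // "--" ends option parsing so that "-1234" is read as a process
        // group and not as an option. Running the line through "bash -c"
        // uses the shell builtin kill instead of the external kill binary
        // (assumption: the builtin handles "--").
        String target = processGroup ? "-- -" + pid : pid;
        return new String[] {"bash", "-c", "kill -" + signal + " " + target};
    }

    public static void main(String[] args) {
        // Liveness check for the process group led by the made-up pid 1234.
        String[] checkAlive = signalCommand(0, "1234", true);
        System.out.println(Arrays.toString(checkAlive));
        // prints: [bash, -c, kill -0 -- -1234]
    }
}
```

      The returned array can be handed to java.lang.ProcessBuilder; an exit code of 0 from a signal 0 command means the process (or group) exists and may be signalled by the current user.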

      Logs from NM (customized logs):

      2015-09-25 21:58:59,348 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(431)) -  ================== check alive cmd:[[Ljava.lang.String;@496e442d]
      2015-09-25 21:58:59,349 INFO  nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=hrt_qa       IP=10.0.1.14    OPERATION=Stop Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1443218269460_0001    CONTAINERID=container_1443218269460_0001_01_000001
      2015-09-25 21:58:59,363 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(438)) -  ===========================
      ExitCodeException exitCode=1: ERROR: garbage process ID "--".
      Usage:
        kill pid ...              Send SIGTERM to every process listed.
        kill signal pid ...       Send a signal to every process listed.
        kill -s signal pid ...    Send a signal to every process listed.
        kill -l                   List all signal names.
        kill -L                   List all signal names in a nice table.
        kill -l signal            Convert between signal numbers and names.
      
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:550)
              at org.apache.hadoop.util.Shell.run(Shell.java:461)
              at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:727)
              at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:432)
              at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
              at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
              at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
              at java.lang.Thread.run(Thread.java:745)
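
      Note that the first log line above prints the command as [Ljava.lang.String;@496e442d, which is the default toString() of a String[], so the actual arguments are not visible in the log. A small sketch of logging the array contents instead follows; the class LogCommandSketch, the method describe, and the sample command are hypothetical and only illustrate the formatting.

```java
import java.util.Arrays;

public class LogCommandSketch {

    /** Formats a command array so that each argument is visible in the log. */
    public static String describe(String[] command) {
        // Arrays.toString prints the elements; String[].toString() would
        // print only the type and identity hash.
        return "check alive cmd:" + Arrays.toString(command);
    }

    public static void main(String[] args) {
        // Made-up command for illustration; the real arguments are not
        // recoverable from the log in this issue.
        String[] command = {"kill", "-0", "--", "-1234"};
        System.out.println(describe(command));
        // prints: check alive cmd:[kill, -0, --, -1234]
    }
}
```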
      

        Attachments

        1. HADOOP-12441.1.patch
          7 kB
          Wangda Tan
        2. HADOOP-12441.2.patch
          9 kB
          Wangda Tan
        3. HADOOP-12441.4.patch
          6 kB
          Wangda Tan

              People

              • Assignee:
                leftnoteasy Wangda Tan
              • Reporter:
                leftnoteasy Wangda Tan
              • Votes:
                0
              • Watchers:
                8
