Hadoop YARN / YARN-7818

Remove privileged operation warnings during container launch for the ContainerRuntimes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: None
    • Labels: None

      Description

      Steps to reproduce:
      1) Run the distributed shell (DShell) application:

      yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/hdp/3.0.0.0-751/hadoop-yarn/hadoop-yarn-applications-distributedshell-*.jar -keep_containers_across_application_attempts -timeout 900000 -shell_command "sleep 110" -num_containers 4

      2) Find the host where the AM is running.
      3) Find the containers launched by the application.
      4) Restart the NM on the host where the AM is running.
      5) Validate that no new application attempt is started and that the containers launched before the restart are still in the RUNNING state.
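
The check in step 5 can be written as a small predicate. This is a minimal sketch, not part of any actual test suite: the function name `work_preserved`, its parameters, and the sample values are hypothetical, and it assumes the attempt ID and the per-container states have already been collected before and after the NM restart.

```python
def work_preserved(attempt_before, attempt_after, containers_before, states_after):
    """Return True if step 5 holds.

    attempt_before / attempt_after: application attempt IDs seen before and
        after the NM restart (hypothetical inputs).
    containers_before: IDs of the containers found in step 3.
    states_after: mapping of container ID -> state observed after the restart.
    """
    # A changed attempt ID means a new attempt was started.
    if attempt_before != attempt_after:
        return False
    # Every container seen before the restart must still be RUNNING;
    # a container missing from states_after counts as a failure.
    return all(states_after.get(cid) == "RUNNING" for cid in containers_before)


if __name__ == "__main__":
    before = ["container_e04_1516787230461_0001_01_000003"]
    after = {"container_e04_1516787230461_0001_01_000003": "KILLING"}
    # Same attempt, but the container is no longer RUNNING, so the check fails.
    print(work_preserved("attempt_1", "attempt_1", before, after))
```

Treating a missing container as a failure keeps the predicate strict: a container that disappears after the restart is reported the same way as one that left the RUNNING state.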

      In this test, step 5 fails because the containers fail to launch with exit code 143:

      2018-01-24 09:48:30,547 INFO  container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1516787230461_0001_01_000003 transitioned from RUNNING to KILLING
      2018-01-24 09:48:30,547 INFO  launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(668)) - Cleaning up container container_e04_1516787230461_0001_01_000003
      2018-01-24 09:48:30,552 WARN  privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(174)) - Shell execution returned exit code: 143. Privileged Execution Operation Stderr:
      
      Stdout: main : command provided 1
      main : run as user is hrt_qa
      main : requested yarn user is hrt_qa
      Getting exit code file...
      Creating script paths...
      Writing pid file...
      Writing to tmp file /grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.pid.tmp
      Writing to cgroup task files...
      Creating local dirs...
      Launching container...
      Getting exit code file...
      Creating script paths...
      
      Full command array for failed execution:
      [/usr/hdp/3.0.0.0-751/hadoop-yarn/bin/container-executor, hrt_qa, hrt_qa, 1, application_1516787230461_0001, container_e04_1516787230461_0001_01_000003, /grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003, /grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/launch_container.sh, /grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.tokens, /grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.pid, /grid/0/hadoop/yarn/local, /grid/0/hadoop/yarn/log, cgroups=none]
      2018-01-24 09:48:30,553 WARN  runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(127)) - Launch container failed. Exception:
      org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=143:
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:124)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:152)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:549)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:465)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:285)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:95)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: ExitCodeException exitCode=143:
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
              at org.apache.hadoop.util.Shell.run(Shell.java:902)
              at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
              ... 10 more
      2018-01-24 09:48:30,553 WARN  nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(557)) - Exit code from container container_e04_1516787230461_0001_01_000003 is : 143
      2018-01-24 09:48:30,582 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:stopContainerInternal(1365)) - Stopping container with container Id: container_e04_1516787230461_0001_01_000005
      2018-01-24 09:48:31,093 INFO  impl.TimelineV2ClientImpl (TimelineV2ClientImpl.java:setTimelineCollectorInfo(172)) - Updated timeline service address to xxxxxx:40757
      2018-01-24 09:48:32,675 INFO  container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1516787230461_0001_01_000003 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
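
Exit code 143 follows the common shell convention of 128 plus a signal number, so it corresponds to signal 15 (SIGTERM). Read together with the RUNNING to KILLING transition logged 5 ms before the first warning, the "Launch container failed" message looks like a side effect of the container being terminated rather than a genuine launch failure. The sketch below illustrates that reading. It is an illustration under stated assumptions, not the code from the attached patches: `signal_from_exit_code`, `should_warn`, and the policy of staying quiet for signal-induced exits are hypothetical.

```python
import signal

# Signals that indicate a deliberate kill rather than a launch problem
# (assumption made for this sketch).
QUIET_SIGNALS = {"SIGTERM", "SIGKILL"}


def signal_from_exit_code(exit_code):
    """Map a shell-style exit code (128 + N) to a signal name, or None."""
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


def should_warn(exit_code):
    """Decide whether a non-zero exit code deserves a WARN-level message."""
    if exit_code == 0:
        return False
    return signal_from_exit_code(exit_code) not in QUIET_SIGNALS


if __name__ == "__main__":
    print(signal_from_exit_code(143))  # SIGTERM on POSIX systems
    print(should_warn(143))            # False: the container was terminated
    print(should_warn(1))              # True: an ordinary failure
```

Keeping the mapping separate from the logging decision means the set of "expected" signals can be adjusted without touching the decoding logic.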

        Attachments

        1. YARN-7818.002.patch
          4 kB
          Shane Kumpf
        2. YARN-7818.001.patch
          2 kB
          Shane Kumpf

            People

            • Assignee: Shane Kumpf (shanekumpf@gmail.com)
            • Reporter: Yesha Vora (yeshavora)