Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-7512 Support service upgrade via YARN Service API and CLI
  3. YARN-8194

Exception when reinitializing a container using LinuxContainerExecutor

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • None
    • 3.2.0, 3.1.1
    • None
    • None

    Description

      When a component instance is upgraded and the container executor is set to
      org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor, then the following exception is seen in the nodemanager:

      Writing to cgroup task files...
      Creating local dirs...
      Can't open /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh for output - File exists
      Getting exit code file...
      Creating script paths...
      
      Full command array for failed execution:
      [/usr/local/hadoop-3.2.0-SNAPSHOT/bin/container-executor, hbase, hbase, 1, application_1524242413029_0001, container_1524242413029_0001_01_000002, /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/container_1524242413029_0001_01_000002.tokens, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/container_1524242413029_0001_01_000002.pid, /tmp/hadoop-yarn/nm-local-dir, /usr/local/hadoop-3.2.0-SNAPSHOT/logs/userlogs, cgroups=none]
      2018-04-20 16:50:16,641 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Launch container failed. Exception:
      org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=33: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
      Could not create local files and directories
      
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:118)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:141)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:477)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:492)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:304)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:101)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: ExitCodeException exitCode=33: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
      Could not create local files and directories
      
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
              at org.apache.hadoop.util.Shell.run(Shell.java:902)
              at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
              ... 11 more
      2018-04-20 16:50:16,642 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1524242413029_0001_01_000002 is : 33
      2018-04-20 16:50:16,642 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception from container-launch with container ID: container_1524242413029_0001_01_000002 and exit code: 33
      org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:124)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:141)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:477)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:492)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:304)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:101)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
      2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1524242413029_0001_01_000002
      2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 33
      2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception message: Launch container failed
      2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Shell error output: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
      

      Attachments

        1. YARN-8194.001.patch
          6 kB
          Chandni Singh

        Issue Links

          Activity

            People

              csingh Chandni Singh
              csingh Chandni Singh
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: