Hadoop YARN / YARN-4731

container-executor should not follow symlinks in recursive_unlink_children

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.9.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha1, 2.8.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

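      Steps to reproduce:

      1. Enable the LinuxContainerExecutor (LCE) and CGroups.
      2. Submit a MapReduce job.

      Container cleanup then fails in the NodeManager log: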

      2016-02-24 18:56:46,889 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001
      2016-02-24 18:56:46,894 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 255. Privileged Execution Operation Output:
      main : command provided 3
      main : run as user is dsperf
      main : requested yarn user is dsperf
      failed to rmdir job.jar: Not a directory
      Error while deleting /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001: 20 (Not a directory)
      Full command array for failed execution:
      [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, dsperf, dsperf, 3, /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001]
      2016-02-24 18:56:46,894 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: DeleteAsUser for /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001 returned with exit code: 255
      org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199)
              at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569)
              at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: ExitCodeException exitCode=255:
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
              at org.apache.hadoop.util.Shell.run(Shell.java:838)
              at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
              ... 10 more
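
The "failed to rmdir job.jar: Not a directory" line and the closing "20 (Not a directory)" are what rmdir reports when it is handed a symbolic link instead of a real directory. The sketch below reproduces only that system-call behaviour in Python on a POSIX system. It is not the container-executor code, and the file and function names are made up for illustration.

```python
import errno
import os
import tempfile


def rmdir_errno(path):
    """Return the errno raised by os.rmdir(path), or 0 on success."""
    try:
        os.rmdir(path)
    except OSError as exc:
        return exc.errno
    return 0


def demo_rmdir_on_symlink():
    """Call rmdir on a symlink that points at a directory."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical names: 'target' stands in for the localized resource,
        # 'link' for the job.jar symlink inside the container directory.
        target = os.path.join(root, "filecache_job.jar")
        link = os.path.join(root, "job.jar")
        os.mkdir(target)
        os.symlink(target, link)

        code = rmdir_errno(link)

        # rmdir does not follow the link, so both the link and its
        # target are still present after the failed call.
        return code, os.path.islink(link), os.path.isdir(target)
```

Calling demo_rmdir_on_symlink() returns errno.ENOTDIR as the first element. On Linux that constant is 20, the same code that appears in the log.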
      
      

      Because the delete fails, the NodeManager local directories of each application are never removed. A leftover container directory looks like this:

      total 36
      drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./
      drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../
      -rw------- 1 hdfs hadoop  340 Feb 25 08:25 container_tokens
      lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.jar -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/
      lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.xml -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml*
      drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/
      -rwx------ 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh*
      drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 tmp/
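
In the listing above, job.jar and job.xml are symlinks that point into the shared filecache, outside the container directory. A recursive delete that follows them either fails, as in the log, or reaches files it should not touch. One common way to avoid this is to classify every entry with lstat semantics, open subdirectories with O_NOFOLLOW relative to the parent's file descriptor, and unlink symlinks as plain entries. The Python sketch below illustrates that general technique on a POSIX system. It is not the attached patch, and all function and file names are hypothetical.

```python
import os
import stat
import tempfile


def unlink_children_nofollow(dir_fd):
    """Delete everything inside the directory open at dir_fd without following symlinks."""
    for name in os.listdir(dir_fd):
        # follow_symlinks=False gives lstat semantics: a symlink is reported
        # as a symlink, never as the directory it points to.
        st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
        if stat.S_ISDIR(st.st_mode):
            # O_NOFOLLOW makes the open fail if the entry is a symlink.
            flags = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW
            child_fd = os.open(name, flags, dir_fd=dir_fd)
            try:
                unlink_children_nofollow(child_fd)
            finally:
                os.close(child_fd)
            os.rmdir(name, dir_fd=dir_fd)
        else:
            # Regular files and symlinks: remove the entry itself, not its target.
            os.unlink(name, dir_fd=dir_fd)


def delete_tree_nofollow(path):
    """Remove the directory at path and all of its contents, leaving symlink targets alone."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)
    try:
        unlink_children_nofollow(fd)
    finally:
        os.close(fd)
    os.rmdir(path)


def demo_delete_container_dir():
    """Build a layout similar to the listing and delete the container directory."""
    with tempfile.TemporaryDirectory() as root:
        cache = os.path.join(root, "filecache", "11", "job.jar")
        container = os.path.join(root, "container_dir")
        os.makedirs(cache)
        os.makedirs(os.path.join(container, "tmp"))
        payload = os.path.join(cache, "payload.class")
        with open(payload, "w") as f:
            f.write("localized resource")
        with open(os.path.join(container, "container_tokens"), "w") as f:
            f.write("tokens")
        os.symlink(cache, os.path.join(container, "job.jar"))

        delete_tree_nofollow(container)

        # Report whether the container directory still exists and whether
        # the file behind the symlink survived.
        return os.path.exists(container), os.path.isfile(payload)
```

In this sketch the container directory is removed while the file behind the job.jar link is left in place, because the link is unlinked as an ordinary entry and never descended into.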
      

      Attachments

        1. YARN-4731.002.patch (7 kB, Colin McCabe)
        2. YARN-4731.001.patch (2 kB, Varun Vasudev)

            People

              Assignee: Colin McCabe (cmccabe)
              Reporter: Bibin Chundatt (bibinchundatt)