Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 2.9.0
- None
- None
- Reviewed
Description
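1. Enable LinuxContainerExecutor (LCE) and CGroups.
2. Submit a MapReduce job.
The NodeManager log then shows the deletion of the container directory failing: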
2016-02-24 18:56:46,889 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001
2016-02-24 18:56:46,894 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 255. Privileged Execution Operation Output:
main : command provided 3
main : run as user is dsperf
main : requested yarn user is dsperf
failed to rmdir job.jar: Not a directory
Error while deleting /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001: 20 (Not a directory)
Full command array for failed execution:
[/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, dsperf, dsperf, 3, /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001]
2016-02-24 18:56:46,894 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: DeleteAsUser for /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_000001 returned with exit code: 255
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569)
    at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: ExitCodeException exitCode=255:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
    at org.apache.hadoop.util.Shell.run(Shell.java:838)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
    ... 10 more
As a result, the NodeManager local directories are not deleted for each application:
total 36
drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./
drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../
-rw------- 1 hdfs hadoop 340 Feb 25 08:25 container_tokens
lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.jar -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/
lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.xml -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml*
drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/
-rwx------ 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh*
drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 tmp/
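The "failed to rmdir job.jar: Not a directory" message (errno 20) is consistent with rmdir being called on a symlink that points to a directory, which is what job.jar is in the listing above. The following is a minimal Python sketch of that behaviour on Linux, not the container-executor code itself; the helper names remove_entry and demo and the temporary directory layout are hypothetical and exist only for illustration. It assumes that a symlink should be removed with unlink, without following it, so that the link target is left alone.

```python
import errno
import os
import tempfile


def remove_entry(path):
    """Remove one directory entry without following symlinks (hypothetical helper)."""
    if os.path.islink(path) or not os.path.isdir(path):
        # Symlinks (even ones pointing at directories) and plain files are unlinked.
        os.unlink(path)
    else:
        # Only a real, empty directory is removed with rmdir.
        os.rmdir(path)


def demo():
    """Recreate a job.jar symlink-to-directory and try to delete it both ways."""
    with tempfile.TemporaryDirectory() as root:
        # Stand-in for the filecache directory that job.jar points to.
        target = os.path.join(root, "filecache", "job.jar")
        os.makedirs(target)

        # Stand-in for the container working directory holding the symlink.
        container = os.path.join(root, "container")
        os.mkdir(container)
        link = os.path.join(container, "job.jar")
        os.symlink(target, link)

        # Calling rmdir on the symlink fails instead of removing it.
        try:
            os.rmdir(link)
            rmdir_errno = None
        except OSError as exc:
            rmdir_errno = exc.errno

        # Unlinking the symlink succeeds and leaves the target directory intact.
        remove_entry(link)
        os.rmdir(container)
        return rmdir_errno, os.path.isdir(target), os.path.exists(container)


if __name__ == "__main__":
    code, target_survives, container_exists = demo()
    print(errno.errorcode.get(code), target_survives, container_exists)
```

Checking for a symlink before deciding between unlink and rmdir keeps the deletion from either failing on the link or descending into the shared filecache directory it points to.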
Attachments
Issue Links
- is broken by
  - YARN-4594 container-executor fails to remove directory tree when chmod required (Resolved)