Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
The jobSubmitDir directory is owned by root, but cleanup runs as the submitting user, which appears to be why the deletion fails.
2018-05-21 19:46:15,124 WARN [DeletionService #0] privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(174)) - Shell execution returned exit code: 255. Privileged Execution Operation Stderr:
Stdout: main : command provided 3
main : run as user is ebadger
main : requested yarn user is ebadger
failed to unlink /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001/jobSubmitDir/job.split: Permission denied
failed to unlink /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001/jobSubmitDir/job.splitmetainfo: Permission denied
failed to rmdir jobSubmitDir: Directory not empty
Error while deleting /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001: 39 (Directory not empty)
Full command array for failed execution:
[/hadoop-3.2.0-SNAPSHOT/bin/container-executor, ebadger, ebadger, 3, /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001]
2018-05-21 19:46:15,124 ERROR [DeletionService #0] nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(848)) - DeleteAsUser for /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001 returned with exit code: 255
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:206)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:844)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.FileDeletionTask.run(FileDeletionTask.java:135)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
    at org.apache.hadoop.util.Shell.run(Shell.java:902)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
    ... 10 more
[foo@bar hadoop]$ ls -l /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_000001/
total 4
drwxr-sr-x 2 root users 4096 May 21 19:45 jobSubmitDir
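The listing above is consistent with the unlink failures in the log: removing an entry requires write and search permission on its parent directory, and jobSubmitDir (drwxr-sr-x, owned by root) grants write only to root. As a rough diagnostic sketch, not part of Hadoop or container-executor, the following Python walks a container directory and reports non-empty directories that a given uid could not empty, judged from the mode bits alone. The function names are hypothetical, and the check deliberately ignores ACLs, the sticky bit, and capabilities.

```python
import os
import stat


def can_modify_dir(st, uid, gids):
    """Return True if the mode bits grant write+search on a directory.

    Simplified model (an assumption): uid 0 always passes; ACLs, the
    sticky bit, and Linux capabilities are not considered.
    """
    if uid == 0:
        return True
    if st.st_uid == uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif st.st_gid in gids:
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH
    return (st.st_mode & needed) == needed


def find_blocking_dirs(root, uid, gids):
    """List non-empty directories under root whose entries uid cannot unlink."""
    blocked = []
    for dirpath, dirnames, filenames in os.walk(root):
        st = os.lstat(dirpath)
        # An empty directory is removed via its parent, so only
        # directories that still hold entries are reported here.
        if (dirnames or filenames) and not can_modify_dir(st, uid, gids):
            blocked.append(dirpath)
    return blocked
```

On the node from this report, one could call find_blocking_dirs on the container directory with the uid and group ids of the submitting user (ebadger); under the assumptions above, any directory it returns would be one where a delete run as that user fails with "Permission denied".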
Issue Links
- duplicates: YARN-7904 Privileged, trusted containers should be supported only in ENTRYPOINT mode (Resolved)