Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 2.3.0, 3.0.0-alpha1
Description
I discovered the problem when running the unit test TestMRJobClient on Windows. The cause is indirect in this case. The unit test launches a job and then lists its status. The job failed, which caused the list command to return a result of 0, and that in turn triggered the unit test's assertion. According to the log and debugging, the job failed because ContainerLaunch could not create the Jar with the classpath (see the code around FileUtil.createJarWithClassPath). This step is currently Windows-specific, so the test still passes on Linux. The step failed because a relative path was passed to FileContext.globStatus() in FileUtil.createJarWithClassPath. The relevant log follows.
2013-08-12 16:12:05,937 WARN [ContainersLauncher #0] launcher.ContainerLaunch (ContainerLaunch.java:call(270)) - Failed to launch container.
org.apache.hadoop.HadoopIllegalArgumentException: Path is relative
    at org.apache.hadoop.fs.Path.checkNotRelative(Path.java:74)
    at org.apache.hadoop.fs.FileContext.getFSofPath(FileContext.java:304)
    at org.apache.hadoop.fs.Globber.schemeFromPath(Globber.java:107)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:128)
    at org.apache.hadoop.fs.FileContext$Util.globStatus(FileContext.java:1908)
    at org.apache.hadoop.fs.FileUtil.createJarWithClassPath(FileUtil.java:1247)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.sanitizeEnv(ContainerLaunch.java:679)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:232)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
I think this is a regression from HADOOP-9817. I modified some code and the unit test passed. (See the attached patch.) However, I think the impact is larger. I will add some unit tests to verify the behavior, and work on a more complete fix.
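The failure mode described above can be illustrated without Hadoop on the classpath. The following is only a minimal sketch using java.nio.file, not the attached patch and not Hadoop's actual code: the class name RelativeGlobSketch and the helper qualify are hypothetical, and checkNotRelative here is a stand-in that merely mirrors the kind of check seen in the stack trace. It shows a relative path being rejected, and the same path passing once it is resolved against a working directory first.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativeGlobSketch {

    // Stand-in for an "absolute paths only" check (hypothetical, not Hadoop's code).
    static void checkNotRelative(Path p) {
        if (!p.isAbsolute()) {
            throw new IllegalArgumentException("Path is relative");
        }
    }

    // Hypothetical helper: resolve a relative path against the given
    // working directory; absolute paths are returned unchanged.
    static Path qualify(Path workingDir, Path p) {
        return p.isAbsolute() ? p : workingDir.resolve(p).normalize();
    }

    public static void main(String[] args) {
        Path workingDir = Paths.get("").toAbsolutePath();
        Path entry = Paths.get("lib", "app.jar"); // illustrative relative classpath entry

        try {
            checkNotRelative(entry); // rejected: the path is relative
        } catch (IllegalArgumentException e) {
            System.out.println("unqualified: " + e.getMessage());
        }

        Path qualified = qualify(workingDir, entry);
        checkNotRelative(qualified); // accepted: the path is now absolute
        System.out.println("qualified: " + qualified);
    }
}
```

Whether the qualification belongs in the caller or inside the glob code is a design question this sketch does not settle; it only shows why the order of "qualify" and "check" matters.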
Attachments
Issue Links
- relates to: HADOOP-9817 FileSystem#globStatus and FileContext#globStatus need to work with symlinks (Closed)
- requires: HDFS-5093 TestGlobPaths should re-use the MiniDFSCluster to avoid failure on Windows (Closed)