Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
While refining YARN-8714, I found that the YARN localizer appears to be able to handle a remote directory directly. FSDownload.java#downloadAndUnpack uses "FileUtil.copy", which can copy a directory. This ability was added by YARN-2185.
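To illustrate what "can handle a directory" means for a copy routine, here is a minimal sketch using only the JDK's java.nio.file API. It is not Hadoop's FileUtil.copy and does not reproduce its signature or options; the class name DirCopySketch and the method copy are hypothetical names chosen for this example. It only shows the general pattern: copy a plain file as-is, and recurse when the source is a directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; this is NOT Hadoop's FileUtil.
class DirCopySketch {

    /** Copies src to dst. If src is a directory, its whole tree is copied. */
    static void copy(Path src, Path dst) throws IOException {
        if (Files.isDirectory(src)) {
            // Create the target directory, then recurse into each child.
            Files.createDirectories(dst);
            try (DirectoryStream<Path> children = Files.newDirectoryStream(src)) {
                for (Path child : children) {
                    copy(child, dst.resolve(child.getFileName().toString()));
                }
            }
        } else {
            // Plain file: a single copy call is enough.
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small source tree that loosely mirrors "mydir" from this issue.
        Path src = Files.createTempDirectory("mydir");
        Files.writeString(src.resolve("1.py"), "print(1)\n");
        Files.createDirectories(src.resolve("dir1"));
        Files.writeString(src.resolve("dir1").resolve("2.py"), "print(2)\n");

        Path dst = Files.createTempDirectory("filecache").resolve("mydir");
        copy(src, dst);
        System.out.println("copied to " + dst);
    }
}
```

A caller that passes a directory gets the full tree at the destination, which is the behaviour observed from the localizer in the test below.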
For testing purposes, I changed the distributedShell client so that it localizes an HDFS directory "mydir" directly.
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r = LocalResource.newInstance(
    URL.fromURI(p.toUri()),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    scFileStatus.getLen(),
    scFileStatus.getModificationTime());
localResources.put("mydir", r);
The YARN localizer does indeed download the HDFS directory to the local disk for DistributedShell.
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py 2.py dir1 test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_000001/ -l
total 20
lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
However, the YARN native service does not seem to be aware of this localizer ability and blocks it.
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.
We should enable this ability in the YARN native service.