Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.10.1
None
None
Description
When the job is submitted using run-job.sh, the package file is given to YARN. The job is then accepted, the container is created, and the package is unpacked and ready to execute.
However, the startContainer method (ContainerUtil:159) then tries to access the original package file.
    try {
        fileStatus = packagePath.getFileSystem(yarnConfiguration).getFileStatus(packagePath);
    } catch (IOException ioe) {
        log.error("IO Exception when accessing the package status from the filesystem", ioe);
        throw new SamzaException("IO Exception when accessing the package status from the filesystem");
    }
It does this only to set the file's length and modification time on the resource:
    packageResource.setSize(fileStatus.getLen());
    packageResource.setTimestamp(fileStatus.getModificationTime());
If these attributes (length and timestamp) are really needed, I think they could be captured and submitted by run-job.sh, which would avoid this issue.
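
A rough sketch of that idea, to make the proposal concrete. It is not existing Samza code: the class name PackageMetadata and the two config keys are hypothetical, and java.nio.file stands in for the Hadoop FileSystem call so the example is self-contained. The submitting side reads the length and modification time once, while the package is still reachable, and ships them as plain config values. The container side then only parses them and could pass the results to packageResource.setSize and packageResource.setTimestamp without touching the original file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class PackageMetadata {
    // Hypothetical config keys, invented for this sketch; not existing Samza settings.
    static final String SIZE_KEY = "yarn.package.size";
    static final String TIMESTAMP_KEY = "yarn.package.timestamp";

    private PackageMetadata() {}

    /** Submission side: read length and modification time while the package is reachable. */
    static Map<String, String> capture(Path packagePath) throws IOException {
        Map<String, String> config = new HashMap<>();
        config.put(SIZE_KEY, Long.toString(Files.size(packagePath)));
        config.put(TIMESTAMP_KEY,
                Long.toString(Files.getLastModifiedTime(packagePath).toMillis()));
        return config;
    }

    /** Container side: recover the length from config instead of the filesystem. */
    static long size(Map<String, String> config) {
        return Long.parseLong(config.get(SIZE_KEY));
    }

    /** Container side: recover the modification time (epoch milliseconds) from config. */
    static long timestamp(Map<String, String> config) {
        return Long.parseLong(config.get(TIMESTAMP_KEY));
    }
}
```

With something like this, startContainer would no longer need read access to the original package location; it would depend only on values already present in the job configuration.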