[YARN-9525] IFile format is not working against s3a remote folder - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.2
Fix Version/s: 3.3.0
Component/s: log-aggregation
Labels: None

Description

Using the IndexedFileFormat with yarn.nodemanager.remote-app-log-dir configured to an s3a URI throws the following exception during log aggregation:

Cannot create writer for app application_1556199768861_0001. Skip log upload this time. 
java.io.IOException: java.io.FileNotFoundException: No such file or directory: s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
	at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
	at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
	at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
	at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
	at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
	at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
	... 7 more

This stack trace points to LogAggregationIndexedFileController.initializeWriter, where we do the following steps (in a non-rolling log aggregation setup):

- create an FSDataOutputStream
- write out a UUID
- flush the stream
- immediately after that, call getFileStatus to get the length of the log file (the bytes we just wrote out). This is where the failure happens: the file is not there yet due to eventual consistency.
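
The sequence above can be replayed against a toy in-memory store. This is only a sketch: `ToyObjectStore` and `InitializeWriterSketch` are hypothetical names, no Hadoop or S3A classes are involved, and the store imitates the delayed visibility by publishing an object only when its stream is closed. That rule is a simplification picked to make the failure deterministic, not a description of how S3A works internally.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for an object store: an object is visible only after close(). */
class ToyObjectStore {
    private final Map<String, byte[]> objects = new HashMap<>();

    /** Returns a stream that buffers locally and publishes the object on close(). */
    OutputStream create(String key) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                objects.put(key, toByteArray());
                super.close();
            }
        };
    }

    /** Analogue of a file status lookup: fails if the object is not visible yet. */
    long getLength(String key) throws FileNotFoundException {
        byte[] data = objects.get(key);
        if (data == null) {
            throw new FileNotFoundException("No such file or directory: " + key);
        }
        return data.length;
    }
}

class InitializeWriterSketch {
    /** Replays the four steps; returns true if the lookup after flush() failed. */
    static boolean lookupFailsAfterFlush(ToyObjectStore store, String key) throws IOException {
        try (OutputStream out = store.create(key)) {   // step 1: create the stream
            byte[] uuid = UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8);
            out.write(uuid);                           // step 2: write out a UUID
            out.flush();                               // step 3: flush
            try {
                store.getLength(key);                  // step 4: ask the store for the length
                return false;
            } catch (FileNotFoundException e) {
                return true;                           // the object is not visible yet
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ToyObjectStore store = new ToyObjectStore();
        String key = "logs/app_0001/node_8041";        // made-up key for illustration
        System.out.println(lookupFailsAfterFlush(store, key)); // lookup while the stream is open
        System.out.println(store.getLength(key));      // lookup after the stream was closed
    }
}
```

In this toy model the lookup made while the stream is still open fails with the same exception type as in the stack trace, and the same lookup succeeds once the stream has been closed.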

Maybe we can get rid of that call, so that the IFile format can be used against an s3a target.
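
One way to avoid the status lookup, sketched here with plain `java.io` classes, is to count the bytes on the writer side instead of asking the remote store for the file length. This is an illustration of the idea only and is not taken from any of the attached patches; `CountingOutputStream` and `OffsetWithoutStatusCall` are hypothetical names. In Hadoop, `FSDataOutputStream.getPos()` reports a writer-side position in a similar spirit, but whether it fits this code path is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical wrapper that remembers how many bytes have passed through it. */
class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    /** Number of bytes written so far, known without any remote lookup. */
    long getCount() {
        return count;
    }
}

class OffsetWithoutStatusCall {
    /** Writes a UUID header and returns the current offset from the local counter. */
    static long writeHeaderAndGetOffset(OutputStream target) throws IOException {
        CountingOutputStream out = new CountingOutputStream(target);
        out.write(UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
        out.flush();
        return out.getCount();   // no file status lookup involved
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayOutputStream stands in for the remote stream in this sketch.
        System.out.println(writeHeaderAndGetOffset(new ByteArrayOutputStream()));
    }
}
```

The counter only reflects bytes written through this one stream, so it says nothing about data that already existed in the target file before the stream was opened.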

Attachments

- IFile-S3A-POC01.patch (20/May/19 15:51, 4 kB, Peter Bacsko)
- YARN-9525.002.patch (04/Jun/19 16:36, 5 kB, Adam Antal)
- YARN-9525.003.patch (05/Jun/19 07:26, 5 kB, Adam Antal)
- YARN-9525.004.patch (13/Jun/19 15:15, 5 kB, Adam Antal)
- YARN-9525.005.patch (14/Jun/19 08:27, 5 kB, Adam Antal)
- YARN-9525.006.patch (07/Jan/20 11:22, 7 kB, Adam Antal)
- YARN-9525.006.patch (06/Dec/19 16:38, 7 kB, Adam Antal)
- YARN-9525.007.patch (13/Jan/20 10:16, 7 kB, Adam Antal)
- YARN-9525-001.patch (20/May/19 16:36, 4 kB, Peter Bacsko)

People

Assignee: Adam Antal
Reporter: Adam Antal
Votes: 0
Watchers: 10

Dates

Created: 02/May/19 16:27
Updated: 20/Jan/20 12:08
Resolved: 20/Jan/20 12:08