Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 3.3.0
Fix Version/s: None
Component/s: None
Description
A big hadoop fs -copyFromLocal is showing that 404 caching is still happening.
20/02/13 01:02:18 WARN s3a.S3AFileSystem: Failed to find file s3a://dilbert/dogbert/queries_split_1/catberg.q.COPYING. Either it is not yet visible, or it has been deleted.
20/02/13 01:02:18 WARN s3a.S3AFileSystem: Failed to find file s3a://dilbert/dogbert/queries_split_1/catberg.q.COPYING. Either it is not yet visible, or it has been deleted.
We are recovering (good), but (a) it has got the people running this code worried and (b) it shouldn't be happening.
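For illustration only, a minimal sketch of the kind of existence probe that surfaces this warning, assuming the standard FileSystem API; the class name is hypothetical and the path is the anonymised one from the log above (the .COPYING suffix suggests the shell stages the upload under a temporary name before renaming it):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyingProbeSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Temporary destination used while the upload is in flight,
    // matching the anonymised path in the log above.
    Path tmp = new Path("s3a://dilbert/dogbert/queries_split_1/catberg.q.COPYING");
    FileSystem fs = FileSystem.get(tmp.toUri(), conf);

    // A getFileStatus/HEAD issued before the new object is visible can hit
    // S3's cached 404; S3A recovers on retry but logs the WARN each time.
    if (!fs.exists(tmp)) {
      System.out.println(tmp + " not yet visible (possibly a cached 404)");
    }
  }
}
{code}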
Proposed
- error message to point to a wiki link to a (new) doc on the topic.
- retry clause to increment a counter & if count > 1, report on the number of attempts and the duration (see the sketch after this list).
- S3A FS.deleteOnExit to avoid all checks
- and review the copyFromLocal to make sure no other probes are happening
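A minimal sketch of the second bullet, assuming it takes the shape of a caller-side helper (the class and method names below are hypothetical, not existing S3A or FsShell code): count the probe attempts and report the count and duration only when more than one attempt was needed.

{code:java}
import java.io.IOException;
import java.time.Duration;
import java.time.Instant;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical helper, not existing S3A code. */
public final class VisibilityProbe {
  private static final Logger LOG = LoggerFactory.getLogger(VisibilityProbe.class);

  private VisibilityProbe() {
  }

  /** Probe until the path is visible, counting attempts and total duration. */
  public static boolean waitUntilVisible(FileSystem fs, Path path,
      int maxAttempts, long sleepMillis) throws IOException, InterruptedException {
    Instant start = Instant.now();
    int attempts = 0;
    boolean visible = false;
    while (attempts < maxAttempts && !visible) {
      attempts++;
      visible = fs.exists(path);      // the probe that may see a cached 404
      if (!visible && attempts < maxAttempts) {
        Thread.sleep(sleepMillis);
      }
    }
    // Only report when more than one attempt was needed, per the proposal.
    if (attempts > 1) {
      LOG.info("Probed {} {} times over {}; visible={}",
          path, attempts, Duration.between(start, Instant.now()), visible);
    }
    return visible;
  }
}
{code}

On the third bullet: the base FileSystem.deleteOnExit(Path) performs its own exists() probe before registering the path, which appears to be the kind of check that item proposes S3A avoid.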