Details
- Sub-task
- Status: Resolved
- Minor
- Resolution: Won't Fix
- 3.3.0
- None
- None
Description
HADOOP-15999 changes S3Guard's non-authoritative mode: when S3Guard runs non-authoritative, every fs.getFileStatus call checks S3, because the MetadataStore is not treated as the single source of truth. This has a negative performance impact.
In other words, HADOOP-15999 reinstates the HEAD on every read, which makes non-auth S3Guard a bit slower. We could address that by moving the check into the input stream itself: the first GET which returns data would also act as the metadata check. The read context would then need a "metastoreProcessHeader" callback to invoke on the first GET.
The good news is that, because it's reading a file, it's only one HTTP HEAD request: there is no need to do either of the other two directory probes unless the file isn't there.
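The proposal could be sketched roughly as follows. This is a minimal, self-contained illustration, not the S3A implementation: `FakeObjectStore`, `ObjectHeader`, `ReadContext` and `LazyCheckingStream` are hypothetical stand-ins, and only the callback name `metastoreProcessHeader` comes from the description above. The stream issues no HEAD; the headers of the first successful GET are handed to the callback, which is assumed to reconcile the MetadataStore entry.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class FirstGetMetadataCheck {

  /** Metadata carried in the response headers of a HEAD or GET (illustrative subset). */
  static final class ObjectHeader {
    final String etag;
    final long length;
    ObjectHeader(String etag, long length) { this.etag = etag; this.length = length; }
  }

  /** Hypothetical in-memory stand-in for S3; counts requests per verb. */
  static final class FakeObjectStore {
    final Map<String, ObjectHeader> objects = new HashMap<>();
    int headRequests = 0;
    int getRequests = 0;

    /** The probe that the proposal wants to avoid on the read path. */
    ObjectHeader head(String key) throws FileNotFoundException {
      headRequests++;
      return lookup(key);
    }

    ObjectHeader get(String key) throws FileNotFoundException {
      getRequests++;
      return lookup(key);
    }

    private ObjectHeader lookup(String key) throws FileNotFoundException {
      ObjectHeader header = objects.get(key);
      if (header == null) {
        throw new FileNotFoundException(key);
      }
      return header;
    }
  }

  /** Hypothetical read context carrying the callback named in the description. */
  static final class ReadContext {
    final Consumer<ObjectHeader> metastoreProcessHeader;
    ReadContext(Consumer<ObjectHeader> callback) { this.metastoreProcessHeader = callback; }
  }

  /** Input stream stand-in: the first GET doubles as the metadata check. */
  static final class LazyCheckingStream {
    private final FakeObjectStore store;
    private final String key;
    private final ReadContext context;
    private boolean headerProcessed = false;

    LazyCheckingStream(FakeObjectStore store, String key, ReadContext context) {
      this.store = store;
      this.key = key;
      this.context = context;
    }

    /** Issues a GET; returns the object length as a stand-in for the data read. */
    long read() throws FileNotFoundException {
      ObjectHeader header = store.get(key);
      if (!headerProcessed) {
        // Only the first GET that returns data invokes the callback.
        context.metastoreProcessHeader.accept(header);
        headerProcessed = true;
      }
      return header.length;
    }
  }

  public static void main(String[] args) throws Exception {
    String key = "data/part-0000";
    FakeObjectStore s3 = new FakeObjectStore();
    s3.objects.put(key, new ObjectHeader("etag-2", 1024L));

    // Plain map standing in for the MetadataStore, holding a stale entry.
    Map<String, ObjectHeader> metastore = new HashMap<>();
    metastore.put(key, new ObjectHeader("etag-1", 512L));

    ReadContext context = new ReadContext(header -> metastore.put(key, header));
    LazyCheckingStream in = new LazyCheckingStream(s3, key, context);
    in.read();
    in.read();

    System.out.println("HEAD=" + s3.headRequests + " GET=" + s3.getRequests
        + " etag=" + metastore.get(key).etag);
    // prints "HEAD=0 GET=2 etag=etag-2"
  }
}
```

In this sketch a missing object surfaces as a `FileNotFoundException` from the GET; that is the case in which the description expects the additional directory probes to be needed.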
Issue Links
- is related to HADOOP-15999 S3Guard: Better support for out-of-band operations (Resolved)