Details
- Improvement
- Status: Resolved
- Minor
- Resolution: Fixed
- 3.3.0
- None
Description
HADOOP-16202 adds standard options for declaring the read/seek
policy when reading a file. These should be set to sequential IO
when localising resources, so that if the default/cluster settings
for a file system are optimized for random IO, artifact downloads
are still read at the maximum speed possible (one big GET to the EOF).
Most of this happens in hadoop-common, but some tuning of FSDownload
can assist:
- tar/jar download must also be sequential
- if the FileStatus is passed around, it can be used
  in the open request to skip checks when loading the file.
Together this can save 3 HEAD requests per resource, with the sequential
IO avoiding any splitting of the big read into separate block GETs.
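A minimal sketch of the options a localiser might assemble for such a download. It uses only the JDK so it runs without Hadoop on the classpath. The class `OpenFileOptionsSketch` and the method `sequentialOpenOptions` are hypothetical names for illustration. The two option keys are what HADOOP-16202 is understood to standardise and should be confirmed against the Hadoop release in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or FSDownload.
final class OpenFileOptionsSketch {

    // Assumed option names from HADOOP-16202; verify against your Hadoop version.
    static final String READ_POLICY = "fs.option.openfile.read.policy";
    static final String FILE_LENGTH = "fs.option.openfile.length";

    // Builds the options for a whole-file sequential read.
    // A negative knownLength means the length is unknown and is left out.
    static Map<String, String> sequentialOpenOptions(long knownLength) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(READ_POLICY, "sequential");
        if (knownLength >= 0) {
            options.put(FILE_LENGTH, Long.toString(knownLength));
        }
        return options;
    }

    public static void main(String[] args) {
        // Example: a 1024-byte resource whose length is already known.
        sequentialOpenOptions(1024L)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In Hadoop itself, each entry would likely be applied through the builder returned by `FileSystem.openFile(path)` using `opt(key, value)`, and an already-fetched FileStatus handed over with `withFileStatus(status)` where the release supports it, before calling `build()`.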
Issue Links
- depends upon: HADOOP-16202 Enhance openFile() for better read performance against object stores (Resolved)