[HADOOP-18391] Improve VectoredReadUtils#readVectored() for direct buffers - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.3.5
Fix Version/s: 3.3.5
Component/s: fs
Labels:
- pull-request-available

Description

Harden the VectoredReadUtils methods for consistent and more robust use, especially in filesystems that do not have their own implementation of the vectored read API.

VectoredReadUtils.readInDirectBuffer should allocate a temporary buffer with a maximum size, e.g. 4 MB, then do repeated reads and copies into the direct buffer. This ensures that you don't OOM when many threads are doing ranged requests. Other libraries do this.
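The bounded-buffer idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the Hadoop implementation: the class `ChunkedDirectRead`, the method `readInto`, and the constant `MAX_TEMP_BUFFER` are hypothetical names, and a plain `InputStream` stands in for the filesystem's positioned read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch: fill a (direct) buffer through a size-capped heap buffer. */
final class ChunkedDirectRead {

  /** Assumed cap on the temporary heap buffer; the issue suggests e.g. 4 MB. */
  static final int MAX_TEMP_BUFFER = 4 * 1024 * 1024;

  /**
   * Read exactly {@code length} bytes from {@code in} into {@code target},
   * never holding more than {@code maxChunk} bytes on the heap at once.
   */
  static void readInto(InputStream in, ByteBuffer target, int length, int maxChunk)
      throws IOException {
    if (length < 0 || maxChunk <= 0) {
      throw new IllegalArgumentException("length must be >= 0 and maxChunk > 0");
    }
    // One temporary array, reused for every chunk of this request.
    byte[] tmp = new byte[Math.min(length, maxChunk)];
    int remaining = length;
    while (remaining > 0) {
      int toRead = Math.min(remaining, tmp.length);
      int n = in.read(tmp, 0, toRead);
      if (n < 0) {
        throw new EOFException("Stream ended with " + remaining + " bytes still to read");
      }
      // Copy only the bytes actually read into the destination buffer.
      target.put(tmp, 0, n);
      remaining -= n;
    }
  }
}
```

With this shape, the heap cost of each concurrent request is at most `maxChunk` bytes regardless of the size of the requested range, which is the property the description asks for.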

readVectored should call validateNonOverlappingAndReturnSortedRanges before iterating over the ranges.

This ensures the abfs/s3a requirements are always met and, because the ranges will be read in order, that prefetching by other clients keeps their performance good.
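A sort-then-validate step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the code of `validateNonOverlappingAndReturnSortedRanges`: the `Range` type, the method name `sortAndValidate`, and the choice of `IllegalArgumentException` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: sort ranges by offset and reject overlaps. */
final class RangeValidation {

  /** Minimal stand-in for a file range: a start offset and a length. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Returns a sorted copy of the input, or throws if any two ranges overlap. */
  static List<Range> sortAndValidate(List<Range> input) {
    // Sort a copy so the caller's list is left untouched.
    List<Range> sorted = new ArrayList<>(input);
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));

    Range prev = null;
    for (Range r : sorted) {
      if (r.offset < 0 || r.length < 0) {
        throw new IllegalArgumentException("Negative offset or length");
      }
      // After sorting, an overlap can only occur with the immediate predecessor.
      if (prev != null && r.offset < prev.offset + prev.length) {
        throw new IllegalArgumentException(
            "Overlapping ranges at offsets " + prev.offset + " and " + r.offset);
      }
      prev = r;
    }
    return sorted;
  }
}
```

Iterating over the returned list then reads the file in ascending offset order, which is what lets sequential prefetching stay effective.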

readVectored should add special handling for 0-byte ranges.
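One plausible form of that special handling is to complete a 0-byte range immediately with an empty buffer and never touch the underlying stream. The sketch below assumes this behavior; `ZeroByteRangeSketch`, `PositionedReader`, and `readRange` are hypothetical names, and the non-empty path is done synchronously only to keep the example short.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

/** Hypothetical sketch: short-circuit 0-byte ranges before doing any I/O. */
final class ZeroByteRangeSketch {

  /** Stand-in for a positioned read that fills {@code dest} completely. */
  @FunctionalInterface
  interface PositionedReader {
    void readFully(long position, ByteBuffer dest) throws IOException;
  }

  static CompletableFuture<ByteBuffer> readRange(
      long offset, int length, IntFunction<ByteBuffer> allocate, PositionedReader reader) {
    if (length == 0) {
      // Nothing to read: hand back an empty buffer without calling the reader.
      return CompletableFuture.completedFuture(allocate.apply(0));
    }
    CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
    try {
      ByteBuffer buffer = allocate.apply(length);
      reader.readFully(offset, buffer);
      // Assumes the reader advanced the position; flip so callers can read the data.
      buffer.flip();
      result.complete(buffer);
    } catch (IOException e) {
      result.completeExceptionally(e);
    }
    return result;
  }
}
```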

Issue Links

relates to

HADOOP-11867 Add a high-performance vectored read API.

Resolved

links to

GitHub Pull Request #4787

People

Assignee: Mukund Thakur

Reporter: Steve Loughran

Votes: 0

Watchers: 4

Dates

Created: 04/Aug/22 17:14

Updated: 22/Dec/22 18:49

Resolved: 31/Aug/22 16:47