Details
- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version: 3.3.6
Description
Noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with separately.
- all seeks are lazy; no fetching is kicked off after an open
- the first read is treated as an out-of-order read, so it cancels any active prefetches (I don't think there are any at that point) and then only asks for 1 block:
{code:java}
if (outOfOrderRead) {
  LOG.debug("lazy-seek({})", getOffsetStr(readPos));
  blockManager.cancelPrefetches();

  // We prefetch only 1 block immediately after a seek operation.
  prefetchCount = 1;
}
{code}
- for any readFully() we should prefetch all the blocks covering the requested range (see the sketches after this list)
- for other reads, we may want a larger prefetch count than 1, depending on the split start/end and the file read policy (random, sequential, whole-file)
- also, if a read lands in a block other than the current one, but that block is already being fetched or cached, is it really an out-of-order read to the extent that outstanding prefetches should be cancelled? (see the sketches after this list)
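
To make the readFully and read-policy points concrete, here is a minimal sketch of how a prefetch count could be chosen. PrefetchPlanner, ReadPolicy and maxPrefetchBlocks are illustrative names for this issue, not the actual classes or options in the prefetching stream.

{code:java}
// Sketch only: choosing a prefetch count from the read policy,
// and covering the whole range for readFully(position, length).
enum ReadPolicy { RANDOM, SEQUENTIAL, WHOLE_FILE }

final class PrefetchPlanner {
  private final int blockSize;
  private final int maxPrefetchBlocks;

  PrefetchPlanner(int blockSize, int maxPrefetchBlocks) {
    this.blockSize = blockSize;
    this.maxPrefetchBlocks = maxPrefetchBlocks;
  }

  /** Blocks to request for a plain read, based on the file read policy. */
  int prefetchCountForRead(ReadPolicy policy) {
    switch (policy) {
      case RANDOM:
        return 1;                               // seek-heavy: keep it minimal
      case WHOLE_FILE:
        return maxPrefetchBlocks;               // reading everything: fill the pipeline
      case SEQUENTIAL:
      default:
        return Math.min(2, maxPrefetchBlocks);  // modest read-ahead
    }
  }

  /** Blocks covering a readFully(position, length) range, capped at the maximum. */
  int prefetchCountForReadFully(long position, int length) {
    long firstBlock = position / blockSize;
    long lastBlock = (position + length - 1) / blockSize;
    int blocks = (int) (lastBlock - firstBlock + 1);
    return Math.min(blocks, maxPrefetchBlocks);
  }
}
{code}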
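
And a sketch of a refined out-of-order check for the last point: only cancel outstanding prefetches when the target block is neither cached nor already being fetched. BlockManagerView and isFetchingOrCached() are hypothetical stand-ins; the real BlockManager may expose this state differently.

{code:java}
// Sketch only: don't treat a read as out-of-order if its block is
// already in flight or cached.
interface BlockManagerView {
  boolean isFetchingOrCached(int blockNumber);
  void cancelPrefetches();
}

final class SeekHandler {
  private final BlockManagerView blockManager;
  private final int blockSize;
  private int prefetchCount = 1;

  SeekHandler(BlockManagerView blockManager, int blockSize) {
    this.blockManager = blockManager;
    this.blockSize = blockSize;
  }

  /** Decide whether a read at readPos should cancel outstanding prefetches. */
  void onRead(long readPos, int currentBlock) {
    int targetBlock = (int) (readPos / blockSize);
    if (targetBlock == currentBlock) {
      return;                                   // ordinary in-order read
    }
    if (blockManager.isFetchingOrCached(targetBlock)) {
      // Data is already on its way (or cached): not really out of order,
      // so leave the outstanding prefetches alone.
      return;
    }
    // A genuine out-of-order read: cancel and restart at the new block.
    blockManager.cancelPrefetches();
    prefetchCount = 1;
  }
}
{code}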