Details
Description
The last split generated by FileInputFormat.getSplits takes its hosts from the last block of the file, i.e. blkLocations[blkLocations.length-1].
The last split may be larger than the rest (SPLIT_SLOP=1.1 by default) and span more than one block, in which case locality is picked up from the smaller trailing block.
e.g. a 1027MB file with a 128MB split size. The last split ends up being 131MB. The hosts for locality end up being the nodes containing the 3MB block instead of the 128MB block that holds most of the split's data.
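The arithmetic in the example can be checked with a small standalone sketch. It does not use Hadoop classes: SplitSlopDemo, computeSplits, blockIndex and lastBlockIndex are hypothetical names made up for illustration, the loop only approximates the slop check described above, and the block size is assumed to be equal to the 128MB split size.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only; not the actual FileInputFormat code.
public class SplitSlopDemo {
    // Slop factor from the description: the final split may be up to 10% oversized.
    static final double SPLIT_SLOP = 1.1;

    /** Returns {offset, length} pairs for a file of the given length. */
    static List<long[]> computeSplits(long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        long bytesRemaining = fileLength;
        // Cut full-size splits while the remainder exceeds splitSize * SPLIT_SLOP.
        while ((double) bytesRemaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileLength - bytesRemaining, splitSize});
            bytesRemaining -= splitSize;
        }
        // Whatever is left (possibly larger than splitSize) becomes the last split.
        if (bytesRemaining != 0) {
            splits.add(new long[] {fileLength - bytesRemaining, bytesRemaining});
        }
        return splits;
    }

    /** Zero-based index of the block that contains the given byte offset. */
    static int blockIndex(long offset, long blockSize) {
        return (int) (offset / blockSize);
    }

    /** Zero-based index of the final block of the file. */
    static int lastBlockIndex(long fileLength, long blockSize) {
        return (int) ((fileLength - 1) / blockSize);
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long fileLength = 1027 * MB;
        long blockSize = 128 * MB; // assumption: block size equals split size

        List<long[]> splits = computeSplits(fileLength, blockSize);
        long[] last = splits.get(splits.size() - 1);

        System.out.println("number of splits: " + splits.size());
        System.out.println("last split length (MB): " + last[1] / MB);
        System.out.println("block where last split starts: " + blockIndex(last[0], blockSize));
        System.out.println("block used for hosts (last block): " + lastBlockIndex(fileLength, blockSize));
    }
}
```

Under these assumptions the last split starts in the 128MB block at index 7, while the last block of the file (index 8) is the 3MB one, so the two indices differ exactly in the case the description reports.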
Attachments
Issue Links
- is related to: MAPREDUCE-3524 Scan benchmark is more than 1.5x slower in 0.23 (Resolved)