The last split generated by FileInputFormat.getSplits considers blkLocations.length-1 to be the hosts for the split.
The last split may be larger than the rest (SPLIT_SLOP=1.1 by default) - in which case locality is picked up from a smaller block.
e.g. 1027MB file with a 128MB split size. The last split ends up being 131MB. The hosts for locality end up being the nodes containing the 3MB block instead of the 128MB block.
|Status||Open [ 1 ]||Patch Available [ 10002 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Hadoop Flags||Reviewed [ 10343 ]|
|Release Note||Improved FileInputFormat to return better locality for the last split.|
|Fix Version/s||0.23.1 [ 12318883 ]|
|Resolution||Fixed [ 1 ]|
|Status||Resolved [ 5 ]||Closed [ 6 ]|