CombineFileInputFormat.getSplits creates splits with duplicate locations. It adds the locations of the files in the split to an ArrayList, so if all the files are at the same location, that location is added again and again. It should collect the locations in a Set rather than a List to avoid duplicates.
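A minimal sketch of the suggested change, not the actual Hadoop code or patch: the class SplitLocations and the method uniqueLocations are hypothetical names, and the input is assumed to be one host array per file or block in the split. It only illustrates collecting locations in a Set so that a repeated host is stored once.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of CombineFileInputFormat.
final class SplitLocations {

    // Assumption: each String[] holds the hosts reported for one file/block in the split.
    static String[] uniqueLocations(List<String[]> hostsPerBlock) {
        // LinkedHashSet drops duplicates and keeps first-seen order.
        Set<String> locations = new LinkedHashSet<>();
        for (String[] hosts : hostsPerBlock) {
            for (String host : hosts) {
                locations.add(host); // adding an already-present host is a no-op
            }
        }
        return locations.toArray(new String[0]);
    }
}
```

With an ArrayList in place of the Set, a split of many files that all live on the same host would carry that host once per file, which is the duplication described above.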
- is duplicated by
MAPREDUCE-4593 CombineFileInputFormat must ensure it doesn't dupe locations in its InputSplit objects
- relates to
HIVE-3387 meta data file size exceeds limit