Description
CombineFileInputFormat.getSplits creates splits with duplicate locations. It adds the locations of each file in the split to an ArrayList, so if all the files are on the same location, that location is added again and again. It should collect the locations in a Set rather than a List to avoid duplicates.
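A minimal sketch of the proposed change, assuming the locations arrive as one host array per file or block. The class SplitLocations and the method collectLocations are hypothetical names used only for illustration. They are not Hadoop's code and not the actual patch. A LinkedHashSet is one possible choice of Set, picked here because it keeps first-seen order.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of CombineFileInputFormat.
final class SplitLocations {

    // Collects the hosts of every file/block in a split, keeping each host once.
    // Using a Set instead of an ArrayList means a host shared by many files
    // is recorded a single time rather than once per file.
    static String[] collectLocations(List<String[]> hostsPerBlock) {
        Set<String> locations = new LinkedHashSet<>();
        for (String[] hosts : hostsPerBlock) {
            for (String host : hosts) {
                locations.add(host); // no-op if the host is already present
            }
        }
        return locations.toArray(new String[0]);
    }
}
```

With three blocks hosted on {host1, host2}, {host1}, and {host2, host1}, this returns two entries (host1, host2), where appending to a List would have produced five.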
Attachments
Issue Links
- is duplicated by: MAPREDUCE-4593 CombineFileInputFormat must ensure it doesn't dupe locations in its InputSplit objects (Resolved)
- relates to: HIVE-3387 meta data file size exceeds limit (Closed)