Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
- Affects Version/s: 0.6.4
- Fix Version/s: None
- Component/s: None
Description
I've created multiple input files based on the maximum task capacity of the cluster, but the job wasn't able to run, because file splits are currently determined by the number of blocks rather than the number of files.
I don't know why the code below was removed. What if we add it back?
// take the short circuit path if we have already partitioned
if (numSplits == files.length) {
  for (FileStatus file : files) {
    if (file != null) {
      splits.add(new FileSplit(file.getPath(), 0, file.getLen(), new String[0]));
    }
  }
  return splits.toArray(new FileSplit[splits.size()]);
}
https://www.mail-archive.com/commits@hama.apache.org/msg00319.html
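For context, here is a minimal sketch of where such a short-circuit path could sit inside a split-computation method. The surrounding class, method name, and input-directory handling are assumptions for illustration only, not the actual Hama code; only the short-circuit block itself comes from the snippet above (it uses the old org.apache.hadoop.mapred FileSplit API).

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical sketch: if the requested number of splits already equals the
// number of input files, emit exactly one split per file instead of
// re-partitioning the input by block.
public class OneSplitPerFileSketch {

  public FileSplit[] getSplits(JobConf job, Path inputDir, int numSplits)
      throws IOException {
    FileSystem fs = inputDir.getFileSystem(job);
    FileStatus[] files = fs.listStatus(inputDir);
    List<FileSplit> splits = new ArrayList<FileSplit>();

    // Short-circuit path: the input was already partitioned into one file
    // per desired task, so map each file to a single split covering it fully.
    if (numSplits == files.length) {
      for (FileStatus file : files) {
        if (file != null) {
          splits.add(new FileSplit(file.getPath(), 0, file.getLen(), new String[0]));
        }
      }
      return splits.toArray(new FileSplit[splits.size()]);
    }

    // Otherwise fall back to block-based splitting (not shown here).
    throw new UnsupportedOperationException("block-based splitting omitted");
  }
}

With this path in place, a job that pre-partitions its input into exactly as many files as there are tasks would get one split per file, regardless of how many blocks each file occupies.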