Description
Many of our classes extending InputFormat have
/* * This is to support multi-level/recursive directory listing until * MAPREDUCE-1577 is fixed. */ @Override protected List<FileStatus> listStatus(JobContext job) throws IOException { return MapRedUtil.getAllFileRecursively(super.listStatus(job), job.getConfiguration()); }
Now that we have dropped Hadoop 1.x, it can be optimized to
if (getInputDirRecursive(job)) { return super.listStatus(job); } else { /* * mapreduce.input.fileinputformat.input.dir.recursive is not true * by default for backward compatibility reasons. */ return MapRedUtil.getAllFileRecursively(super.listStatus(job), job.getConfiguration()); }
That would avoid one extra iteration when mapreduce.input.fileinputformat.input.dir.recursive is set to true by users.