Description
The base directory for partition files is derived from the parent directory of the input file. This breaks when the input consists of multiple files.
protected BSPJob partition(BSPJob job, int maxTasks) throws IOException {
  String inputPath = job.getConfiguration().get(Constants.JOB_INPUT_DIR);

  Path inputDir = new Path(inputPath);
  if (fs.isFile(inputDir)) {
    inputDir = inputDir.getParent();
  }

  Path partitionDir = new Path(inputDir + "/partitions");
  if (fs.exists(partitionDir)) {
    fs.delete(partitionDir, true);
  }
A simple fix is to create the partitions in a temporary directory instead, so the location no longer depends on the input path. For example, /tmp/hama-partitions/{$JOB_NAME}/
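
A minimal sketch of how such a per-job directory could be derived, assuming the base is /tmp/hama-partitions as suggested above. The class PartitionDirs and the method forJob are hypothetical names used only for illustration; they are not part of Hama. The sketch works on plain strings, and the result would still have to be wrapped in a Hadoop Path inside partition().

```java
// Hypothetical helper, not part of Hama: builds "<tmpBase>/<job name>".
final class PartitionDirs {
    private PartitionDirs() {}

    static String forJob(String tmpBase, String jobName) {
        if (tmpBase == null || tmpBase.isEmpty()) {
            throw new IllegalArgumentException("tmpBase must not be empty");
        }
        if (jobName == null || jobName.isEmpty()) {
            throw new IllegalArgumentException("jobName must not be empty");
        }
        // Assumption: job names may contain spaces, slashes or dots, so every
        // character outside [A-Za-z0-9_-] is replaced with an underscore.
        String safeName = jobName.replaceAll("[^A-Za-z0-9_-]", "_");
        // Drop one trailing slash so the result never contains "//".
        String base = tmpBase.endsWith("/")
                ? tmpBase.substring(0, tmpBase.length() - 1)
                : tmpBase;
        return base + "/" + safeName;
    }
}
```

With this sketch, forJob("/tmp/hama-partitions", "page rank") yields /tmp/hama-partitions/page_rank, whether the input is a single file or several. Two jobs with the same name would still share one directory, so a unique job ID may be a safer key than the job name.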