Details

Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 1.5.0
Fix Version/s: None
Component/s: None
Environment:
HBase Version: 1.2.0-cdh5.11.1 (the line that deletes the file still exists)
hadoop version:
Hadoop 2.6.0-cdh5.11.1
Subversion http://github.com/cloudera/hadoop -r b581c269ca3610c603b6d7d1da0d14dfb6684aa3
From source with checksum c6cbc4f20a8a571dd7c9f743984da1
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.11.1.jar
Labels: Patch, Important
Description
Hi team, we have a MapReduce job that uses the bulk load option instead of direct puts to import data, e.g.,
HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
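For context, here is a minimal sketch of how such a job is wired up (the table name example_table, the paths, and TsvMapper are placeholders for this report, not our actual job):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {

  // Hypothetical mapper: turns a tab-separated "rowkey<TAB>value" line into a Put.
  public static class TsvMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split("\t", 2);
      byte[] row = Bytes.toBytes(fields[0]);
      Put put = new Put(row);
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(fields[1]));
      ctx.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "bulkload-example");
    job.setJarByClass(BulkLoadDriver.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(TsvMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HFiles are written here

    TableName name = TableName.valueOf("example_table");
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(name);
         RegionLocator locator = conn.getRegionLocator(name)) {
      // Configures the reducer, TotalOrderPartitioner and the partitions file for the job.
      HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }
}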
However, we have been running into a situation where the partitions file is deleted by the termination of the JVM process: the JVM process kicks off the MapReduce job, but it is also waiting to run the `configureIncrementalLoad` that executes `configurePartitioner`. The tasks then fail with:
Error: java.lang.IllegalArgumentException: Can't read partitions file at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
We think line #827 of HFileOutputFormat2 could be the root cause:
fs.deleteOnExit(partitionsPath);
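To illustrate the semantics at play (a standalone sketch using plain Hadoop FileSystem calls, not HBase code; the path is made up): deleteOnExit ties the lifetime of the partitions file to the client-side FileSystem/JVM rather than to the MapReduce job, so the file can disappear before the tasks ever read it.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteOnExitDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path partitionsPath = new Path("/tmp/partitions_demo"); // made-up path

    try (FSDataOutputStream out = fs.create(partitionsPath)) {
      out.writeUTF("split points would go here");
    }

    // The file is removed when this client's FileSystem is closed (normally at JVM exit),
    // regardless of whether the MapReduce tasks have read it yet. If the client JVM goes
    // away first, TotalOrderPartitioner.setConf in the tasks fails with
    // "Can't read partitions file".
    fs.deleteOnExit(partitionsPath);
  }
}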
We have created a custom HFileOutputFormat that does not delete the partitions file, and this fixed the problem on our cluster. We propose adding a cleanup method that deletes the partitions file once all the mappers have finished (see the sketch below).
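For reference, a rough sketch of the shape of cleanup we have in mind (illustrative only, not a patch; the class and method names are made up, while TotalOrderPartitioner.getPartitionFile is the standard accessor for the configured partitions path): the driver deletes the partitions file only after the job has completed, instead of relying on deleteOnExit.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public final class PartitionsFileCleanup {
  private PartitionsFileCleanup() {}

  /**
   * Runs the job and removes the TotalOrderPartitioner partitions file afterwards,
   * once every mapper has already read it.
   */
  public static boolean runThenCleanup(Job job) throws Exception {
    boolean success = job.waitForCompletion(true);
    Configuration conf = job.getConfiguration();
    Path partitionsPath = new Path(TotalOrderPartitioner.getPartitionFile(conf));
    FileSystem fs = partitionsPath.getFileSystem(conf);
    fs.delete(partitionsPath, false); // safe now: no task will need it again
    return success;
  }
}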