Description
HFileOutputFormat.configureIncrementalLoad does not set all of the required io.serializations entries, so custom bulk loaders (like ImportTsv) need to set them themselves. Simple fix: just add what's missing.
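For anyone who needs a workaround in their own bulk loader, here is a rough sketch of the idea: append the HBase serialization classes to whatever io.serializations already contains. It uses only the JDK so it can be run on its own. The class name IoSerializationsSketch and the helper merge are made up for illustration, and the three HBase class names are my assumption about what needs to be registered; the description above does not list them, so check them against your HBase version.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of HBase or Hadoop.
class IoSerializationsSketch {

    // Hadoop configuration key holding the comma-separated serialization classes.
    static final String IO_SERIALIZATIONS = "io.serializations";

    // ASSUMPTION: these are the HBase serializations a bulk-load job needs.
    // They are not named in this ticket; verify against your HBase release.
    static final String[] HBASE_SERIALIZATIONS = {
        "org.apache.hadoop.hbase.mapreduce.MutationSerialization",
        "org.apache.hadoop.hbase.mapreduce.ResultSerialization",
        "org.apache.hadoop.hbase.mapreduce.KeyValueSerialization"
    };

    // Appends the extra class names to the existing comma-separated value,
    // keeping the original order and skipping blanks and duplicates.
    static String merge(String existing, String... extras) {
        Set<String> names = new LinkedHashSet<>();
        if (existing != null) {
            for (String name : existing.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        for (String extra : extras) {
            names.add(extra);
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        // Stand-in for the value already present in the job configuration.
        String current = "org.apache.hadoop.io.serializer.WritableSerialization";
        System.out.println(IO_SERIALIZATIONS + "=" + merge(current, HBASE_SERIALIZATIONS));
    }
}
```

In a real job the merged string would be written back with something like conf.set("io.serializations", merged) on the job's Configuration after calling configureIncrementalLoad. Keeping the existing entries matters, because overwriting the value would drop Hadoop's default Writable serialization.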
In case someone upgrades to an older RC that doesn't have this patch, this is what you'll see when running via YARN (or something similar on MR1):
2013-10-10 09:48:13,836 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:989)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:390)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:79)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:746)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1485)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
Kudos to jarcec for finding the issue.
Attachments
Issue Links
- is related to SQOOP-1032 Add the --bulk-load-dir option to support the HBase doBulkLoad function (Resolved)