Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
0.21.0
None
None
Hadoop Flags: Reviewed
Description
When running MultipleInputs against the new API, we get failures with this ClassCastException:
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit cannot be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:70)
    at org.apache.hadoop.mapreduce.lib.input.KeyValueLineRecordReader.initialize(KeyValueLineRecordReader.java:59)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:439)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:599)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:257)
The unit test for MultipleInputs doesn't actually run a job, so this slipped through while the unit test kept passing. The attached patch fixes the unit test to expose the failure and does a little casting kung-fu in LineRecordReader to avoid the error.
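For readers unfamiliar with the failure mode, here is a minimal sketch of how a wrapper split can break an unconditional downcast. The classes below (`InputSplit`, `FileSplit`, `TaggedSplit`, `SplitCastDemo`) are simplified, hypothetical stand-ins written for this illustration, not the Hadoop classes, and the unwrap-before-cast approach is only one plausible workaround, not necessarily what the attached patch does.

```java
// Simplified stand-ins for illustration only; these are NOT the Hadoop classes.
abstract class InputSplit {}

class FileSplit extends InputSplit {
    final String path;
    FileSplit(String path) { this.path = path; }
}

// A wrapper split that carries a delegate split (hypothetical, for this sketch).
class TaggedSplit extends InputSplit {
    private final InputSplit inner;
    TaggedSplit(InputSplit inner) { this.inner = inner; }
    InputSplit getInner() { return inner; }
}

class SplitCastDemo {
    // Mirrors the failing pattern: an unconditional downcast of the generic split.
    static String pathUnchecked(InputSplit split) {
        return ((FileSplit) split).path; // ClassCastException if split is a TaggedSplit
    }

    // Assumed workaround: peel off any wrapper layers before casting.
    static String pathUnwrapped(InputSplit split) {
        InputSplit s = split;
        while (s instanceof TaggedSplit) {
            s = ((TaggedSplit) s).getInner();
        }
        return ((FileSplit) s).path;
    }

    public static void main(String[] args) {
        InputSplit split = new TaggedSplit(new FileSplit("/data/part-0"));
        try {
            pathUnchecked(split);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getSimpleName());
        }
        System.out.println("unwrapped path: " + pathUnwrapped(split));
    }
}
```

The sketch also shows why a test that only builds the job configuration would not catch this: the cast happens when the record reader is initialized with a split, which requires actually running a map task.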