Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
None
-
None
-
None
Description
Pig stoped at progress 56%. It seemed there had been exceptions on the the reduce task trackers (see below). But waiting for 20 reduce tasks to time out themselves is excruciating and blocking my other tests.
Here is my Pig script:
define X `DataGuaranteeTest.pl -n 1` ship('/home/xu/streamingscript/DataGuaranteeTest.pl'); A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa); B = group A by name; C = foreach B generate flatten(A); D = stream C through X; store D into 'results_24';
Here is the exception on the reduce task trackers:
java.lang.RuntimeException: java.io.IOException: Cannot run program "./home/xu/streamingscript/DataGuaranteeTest.pl": java.io.IOException: error=2, No such file or directory at org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.(StreamSpec.java:132) at org.apache.pig.impl.eval.StreamSpec.setupDefaultPipe(StreamSpec.java:91) at org.apache.pig.impl.eval.CompositeEvalSpec.setupDefaultPipe(CompositeEvalSpec.java:51) at org.apache.pig.impl.eval.EvalSpec.setupPipe(EvalSpec.java:123) at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupReducePipe(PigMapReduce.java:303) at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:140) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
I will attach DataGuaranteeTest.pl to the report