Description
The following script fail to run:
rmf ooo a = load 'student.txt' as (name:chararray, age:int, gpa:double); b = filter a by age > 65; c = filter a by age <=10; d = union b, c; e = join a by name left, d by name; store e into 'ooo';
Exception stack:
Caused by: java.lang.IllegalArgumentException: Edge [scope-43 : org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor] -> [scope-55 : org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor] ({ SCATTER_GATHER : org.apache.tez.runtime.library.input.OrderedGroupedKVInput >> PERSISTED >> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput >> NullEdgeManager }) already defined! at org.apache.tez.dag.api.DAG.addEdge(DAG.java:272) at org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder.visitTezOp(TezDagBuilder.java:311) at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:252) at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:56) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.buildDAG(TezJobCompiler.java:65) at org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler.getJob(TezJobCompiler.java:111) ... 20 more
Disable pig.tez.opt.union the script runs fine.
Seems we shall detect this patten and disallow merge vertex group into a pair already has an edge.