Description
MultiQueryOptimizerTez.java
// Detect diamond shape, we cannot merge it into split, since Tez
// does not handle double edge between vertexes
boolean sharedSucc = false;
if (getPlan().getSuccessors(successor)!=null) {
    for (TezOperator succ_successor : getPlan().getSuccessors(successor)) {
        if (succ_successors.contains(succ_successor)) {
            sharedSucc = true;
            break;
        }
    }
    succ_successors.addAll(getPlan().getSuccessors(successor));
}
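As a rough illustration only (this is not the Pig code), the same shared-successor check can be sketched on a plain adjacency map. The class DiamondCheck, the method hasSharedSuccessor, and the String vertex names are hypothetical stand-ins for TezOperator and the Tez plan:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: vertices are Strings, the plan is an adjacency map.
public class DiamondCheck {

    // Returns true if two different successors of `splitter` share a successor,
    // i.e. the plan contains a diamond rooted at `splitter`.
    static boolean hasSharedSuccessor(Map<String, List<String>> plan, String splitter) {
        Set<String> seen = new HashSet<>();
        for (String successor : plan.getOrDefault(splitter, Collections.emptyList())) {
            List<String> next = plan.getOrDefault(successor, Collections.emptyList());
            for (String s : next) {
                if (seen.contains(s)) {
                    return true; // already reached through an earlier successor
                }
            }
            seen.addAll(next);
        }
        return false;
    }

    public static void main(String[] args) {
        // A splits into B and C, and both feed D: a diamond.
        Map<String, List<String>> diamond = new HashMap<>();
        diamond.put("A", Arrays.asList("B", "C"));
        diamond.put("B", Arrays.asList("D"));
        diamond.put("C", Arrays.asList("D"));
        System.out.println(hasSharedSuccessor(diamond, "A")); // prints "true"

        // A splits into B and C, which feed different vertices: no diamond.
        Map<String, List<String>> fork = new HashMap<>();
        fork.put("A", Arrays.asList("B", "C"));
        fork.put("B", Arrays.asList("D"));
        fork.put("C", Arrays.asList("E"));
        System.out.println(hasSharedSuccessor(fork, "A")); // prints "false"
    }
}
```

In the diamond case, merging both branches into the splitting vertex would leave two edges between the same pair of vertices, which is the situation the check guards against.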
SPLIT A INTO B if <condition>, C if <condition>;
D = JOIN B by x, C by x;
For the script above, we would like to generate the following plan:
V1 - Split (B -> V2, C -> V2)
V2 - Join B and C
Without the check for shared successors, the plan above is created, but B and C then produce two separate edges between V1 and V2, which Tez does not support. Because of the check, the splits are not fully merged into the POSplit, and we currently get:
V1 - Split ( B-> V3, C-> V2 with just POValueOutputTez)
V2 - LocalRearrange and -> V3
V3 - Join B and C
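We need to remove the check, merge both branches into the POSplit, and fix this case so that B and C both write to the same edge. Merging more aggressively in the multi-query optimizer avoids the extra intermediate vertex (V2 in the current plan) and should improve performance.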