Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-3839 Umbrella jira for Pig on Tez Performance Improvements
  3. PIG-3876

Handle two outputs from split going to same input in MultiQueryOptimizer

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • None
    • 0.15.0
    • tez
    • None

    Description

      MultiQueryOptimizerTez.java

      // Detect diamond shape, we cannot merge it into split, since Tez
                      // does not handle double edge between vertexes
                      boolean sharedSucc = false;
                      if (getPlan().getSuccessors(successor)!=null) {
                          for (TezOperator succ_successor : getPlan().getSuccessors(successor)) {
                              if (succ_successors.contains(succ_successor)) {
                                  sharedSucc = true;
                                  break;
                              }
                          }
                          succ_successors.addAll(getPlan().getSuccessors(successor));
                      }
      

      SPLIT A INTO B if <condition>, C if <condition>;
      D = JOIN B by x, C by x;

      We would like to do
      V1 - Split (B -> V2, C -> V2)
      V2 - Join B and C

      Without the check for shared successors, above plan is created but B and C create two separate edges between V1 and V2 which is not supported by Tez. Since the splits are not merged into POSplit fully, we currently have

      V1 - Split ( B-> V3, C-> V2 with just POValueOutputTez)
      V2 - LocalRearrange and -> V3
      V3 - Join B and C

      We need to remove the check and merge them into the POSplit and fix this case to make B and C both write to same edge. Being more aggressive in multi-query increases performance.

      Attachments

        Issue Links

          Activity

            People

              rohini Rohini Palaniswamy
              rohini Rohini Palaniswamy
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: