PIG-3839: Umbrella jira for Pig on Tez Performance Improvements


    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: tez
    • Labels:


      Separating out performance improvements from PIG-3446, which is the main jira for Pig on Tez.

        Issue Links

        1. Improve performance of replicate-join Sub-task Open Unassigned
        2. Improve performance of union Sub-task Closed Rohini Palaniswamy
        3. Use unsorted shuffle in Orderby, Skewed Join to improve performance in Tez Sub-task Open Rohini Palaniswamy
        4. Use shared edge with no multiquery Sub-task Open Unassigned
        5. Implement automatic reducer parallelism Sub-task Closed Daniel Dai
        6. Sort avoidance for group by and join Sub-task Open Unassigned
        7. Dynamically switch to replicate join Sub-task Open Unassigned
        8. Integrate YSmart into Pig on tez Sub-task Open Unassigned
        9. Optimize join followed by order by using same key Sub-task Open Unassigned
        10. Simplify plan of Limit on Tez Sub-task Open Unassigned
        11. UnionOptimizer in Tez should optimize the case of replicated join Sub-task Open Unassigned
        12. Handle two outputs from split going to same input in MultiQueryOptimizer Sub-task Resolved Rohini Palaniswamy
        13. Improve parallelism of union and join Sub-task Resolved Rohini Palaniswamy
        14. Use unsorted shuffle in Union Sub-task Closed Rohini Palaniswamy
        15. Improve performance of Limit following an Orderby on Tez Sub-task Open Unassigned
        16. Limit reduce task should start as soon as one map task finishes Sub-task Closed Rohini Palaniswamy
        17. Broadcast the index file in case of POMergeCoGroup and POMergeJoin Sub-task Open Unassigned
        18. Rework Hash based aggregation for Tez Sub-task Open Unassigned
        19. 1-1 edge vertices should use same jvm opts Sub-task Closed Rohini Palaniswamy
        20. Size estimation should be done in sampler instead of sample aggregator Sub-task Open Unassigned
        21. Enhance Tez AM size estimation Sub-task Open Unassigned
        22. Better multi-query planning in case of multiple edges Sub-task Resolved Rohini Palaniswamy
        23. Eliminate identity vertex for order by and skewed join right after LOAD Sub-task Open Unassigned


          Rohini Palaniswamy created issue -
          Rohini Palaniswamy made changes -
          Field | Original Value | New Value
          Link | (none) | This issue is related to PIG-3446 [ PIG-3446 ]
          Daniel Dai made changes -
          Component/s | (none) | tez [ 12321016 ]
          Daniel Dai made changes -
          Fix Version/s | tez-branch [ 12324968 ] | 0.14.0 [ 12326954 ]
          Daniel Dai made changes -
          Fix Version/s | 0.14.0 [ 12326954 ] | 0.15.0 [ 12328760 ]
          Daniel Dai made changes -
          Fix Version/s | 0.15.0 [ 12328760 ] | 0.16.0 [ 12332168 ]


            • Assignee:
              Rohini Palaniswamy
            • Votes:
              0
            • Watchers:
              5


              • Created: