Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: tez-branch
    • Fix Version/s: tez-branch
    • Component/s: tez
    • Labels: None

      Description

      This is an umbrella JIRA for Pig on Tez. More detailed subtasks will be added.

      More information can be found on the following wiki page:
      https://cwiki.apache.org/confluence/display/PIG/Pig+on+Tez

      How to set up your development environment:

      1. Check out Tez trunk.
      2. Install protobuf 2.5.0.
      3. Build Tez against Hadoop 2.2.0. (By default, it builds against Hadoop trunk, which is 3.0.0.)
      4. Install the Tez jars in your local Maven repository with "mvn install -DskipTests".
      5. Check out the Pig Tez branch.
      6. Build Pig by running "ant jar-withouthadoop".
      7. Set up a single-node (or multi-node) Hadoop 2.2 cluster.
      8. Install Tez following the instructions on the Tez homepage.
      9. Run Pig with the "-x tez" option.
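
The build and run steps above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the function names are hypothetical, the `-Dhadoop.version` property is assumed to be what the Tez build reads to select the Hadoop release, and the `pig` launcher is assumed to be on the PATH. Each function prints a command instead of running it, so the sequence can be reviewed before anything is executed.

```shell
#!/bin/sh
# Dry-run sketch of steps 3-4, 6 and 9: each function prints a command
# instead of executing it. Function names are hypothetical.

# Hadoop release to build Tez against (step 3).
HADOOP_VERSION="${HADOOP_VERSION:-2.2.0}"

# Steps 3-4: build Tez and install its jars in the local Maven repository.
# ASSUMPTION: the Tez build selects the Hadoop release via -Dhadoop.version.
tez_build_cmd() {
    printf 'mvn install -DskipTests -Dhadoop.version=%s\n' "$HADOOP_VERSION"
}

# Step 6: build the Pig jar without bundling Hadoop.
pig_build_cmd() {
    printf 'ant jar-withouthadoop\n'
}

# Step 9: run a Pig script on Tez. $1 is a placeholder script path.
pig_run_cmd() {
    printf 'pig -x tez %s\n' "$1"
}

# Print the whole sequence for one script.
print_setup_plan() {
    tez_build_cmd
    pig_build_cmd
    pig_run_cmd "$1"
}
```

Calling `print_setup_plan myscript.pig` (with a placeholder script name) lists the three commands in order. The Tez command belongs in the Tez checkout and the ant command in the Pig checkout.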

      How to run Tez tests:

      • unit tests
        ant test-tez

        In the tez branch, exectype defaults to tez and hadoopversion defaults to 23. To run the unit tests in mr mode instead:

        ant test -Dexectype=mr -Dhadoopversion=20

      • e2e tests
        ant -Dharness.old.pig=$PIG_HOME -Dharness.hadoop.home=$HADOOP_HOME -Dharness.cluster.conf=$HADOOP_CONF -Dharness.cluster.bin=$HADOOP_BIN test-e2e-tez -Dhadoopversion=23
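
The test commands above can be wrapped in two helper functions, sketched below. The function names are hypothetical and the helpers only print the ant command line. They use the targets and properties quoted above, and the e2e helper refuses to print a command when one of the four environment variables it needs is unset.

```shell
#!/bin/sh
# Sketch: print the ant command line for the unit tests or the e2e tests.
# Function names are hypothetical; targets and properties are the ones
# quoted in this issue.

# $1: exectype ("tez" or "mr"); $2: hadoopversion for mr mode (default 20).
unit_test_cmd() {
    case "$1" in
        tez) printf 'ant test-tez\n' ;;
        mr)  printf 'ant test -Dexectype=mr -Dhadoopversion=%s\n' "${2:-20}" ;;
        *)   printf 'unknown exectype: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Print the e2e command, failing early if a required variable is unset.
e2e_test_cmd() {
    for name in PIG_HOME HADOOP_HOME HADOOP_CONF HADOOP_BIN; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf 'missing environment variable: %s\n' "$name" >&2
            return 1
        fi
    done
    printf 'ant -Dharness.old.pig=%s -Dharness.hadoop.home=%s -Dharness.cluster.conf=%s -Dharness.cluster.bin=%s test-e2e-tez -Dhadoopversion=23\n' \
        "$PIG_HOME" "$HADOOP_HOME" "$HADOOP_CONF" "$HADOOP_BIN"
}
```

For example, `unit_test_cmd mr` prints the mr-mode command, and `e2e_test_cmd` prints the e2e command once PIG_HOME, HADOOP_HOME, HADOOP_CONF and HADOOP_BIN are all set.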
        

        Issue Links

        1. Tez backend layout Sub-task Resolved Cheolsoo Park
        2. Move JobCreationException to org.apache.pig.backend.hadoop.executionengine Sub-task Resolved Cheolsoo Park
        3. Add a base abstract class for ExecutionEngine Sub-task Resolved Cheolsoo Park
        4. Initial implementation of TezCompiler Sub-task Resolved Cheolsoo Park
        5. Initial implementation of TezJobControlCompiler Sub-task Resolved Cheolsoo Park
        6. Initial implementation of TezLauncher Sub-task Resolved Cheolsoo Park
        7. Initial implementation of TezStats Sub-task Resolved Cheolsoo Park
        8. Initial Implementation of PigProcessor Sub-task Resolved Mark Wagner
        9. Bump hadoop version to 2.2.0 Sub-task Resolved Cheolsoo Park
        10. Allow PigProcessor to handle multiple inputs Sub-task Resolved Mark Wagner
        11. Add TezMiniCluster for unit tests Sub-task Resolved Cheolsoo Park
        12. Empty plan fails to run Sub-task Resolved Daniel Dai
        13. Make register work Sub-task Resolved Daniel Dai
        14. Make order by work Sub-task Resolved Daniel Dai
        15. Add test-tez target to build.xml Sub-task Resolved Cheolsoo Park
        16. Make distinct work Sub-task Resolved Alex Bain
        17. Make limit work Sub-task Resolved Alex Bain
        18. Pig should be able to submit multiple DAG Sub-task Resolved Daniel Dai
        19. e2e test for tez Sub-task Resolved Daniel Dai
        20. Add diagnostic information to TezStats Sub-task Resolved Cheolsoo Park
        21. Fix tez Checkin_2 Sub-task Resolved Daniel Dai
        22. Initial implementation of combiner optimization Sub-task Resolved Cheolsoo Park
        23. Fix tez branch compilation with Hadoop 1.0 Sub-task Resolved Cheolsoo Park
        24. Implement optimizations for LIMIT Sub-task Resolved Alex Bain
        25. UniqueTez staging dir should be used for different users Sub-task Resolved Daniel Dai
        26. Implement combiner optimizations for DISTINCT Sub-task Resolved Alex Bain
        27. Make Tez work with security Sub-task Resolved Rohini Palaniswamy
        28. Make split work with Tez Sub-task Resolved Rohini Palaniswamy
        29. Make union work with tez Sub-task Resolved Cheolsoo Park
        30. Fix dependencies in ivy.xml Sub-task Resolved Cheolsoo Park
        31. Port Package refactoring to Tez branch Sub-task Resolved Mark Wagner
        32. Fix e2e Operator_1, 5, Checkin_3, and Join_1 Sub-task Resolved Cheolsoo Park
        33. Move POSimpleTezLoad under tez package Sub-task Resolved Cheolsoo Park
        34. Tear down TezSessions when Pig exits Sub-task Resolved Rohini Palaniswamy
        35. Add counters to TezStats Sub-task Resolved Cheolsoo Park
        36. Implement replicated join in Tez Sub-task Resolved Cheolsoo Park
        37. Fix e2e tests Operators_3, Operators_5 Sub-task Resolved Daniel Dai
        38. Add order by string, descending order e2e tests Sub-task Resolved Daniel Dai
        39. Replace broadcast edges with scatter/gather edges in union Sub-task Resolved Cheolsoo Park
        40. TezCompiler adds duplicate predecessors of blocking operators to TezPlan Sub-task Resolved Rohini Palaniswamy
        41. Fix intermittent test failure Join_1 Sub-task Resolved Daniel Dai
        42. Make combiners, custom partitioners and secondary key sort work for multiple outputs Sub-task Resolved Rohini Palaniswamy
        43. Implement STREAM in Tez Sub-task Resolved Alex Bain
        44. Improve performance of order-by Sub-task Resolved Daniel Dai
        45. Make accumulator UDF work in Tez Sub-task Resolved Cheolsoo Park
        46. Implement skewed join in Tez Sub-task Resolved Cheolsoo Park
        47. Implement merge join in Tez Sub-task Resolved Daniel Dai
        48. Use Tez ObjectRegistry to cache FRJoin map and WeightedRangePartitioner map Sub-task Resolved Rohini Palaniswamy
        49. Memory management for each vertex Sub-task Open Unassigned
        50. TEZ-41 break pig-tez Sub-task Resolved Daniel Dai
        51. Fix store after load Sub-task Resolved Daniel Dai
        52. Implement user defined comparators for order by in Tez Sub-task Open Rohini Palaniswamy
        53. Fix skewed join e2e tests Sub-task Resolved Cheolsoo Park
        54. Change tez version dependency as a result of TEZ-739 Sub-task Resolved Hitesh Shah
        55. Fix split + skewed join Sub-task Resolved Rohini Palaniswamy
        56. Fix TestSkewedJoin in tez mode Sub-task Resolved Cheolsoo Park
        57. Use ONE_TO_ONE edge and IdentityInOut in orderby intermediate vertex Sub-task Resolved Rohini Palaniswamy
        58. Set MR runtime settings on tez runtime Sub-task Resolved Rohini Palaniswamy
        59. Use VertexGroup and Alias vertex for union Sub-task Resolved Cheolsoo Park
        60. Support for multiquery off in Tez Sub-task Resolved Rohini Palaniswamy
        61. Generating Splits in Tez should be configurable to AM or client Sub-task Open Rohini Palaniswamy
        62. Add support for non-Java UDF's Sub-task Resolved Alex Bain
        63. Make scalar work Sub-task Resolved Daniel Dai
        64. Fix desc order by in Tez Sub-task Resolved Daniel Dai
        65. CombinerOptimizer should not optimize cogroup case in tez Sub-task Resolved Daniel Dai
        66. Outer join fail on tez Sub-task Resolved Daniel Dai
        67. Properties aren't propagated to edges or vertices in Tez Sub-task Resolved Mark Wagner
        68. Use ONE_TO_ONE edge and IdentityInOut in skewed join intermediate vertex Sub-task Resolved Rohini Palaniswamy
        69. Work with TEZ-668 which allows starting and closing of inputs and outputs Sub-task Resolved Rohini Palaniswamy
        70. TezCompiler.visitUnion() doesn't add compiled TezOp to phyToTezOpMap Sub-task Resolved Cheolsoo Park
        71. Scripting UDF is broken after PIG-3629 Sub-task Resolved Daniel Dai
        72. Tez mini cluster tests run for a very long time with TezSession reuse on Sub-task Resolved Cheolsoo Park
        73. Fix TestTezCompiler#testReplicatedJoinInReducer Sub-task Resolved Cheolsoo Park
        74. Port more mini cluster tests to Tez Sub-task Resolved Cheolsoo Park
        75. TezResourceManager should not be a singleton Sub-task Resolved Daniel Dai
        76. POReservoirSample should handle endOfAllInput flag Sub-task Resolved Daniel Dai
        77. Multiquery with FRJoin fail Sub-task Resolved Daniel Dai
        78. NPE when POStream is not in the leaf vertex Sub-task Resolved Daniel Dai
        79. tuple in POStream binaryInputQueue keep changing Sub-task Resolved Daniel Dai
        80. Several changes in Tez e2e Sub-task Resolved Daniel Dai
        81. Port more mini cluster tests to Tez - part2 Sub-task Resolved Cheolsoo Park
        82. POValueInputTez should handle getNextTuple even after reader.next() returns null Sub-task Resolved Daniel Dai
        83. Parallelism specified by user is not honored if default parallelism is set to a higher value Sub-task Resolved Cheolsoo Park
        84. Fix some memory leaks affecting container reuse Sub-task Resolved Rohini Palaniswamy
        85. TestCustomPartitioner is broken in tez branch Sub-task Resolved Cheolsoo Park
        86. POPoissonSample should handle endOfAllInput flag Sub-task Resolved Daniel Dai
        87. Implement CROSS in Tez Sub-task Resolved Rohini Palaniswamy
        88. Implement RANK in Tez Sub-task Resolved Rohini Palaniswamy
        89. Remove reference to BroadcastKVReader as it is removed in TEZ-911 Sub-task Resolved Rohini Palaniswamy
        90. Make custom counter work Sub-task Resolved Daniel Dai
        91. Implement mapside cogroup in Tez Sub-task Open Unassigned
        92. Organize tez code into subpackages Sub-task Open Unassigned
        93. Honor Mapreduce Distributed Cache settings and localize resources in Tez Sub-task Resolved Rohini Palaniswamy
        94. Pig on tez job hangs when AM has a failure and Multiquery fixes Sub-task Resolved Rohini Palaniswamy
        95. Pig script encounters error with Tez MemoryDistributor Sub-task Resolved Unassigned
        96. Remove SecurityHelper class for Tez and use Tez helpers instead Sub-task Open Unassigned
        97. PigStatusReporter.getInstance().getCounter() returns null in Tez mode Sub-task Open Cheolsoo Park
        98. e2e test Rank_9 fail Sub-task Resolved Daniel Dai
        99. e2e tests run all tests even execonly flag does not match Sub-task Resolved Daniel Dai
        100. Fix MergeJoin_8 failure Sub-task Resolved Daniel Dai
        101. UdfDistributedCache_1 fails in tez branch Sub-task Open Unassigned
        102. Global sort is not working (order by) Pig over Tez Sub-task Open Unassigned
        103. Hash join followed by replicated join fails in Tez mode Sub-task Resolved Cheolsoo Park
        104. Fix memory leak with PigTezLogger Sub-task Patch Available Rohini Palaniswamy
         

            People

            • Assignee: Cheolsoo Park
            • Reporter: Cheolsoo Park
            • Votes: 0
            • Watchers: 21
