Pig / PIG-3446

Umbrella jira for Pig on Tez

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tez-branch
    • Fix Version/s: 0.14.0
    • Component/s: tez
    • Labels: None

    Description

      This is an umbrella JIRA for Pig on Tez. More detailed sub-tasks will be added.

      More information can be found on the following wiki page:
      https://cwiki.apache.org/confluence/display/PIG/Pig+on+Tez

      How to set up your development environment:

      1. Check out Tez trunk.
      2. Install protobuf 2.5.0.
      3. Build Tez against Hadoop 2.2.0. (By default, it builds against Hadoop trunk, which is 3.0.0.)
      4. Install the Tez jars in your local Maven repository with "mvn install -DskipTests".
      5. Check out the Pig Tez branch.
      6. Build Pig by running "ant jar-withouthadoop".
      7. Set up a single-node (or multi-node) Hadoop 2.2 cluster.
      8. Install Tez following the instructions on the Tez homepage.
      9. Run Pig with the "-x tez" option.
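
      The build-related steps above (2 to 6, plus 9) could be sketched as a small dry-run script. This is only an illustration under stated assumptions: TEZ_SRC and PIG_SRC are hypothetical checkout locations, the helper function names are made up for this sketch, and the -Dhadoop.version property is assumed to be how the Tez build selects its Hadoop version. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Dry-run sketch of steps 2-6 and 9. Loading this file only defines
# functions; nothing is executed until a function is called.
# TEZ_SRC and PIG_SRC are hypothetical checkout locations.
DRY_RUN="${DRY_RUN:-1}"
TEZ_SRC="${TEZ_SRC:-$HOME/src/tez}"
PIG_SRC="${PIG_SRC:-$HOME/src/pig-tez}"

# Print the command when DRY_RUN=1, otherwise run it inside the directory.
run_in() {
    dir=$1
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Step 2: succeeds only if the given `protoc --version` output reports 2.5.0.
protoc_ok() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Steps 3-4: build Tez against Hadoop 2.2.0 and install the jars locally.
# Assumption: the Tez pom reads the Hadoop version from -Dhadoop.version.
build_tez() {
    run_in "$TEZ_SRC" mvn install -DskipTests -Dhadoop.version=2.2.0
}

# Step 6: build Pig without bundling Hadoop.
build_pig() {
    run_in "$PIG_SRC" ant jar-withouthadoop
}

# Step 9: run a Pig script (the path is a placeholder) in Tez mode.
run_pig_on_tez() {
    run_in "$PIG_SRC" bin/pig -x tez "$1"
}
```

      With DRY_RUN=0, calling protoc_ok "$(protoc --version)" && build_tez && build_pig would perform the builds for real. Cluster setup and the Tez installation (steps 7 and 8) are left out because they depend on the target environment.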

      How to run Tez tests:

      • unit test
        ant test-tez

        On the tez branch, exectype defaults to tez and hadoopversion defaults to 23. To run the unit tests in mr mode instead, override both properties:

        ant test -Dexectype=mr -Dhadoopversion=20
        
      • e2e tests
        ant -Dharness.old.pig=$PIG_HOME -Dharness.hadoop.home=$HADOOP_HOME -Dharness.cluster.conf=$HADOOP_CONF -Dharness.cluster.bin=$HADOOP_BIN test-e2e-tez -Dhadoopversion=23
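
      The test invocations above can be wrapped in helpers that assemble the command line and refuse to proceed when an environment variable used by the e2e harness is missing. This is a minimal sketch; the function names unit_test_cmd and e2e_cmd are hypothetical, and the commands themselves are copied from the description.

```shell
#!/bin/sh
# Print the unit-test command for a given exectype ("tez" or "mr").
unit_test_cmd() {
    case "$1" in
        tez) echo "ant test-tez" ;;
        mr)  echo "ant test -Dexectype=mr -Dhadoopversion=20" ;;
        *)   echo "unknown exectype: $1" >&2; return 1 ;;
    esac
}

# Print the e2e command; fail if any variable it expands is unset or empty.
e2e_cmd() {
    for name in PIG_HOME HADOOP_HOME HADOOP_CONF HADOOP_BIN; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            return 1
        fi
    done
    echo "ant -Dharness.old.pig=$PIG_HOME -Dharness.hadoop.home=$HADOOP_HOME -Dharness.cluster.conf=$HADOOP_CONF -Dharness.cluster.bin=$HADOOP_BIN test-e2e-tez -Dhadoopversion=23"
}
```

      Checking the variables first avoids launching the harness with empty -Dharness.* values, which would otherwise only surface as a failure later in the run.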
        

        Issue Links

        #. Summary | Type | Status | Assignee
        1. Move JobCreationException to org.apache.pig.backend.hadoop.executionengine | Sub-task | Closed | Cheolsoo Park
        2. Tez backend layout | Sub-task | Closed | Cheolsoo Park
        3. Add a base abstract class for ExecutionEngine | Sub-task | Closed | Cheolsoo Park
        4. Initial implementation of TezCompiler | Sub-task | Closed | Cheolsoo Park
        5. Initial implementation of TezJobControlCompiler | Sub-task | Closed | Cheolsoo Park
        6. Initial implementation of TezLauncher | Sub-task | Closed | Cheolsoo Park
        7. Initial implementation of TezStats | Sub-task | Closed | Cheolsoo Park
        8. Initial Implementation of PigProcessor | Sub-task | Closed | Mark Wagner
        9. Bump hadoop version to 2.2.0 | Sub-task | Closed | Cheolsoo Park
        10. Allow PigProcessor to handle multiple inputs | Sub-task | Closed | Mark Wagner
        11. Add TezMiniCluster for unit tests | Sub-task | Closed | Cheolsoo Park
        12. Empty plan fails to run | Sub-task | Closed | Daniel Dai
        13. Make register work | Sub-task | Closed | Daniel Dai
        14. Make order by work | Sub-task | Closed | Daniel Dai
        15. Add test-tez target to build.xml | Sub-task | Closed | Cheolsoo Park
        16. Make distinct work | Sub-task | Closed | Alex Bain
        17. Make limit work | Sub-task | Closed | Alex Bain
        18. Pig should be able to submit multiple DAG | Sub-task | Closed | Daniel Dai
        19. e2e test for tez | Sub-task | Closed | Daniel Dai
        20. Add diagnostic information to TezStats | Sub-task | Closed | Cheolsoo Park
        21. Fix tez Checkin_2 | Sub-task | Closed | Daniel Dai
        22. Initial implementation of combiner optimization | Sub-task | Closed | Cheolsoo Park
        23. Fix tez branch compilation with Hadoop 1.0 | Sub-task | Closed | Cheolsoo Park
        24. Implement optimizations for LIMIT | Sub-task | Closed | Alex Bain
        25. UniqueTez staging dir should be used for different users | Sub-task | Resolved | Daniel Dai
        26. Implement combiner optimizations for DISTINCT | Sub-task | Closed | Alex Bain
        27. Make Tez work with security | Sub-task | Closed | Rohini Palaniswamy
        28. Make split work with Tez | Sub-task | Closed | Rohini Palaniswamy
        29. Make union work with tez | Sub-task | Closed | Cheolsoo Park
        30. Fix dependencies in ivy.xml | Sub-task | Closed | Cheolsoo Park
        31. Port Package refactoring to Tez branch | Sub-task | Closed | Mark Wagner
        32. Fix e2e Operator_1, 5, Checkin_3, and Join_1 | Sub-task | Closed | Cheolsoo Park
        33. Move POSimpleTezLoad under tez package | Sub-task | Closed | Cheolsoo Park
        34. Tear down TezSessions when Pig exits | Sub-task | Closed | Rohini Palaniswamy
        35. Add counters to TezStats | Sub-task | Closed | Cheolsoo Park
        36. Implement replicated join in Tez | Sub-task | Closed | Cheolsoo Park
        37. Fix e2e tests Operators_3, Operators_5 | Sub-task | Closed | Daniel Dai
        38. Add order by string, descending order e2e tests | Sub-task | Closed | Daniel Dai
        39. Replace broadcast edges with scatter/gather edges in union | Sub-task | Closed | Cheolsoo Park
        40. TezCompiler adds duplicate predecessors of blocking operators to TezPlan | Sub-task | Closed | Rohini Palaniswamy
        41. Fix intermittent test failure Join_1 | Sub-task | Closed | Daniel Dai
        42. Make combiners, custom partitioners and secondary key sort work for multiple outputs | Sub-task | Closed | Rohini Palaniswamy
        43. Implement STREAM in Tez | Sub-task | Closed | Alex Bain
        44. Improve performance of order-by | Sub-task | Closed | Daniel Dai
        45. Make accumulator UDF work in Tez | Sub-task | Closed | Cheolsoo Park
        46. Implement skewed join in Tez | Sub-task | Closed | Cheolsoo Park
        47. Implement merge join in Tez | Sub-task | Closed | Daniel Dai
        48. Use Tez ObjectRegistry to cache FRJoin map and WeightedRangePartitioner map | Sub-task | Closed | Rohini Palaniswamy
        49. TEZ-41 break pig-tez | Sub-task | Closed | Daniel Dai
        50. Fix store after load | Sub-task | Closed | Daniel Dai
        51. Fix skewed join e2e tests | Sub-task | Closed | Cheolsoo Park
        52. Change tez version dependency as a result of TEZ-739 | Sub-task | Closed | Hitesh Shah
        53. Fix split + skewed join | Sub-task | Resolved | Rohini Palaniswamy
        54. Fix TestSkewedJoin in tez mode | Sub-task | Closed | Cheolsoo Park
        55. Use ONE_TO_ONE edge and IdentityInOut in orderby intermediate vertex | Sub-task | Closed | Rohini Palaniswamy
        56. Set MR runtime settings on tez runtime | Sub-task | Closed | Rohini Palaniswamy
        57. Use VertexGroup and Alias vertex for union | Sub-task | Closed | Cheolsoo Park
        58. Support for multiquery off in Tez | Sub-task | Closed | Rohini Palaniswamy
        59. Add support for non-Java UDF's | Sub-task | Closed | Alex Bain
        60. Make scalar work | Sub-task | Closed | Daniel Dai
        61. Fix desc order by in Tez | Sub-task | Closed | Daniel Dai
        62. CombinerOptimizer should not optimize cogroup case in tez | Sub-task | Closed | Daniel Dai
        63. Outer join fail on tez | Sub-task | Closed | Daniel Dai
        64. Properties aren't propagated to edges or vertices in Tez | Sub-task | Resolved | Mark Wagner
        65. Use ONE_TO_ONE edge and IdentityInOut in skewed join intermediate vertex | Sub-task | Closed | Rohini Palaniswamy
        66. Work with TEZ-668 which allows starting and closing of inputs and outputs | Sub-task | Closed | Rohini Palaniswamy
        67. TezCompiler.visitUnion() doesn't add compiled TezOp to phyToTezOpMap | Sub-task | Closed | Cheolsoo Park
        68. Scripting UDF is broken after PIG-3629 | Sub-task | Closed | Daniel Dai
        69. Tez mini cluster tests run for a very long time with TezSession reuse on | Sub-task | Closed | Cheolsoo Park
        70. Fix TestTezCompiler#testReplicatedJoinInReducer | Sub-task | Closed | Cheolsoo Park
        71. TezResourceManager should not be a singleton | Sub-task | Closed | Daniel Dai
        72. POReservoirSample should handle endOfAllInput flag | Sub-task | Closed | Daniel Dai
        73. Multiquery with FRJoin fail | Sub-task | Closed | Daniel Dai
        74. NPE when POStream is not in the leaf vertex | Sub-task | Closed | Daniel Dai
        75. tuple in POStream binaryInputQueue keep changing | Sub-task | Closed | Daniel Dai
        76. Several changes in Tez e2e | Sub-task | Closed | Daniel Dai
        77. POValueInputTez should handle getNextTuple even after reader.next() returns null | Sub-task | Closed | Daniel Dai
        78. Parallelism specified by user is not honored if default parallelism is set to a higher value | Sub-task | Closed | Cheolsoo Park
        79. Fix some memory leaks affecting container reuse | Sub-task | Closed | Rohini Palaniswamy
        80. TestCustomPartitioner is broken in tez branch | Sub-task | Closed | Cheolsoo Park
        81. POPoissonSample should handle endOfAllInput flag | Sub-task | Closed | Daniel Dai
        82. Implement CROSS in Tez | Sub-task | Closed | Rohini Palaniswamy
        83. Implement RANK in Tez | Sub-task | Closed | Rohini Palaniswamy
        84. Remove reference to BroadcastKVReader as it is removed in TEZ-911 | Sub-task | Closed | Rohini Palaniswamy
        85. Make custom counter work | Sub-task | Closed | Daniel Dai
        86. Implement mapside cogroup in Tez | Sub-task | Resolved | Unassigned
        87. Organize tez code into subpackages | Sub-task | Closed | Rohini Palaniswamy
        88. Honor Mapreduce Distributed Cache settings and localize resources in Tez | Sub-task | Closed | Rohini Palaniswamy
        89. Pig on tez job hangs when AM has a failure and Multiquery fixes | Sub-task | Closed | Rohini Palaniswamy
        90. Pig script encounters error with Tez MemoryDistributor | Sub-task | Resolved | Unassigned
        91. e2e test Rank_9 fail | Sub-task | Closed | Daniel Dai
        92. e2e tests run all tests even execonly flag does not match | Sub-task | Closed | Daniel Dai
        93. Fix MergeJoin_8 failure | Sub-task | Closed | Daniel Dai
        94. UdfDistributedCache_1 fails in tez branch | Sub-task | Closed | Cheolsoo Park
        95. Global sort is not working (order by) Pig over Tez | Sub-task | Resolved | Unassigned
        96. Hash join followed by replicated join fails in Tez mode | Sub-task | Closed | Cheolsoo Park
        97. Fix memory leak with PigTezLogger | Sub-task | Closed | Rohini Palaniswamy
        98. Fix UnionOptimizer bug with expressions and MR compressions settings not honored | Sub-task | Closed | Rohini Palaniswamy
        99. Implement PPNL for Tez mode (Pig side changes) | Sub-task | Closed | Cheolsoo Park
        100. PigRecordWriter throws exception in Tez mode | Sub-task | Closed | Cheolsoo Park
        101. Fix e2e test failure CastScalar_11 | Sub-task | Closed | Daniel Dai
        102. Skewed join followed by replicated join fails in Tez | Sub-task | Closed | Cheolsoo Park
        103. Fix MR unit tests on tez branch | Sub-task | Closed | Daniel Dai
        104. Pig on tez fails to run in Oozie in secure cluster | Sub-task | Closed | Rohini Palaniswamy
        105. Get TezStats working for Oozie | Sub-task | Closed | Rohini Palaniswamy
        106. Make the interval of DAGStatus report configurable | Sub-task | Closed | Cheolsoo Park
        107. New interface for resetting static variables for jvm reuse | Sub-task | Closed | Rohini Palaniswamy
        108. Fix compilation failure due in Pig on Tez due to TEZ-1127 change | Sub-task | Resolved | Unassigned
        109. Refactor TezJob and TezLauncher | Sub-task | Closed | Cheolsoo Park
        110. Make Streaming UDF work in Tez | Sub-task | Closed | Daniel Dai
        111. ObjectCache cause ClassCastException | Sub-task | Closed | Cheolsoo Park
        112. Change from TezJobConfig to TezRuntimeConfiguration | Sub-task | Closed | Rohini Palaniswamy
        113. Accumulator UDF throws OOM in Tez | Sub-task | Closed | Rohini Palaniswamy
        114. NPE in packager when union + group-by followed by replicated join in Tez | Sub-task | Closed | Rohini Palaniswamy
        115. Implement merge cogroup in Tez | Sub-task | Closed | Daniel Dai
        116. Add Native operator to tez | Sub-task | Closed | Daniel Dai
        117. Create a target to run mr and tez unit test in one shot | Sub-task | Closed | Daniel Dai
        118. Pin Tez to 0.5.0 release | Sub-task | Closed | Cheolsoo Park
        119. Intermediate reducer parallelism in Tez should be higher | Sub-task | Closed | Rohini Palaniswamy
        120. Mapreduce ACLs should be translated to Tez ACLs | Sub-task | Closed | Rohini Palaniswamy
        121. Reset UDFContext state before OutputCommitter invocations in Tez | Sub-task | Closed | Rohini Palaniswamy
        122. Fix few issues related to Union, CROSS and auto parallelism in Tez | Sub-task | Closed | Rohini Palaniswamy
        123. PigProcessor does not set pig.datetime.default.tz | Sub-task | Closed | Rohini Palaniswamy
        124. ObjectCache should use ProcessorContext.getObjectRegistry() | Sub-task | Closed | Rohini Palaniswamy

          People

            Assignee: Cheolsoo Park (cheolsoo)
            Reporter: Cheolsoo Park (cheolsoo)
            Votes: 0
            Watchers: 29
