Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tez-branch
    • Fix Version/s: tez-branch
    • Component/s: tez
    • Labels:
      None

      Description

      Implement a TezCompiler that compiles a physical plan into a Tez plan. To begin with, we can implement an initial version that works for basic queries such as the following:

      1. Load-Filter-Store
        a = load 'file:///tmp/input' as (x:int, y:int);
        b = filter a by x > 0;
        c = foreach b generate y;
        store c into 'file:///tmp/output';
        
      2. Load-GroupBy-Store
        a = load 'file:///tmp/input' as (x:int, y:int);
        b = group a by x;
        c = foreach b generate group, a;
        store c into 'file:///tmp/output';
        
      3. Load1-Load2-Join-Store
        a = load 'file:///tmp/input1' as (x:int, y:int);
        b = load 'file:///tmp/input2' as (x:int, z:int);
        c = join a by x, b by x;
        d = foreach c generate a::x as x, y, z;
        store d into 'file:///tmp/output';
        
      1. PIG-3500-2.patch
        70 kB
        Cheolsoo Park
      2. PIG-3500-1.patch
        71 kB
        Cheolsoo Park

          Activity

          cheolsoo Cheolsoo Park added a comment -

          Committed to tez branch.

          cheolsoo Cheolsoo Park added a comment -

          Thank you very much for the review! Yes, I agree that we can handle combiner per input/output in a separate jira. If there's no objection, I will commit my patch tonight after fixing Mark's other two comments.

          mwagner Mark Wagner added a comment -

          Looks pretty good. The only major thing that stood out is that, for Tez, the special treatment of shuffles (custom partitioners and combiners) needs to be associated with the inputs/outputs rather than the TezOper, since there can be many inputs/outputs. In the interest of progress on the Tez branch, I think it's best to commit this (with a couple of small changes I noted and whatever others comment on) and handle the multi-combiner logic later. Does anyone have a problem with that?
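
To illustrate the point about attaching shuffle settings to inputs/outputs, here is a minimal sketch in Python. It is not Pig or Tez code: `EdgeSettings`, `Operator`, `connect`, and the partitioner/combiner names are all hypothetical, chosen only to show one operator carrying different settings on each of its output edges.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EdgeSettings:
    """Hypothetical per-edge shuffle settings (not a Pig or Tez class)."""
    partitioner: Optional[str] = None
    combiner: Optional[str] = None


@dataclass
class Operator:
    """Hypothetical stand-in for a plan operator with several outputs."""
    name: str
    # One entry per output edge, keyed by the downstream operator's name.
    outputs: dict = field(default_factory=dict)

    def connect(self, dst, partitioner=None, combiner=None):
        # Settings live on the edge, so two outputs of the same
        # operator can use different combiners or partitioners.
        self.outputs[dst] = EdgeSettings(partitioner, combiner)


op = Operator("scan")
op.connect("group_by", combiner="sum_combiner")
op.connect("order_by", partitioner="range_partitioner")
```

If the settings were stored on the operator itself, the two edges above would be forced to share a single combiner and partitioner.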

          cheolsoo Cheolsoo Park added a comment -

          ReviewBoard: https://reviews.apache.org/r/14504/

          cheolsoo Cheolsoo Park added a comment - - edited

          The attached patch includes an initial version of TezCompiler with unit tests. Note that, unlike the MR plan, query #3 is compiled into three Tez vertices (two input vertices and one join vertex).

          The unit test can be run with ant test clean -Dtestcase=TestTezCompiler.
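
The three-vertex shape described for query #3 can be sketched as a small graph. This is an illustrative Python model only, not the TezCompiler implementation: `Dag`, `Vertex`, `build_join_dag`, the vertex names, and the operator lists inside each vertex are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """Hypothetical stand-in for one Tez vertex."""
    name: str
    ops: list = field(default_factory=list)


@dataclass
class Dag:
    """Hypothetical DAG: vertices by name plus (source, destination) edges."""
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, name, ops):
        self.vertices[name] = Vertex(name, list(ops))

    def connect(self, src, dst):
        self.edges.append((src, dst))

    def inputs_of(self, name):
        # Names of the vertices that feed the given vertex.
        return [s for s, d in self.edges if d == name]


def build_join_dag():
    """Model query #3: two input vertices feeding one join vertex."""
    dag = Dag()
    dag.add_vertex("load_a", ["load input1"])   # assumed operator placement
    dag.add_vertex("load_b", ["load input2"])   # assumed operator placement
    dag.add_vertex("join", ["join by x", "foreach", "store output"])
    dag.connect("load_a", "join")
    dag.connect("load_b", "join")
    return dag


dag = build_join_dag()
print(len(dag.vertices), dag.inputs_of("join"))
```

The join vertex has two incoming edges, one per load, which is the structure the comment contrasts with the MR plan.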


            People

            • Assignee:
              cheolsoo Cheolsoo Park
              Reporter:
              cheolsoo Cheolsoo Park
            • Votes:
              0
              Watchers:
              2
