Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: tez-branch
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      We don't have e2e tests for cases such as union followed by group by, join (replicated, skewed, hash), order by, limit, etc. PIG-3835 adds optimizations for those cases, and we should have e2e tests that cover them.

      1. PIG-3855-1.patch
        89 kB
        Rohini Palaniswamy
      2. PIG-3855-3.patch
        95 kB
        Rohini Palaniswamy

          Activity

          Rohini Palaniswamy added a comment -

          Committed to tez branch. Thanks Daniel and Cheolsoo for the review.

          Rohini Palaniswamy added a comment -

          Changes done:

          • Created a new input in TEZ-1003 and used it so that UnionOptimizer can be turned on by default. Without it, we were seeing a lot of performance degradation in production scripts.
          • Added a lot of e2e tests for UnionOptimizer and fixed the code based on the issues found.
          • Fixed a couple of other minor issues:
            • Default parallelism was not honored.
            • Serializing the full store was causing problems with some UDFs on deserialization for checkOutputSpecs.

          This patch depends on TEZ-1003, so I will check it in once that is available as part of the Tez snapshot in Maven.

          Rohini Palaniswamy added a comment -

          Review board - https://reviews.apache.org/r/20320

            People

            • Assignee:
              Rohini Palaniswamy
              Reporter:
              Rohini Palaniswamy
            • Votes:
              0
              Watchers:
              1
