Pig / PIG-490

Combiner not used when group elements referred to in tuple notation instead of flatten.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      Given a query like:

      A = load 'myfile';
      B = group A by ($0, $1);
      C = foreach B generate group.$0, group.$1, COUNT(A);
      

      The combiner will not be invoked. But if the last line is changed to:

      C = foreach B generate flatten(group), COUNT(A);
      

      it will be. The discrepancy arises because the CombinerOptimizer checks that all of the projections are simple and, if any of them is not, it does not use the combiner. group.$0 is not a simple projection, so the check fails. However, this case is common enough that the CombinerOptimizer should detect it and still use the combiner.
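
      The check described above can be illustrated with a small toy model. This is only a sketch and not Pig's actual CombinerOptimizer code: the expression classes (Project, Deref, Flatten, Algebraic) and the functions strict_check and relaxed_check are hypothetical names invented for illustration. It assumes the current behaviour accepts only plain projections, flatten of a plain projection, and algebraic calls, and that the proposed behaviour would additionally accept a column projected out of group.

```python
from dataclasses import dataclass
from typing import Union

# Toy expression model. These classes are hypothetical and do not
# correspond to Pig's real plan operators.

@dataclass(frozen=True)
class Project:
    """Projection of a top-level field of the grouped relation, e.g. group."""
    field: str

@dataclass(frozen=True)
class Deref:
    """Projection of a column inside another expression, e.g. group.$0."""
    base: "Expr"
    column: int

@dataclass(frozen=True)
class Flatten:
    """flatten(...) applied to another expression, e.g. flatten(group)."""
    base: "Expr"

@dataclass(frozen=True)
class Algebraic:
    """Call to an algebraic function over a bag, e.g. COUNT(A)."""
    name: str
    bag: str

Expr = Union[Project, Deref, Flatten, Algebraic]


def _is_simple(expr):
    # A "simple" projection in this model: a plain field, or flatten of one.
    if isinstance(expr, Flatten):
        return isinstance(expr.base, Project)
    return isinstance(expr, Project)


def _is_group_column(expr):
    # group.$N: a column projected directly out of the group key.
    return isinstance(expr, Deref) and expr.base == Project("group")


def strict_check(exprs):
    """Assumed current behaviour: any non-simple projection disables the combiner."""
    return all(_is_simple(e) or isinstance(e, Algebraic) for e in exprs)


def relaxed_check(exprs):
    """Assumed proposed behaviour: group.$N is also accepted."""
    return all(
        _is_simple(e) or _is_group_column(e) or isinstance(e, Algebraic)
        for e in exprs
    )


# foreach B generate group.$0, group.$1, COUNT(A);
tuple_form = [
    Deref(Project("group"), 0),
    Deref(Project("group"), 1),
    Algebraic("COUNT", "A"),
]
# foreach B generate flatten(group), COUNT(A);
flatten_form = [Flatten(Project("group")), Algebraic("COUNT", "A")]

print(strict_check(tuple_form), strict_check(flatten_form))
print(relaxed_check(tuple_form), relaxed_check(flatten_form))
```

      In this model only the group key gets special treatment: a projection out of the bag itself, such as A.$0, is still rejected by both checks, since it cannot be computed from partially aggregated data.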

            People

              thejas Thejas Nair
              gates Alan Gates