Pig / PIG-4843

Turn off combiner in reducer vertex for Tez if bags are in combine plan

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      B = GROUP A BY key;
      C = FOREACH B {
          key_value          = A.key_value;
          distinct_key_value = DISTINCT key_value;
          GENERATE group, MIN(A.key_value) AS min_value, MAX(A.key_value) AS max_value, COUNT(distinct_key_value) AS distinct_values;
      };
      

      In the above example, the combine plan holds the DISTINCT bag, which causes an OOM when the combiner is run by the MergeManager in the reducer. We did not hit this issue with MapReduce because, with the new API, the combiner does not yet run in the reducer (MAPREDUCE-5221).
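
      Until a build with this change is available, one possible workaround is to disable the combiner for the whole script. The sketch below assumes that the pig.exec.nocombiner property is honored by the Tez execution engine in the version in use; the input path and schema are hypothetical placeholders. This is coarser than the change tracked here, which only turns off the combiner in the reducer vertex: it also gives up map-side combining for MIN and MAX.

      ```
      -- Assumption: pig.exec.nocombiner turns off the combiner for every job in this script.
      SET pig.exec.nocombiner true;

      -- Hypothetical input; the path and schema are placeholders, not from the report.
      A = LOAD 'input' AS (key:chararray, key_value:long);
      B = GROUP A BY key;
      C = FOREACH B {
          key_value          = A.key_value;
          distinct_key_value = DISTINCT key_value;
          GENERATE group, MIN(A.key_value) AS min_value, MAX(A.key_value) AS max_value, COUNT(distinct_key_value) AS distinct_values;
      };
      ```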

      Attachments

        1. PIG-4843-1.patch
          9 kB
          Rohini Palaniswamy

          People

            Assignee: Rohini Palaniswamy
            Reporter: Rohini Palaniswamy
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: