Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-5356

Add keyword on reducing tasks to allow combiner hints

    XMLWordPrintableJSON

Details

    • Wish
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      Create a keyword that would allow the programmer to tell PIG that a particular DISTINCT/GROUP BY action should disable the combiner.

       

      Often we have pieces of code to ensure that data is meeting expectations, even if we are pretty sure it already is. One example, is when we do a DISTINCT on data to ensure we do not have duplicates or we re-GROUP data to a certain grain before a join. In these cases, the combiner is taking extra time, but not actually giving us a benefit. We can gain significant performance improvement in this cases if the combiner is simply not run. In some cases we can do this at the job level, but in others, we may only want to have the combiner shut off for particular statements.

      Attachments

        Activity

          People

            Unassigned Unassigned
            HaleyThrapp Haley Thrapp
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated: