Pig / PIG-792

PERFORMANCE: Support skewed join in Pig


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.4.0

    Description

      Fragment replicate join has a few limitations:

      • One of the tables needs to be loaded into memory
      • Join is limited to two tables
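
      The memory limitation above follows from how a replicated (map-side) join works. A minimal sketch in Python, not Pig's actual code: the function name `replicated_join` and the sample data are hypothetical, and records are assumed to be tuples whose first field is the join key.

```python
from collections import defaultdict

def replicated_join(small, large, key=lambda rec: rec[0]):
    """Map-side join: hold `small` entirely in memory, stream `large` past it."""
    # Build an in-memory hash table over the small input. This is the step
    # that breaks down when neither input fits in memory.
    table = defaultdict(list)
    for rec in small:
        table[key(rec)].append(rec)
    # Stream the large input and emit one joined tuple per matching record.
    for rec in large:
        for match in table.get(key(rec), ()):
            yield rec + match

# Hypothetical sample data: (user_id, name) and (user_id, url).
users = [(1, "ann"), (2, "bob")]
clicks = [(1, "/home"), (2, "/cart"), (1, "/pay"), (3, "/404")]
joined = list(replicated_join(users, clicks))
```

      Because the whole of `small` sits in the hash table, the approach only works while one input stays small, which is the case a reduce-side skewed join is meant to relax.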

      Skewed join partitions the table and joins the records in the reduce phase. It computes a histogram of the key space to account for skew in the input records, then adjusts the number of reducers according to the key distribution.
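
      The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the attached patch: `plan_reducers`, `skewed_join`, and the `max_per_reducer` threshold are hypothetical names, records are assumed to be `(key, value)` pairs, and for simplicity every key gets its own reducers, which a real implementation would likely not do.

```python
from collections import Counter, defaultdict
import math

def plan_reducers(skewed, max_per_reducer):
    """Histogram the join keys and give each key enough reducers."""
    hist = Counter(k for k, _ in skewed)
    plan, next_id = {}, 0
    for k, n in sorted(hist.items()):
        # A frequent key gets several reducers; a rare key gets one.
        width = math.ceil(n / max_per_reducer)
        plan[k] = list(range(next_id, next_id + width))
        next_id += width
    return plan

def skewed_join(skewed, other, max_per_reducer=2):
    plan = plan_reducers(skewed, max_per_reducer)
    reducers = defaultdict(lambda: ([], []))
    # Skewed side: spread each key's records round-robin over its reducers.
    seen = Counter()
    for k, v in skewed:
        targets = plan[k]
        reducers[targets[seen[k] % len(targets)]][0].append((k, v))
        seen[k] += 1
    # Other side: copy each record to every reducer that handles its key.
    for k, v in other:
        for rid in plan.get(k, ()):
            reducers[rid][1].append((k, v))
    # Reduce phase: each reducer joins only the records it was sent.
    out = []
    for rid in sorted(reducers):
        left, right = reducers[rid]
        out.extend((k, lv, rv) for k, lv in left for rk, rv in right if rk == k)
    return plan, out

# Hypothetical sample data: key "a" is heavily skewed.
clicks = [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("a", 5), ("b", 6)]
users = [("a", "ann"), ("b", "bob"), ("c", "cat")]
plan, rows = skewed_join(clicks, users, max_per_reducer=2)
```

      In this sketch the reducer count grows with the histogram: the hot key is split across several reducers while the other input's matching records are copied to each of them, so neither input has to fit in memory.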

      We need to implement skewed join in Pig.

      Attachments

        1. skewedjoin.patch
          139 kB
          Sriranjan Manjunath

            People

              sriranjan Sriranjan Manjunath