Pig / PIG-545

PERFORMANCE: Sampler for order bys does not produce a good distribution

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.2.0
    • Component/s: impl
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      While running tests on actual data, I've noticed that the final reduce phase of an order by has skewed partitions. Some reducers finish in a few seconds while others run for 20 minutes. Getting a more even distribution from the sampler should lead to much better performance for order by.
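
      To illustrate the kind of skew described above, here is a minimal sketch of a sample-driven range partitioner. It is not Pig's sampler or partitioner code; the function names (`quantile_boundaries`, `assign_partition`, `partition_loads`) and the data are hypothetical. It assumes cut points are taken at evenly spaced ranks of a sorted sample and that each key is routed to exactly one partition, which is one plausible way a heavily repeated key can overload a single reducer.

```python
from bisect import bisect_left
from collections import Counter


def quantile_boundaries(sample, num_partitions):
    """Pick num_partitions - 1 cut points at evenly spaced ranks of the sorted sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def assign_partition(key, boundaries):
    """Route a key to the first partition whose upper boundary is >= key.

    Keys larger than every boundary fall through to the last partition.
    Because routing depends only on the key, all copies of a repeated key
    land in the same partition.
    """
    return bisect_left(boundaries, key)


def partition_loads(keys, boundaries):
    """Count how many rows each partition would receive."""
    counts = Counter(assign_partition(k, boundaries) for k in keys)
    return [counts.get(p, 0) for p in range(len(boundaries) + 1)]


# Hypothetical input: key 5 is "hot" and accounts for most of the rows.
keys = [5] * 700 + list(range(300))

# For simplicity the whole input is used as the sample here.
boundaries = quantile_boundaries(keys, 4)
loads = partition_loads(keys, boundaries)

print(boundaries)  # [5, 5, 50]
print(loads)       # [706, 0, 45, 249]
```

      In this toy run two of the three cut points fall on the hot key, so one partition receives 706 of the 1000 rows while another receives none. That mirrors the pattern of some reducers finishing almost immediately while others run far longer.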

      Attachments

        1. WRP.patch
          33 kB
          Shravan Matthur Narayanamurthy
        2. WRP1.patch
          34 kB
          Shravan Matthur Narayanamurthy
        3. PIG-545-v3.patch
          60 kB
          Pradeep Kamath
        4. PIG-545-v4.patch
          60 kB
          Pradeep Kamath

          People

            Assignee: Pradeep Kamath (pkamath)
            Reporter: Alan Gates (gates)
