  1. Pig
  2. PIG-49

optimize bag usage

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Won't Fix

    Description

      (1) Currently, we always bring the entire bag into memory, even though in most cases we only need to stream through it. This wastes both memory and CPU.
      (2) When we run multiple computations on the same group, we iterate over the bag that represents the group once per computation. This is especially inefficient for spilled bags.
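
      To make both points concrete, here is a minimal sketch in Python rather than Pig's own code. It assumes a bag can be exposed as a lazy iterator. The names `read_group` and `count_sum_max` are hypothetical and exist only for this illustration. The group is streamed once, and several aggregates are updated in the same pass, so the bag is neither fully materialized nor traversed repeatedly.

```python
from typing import Iterable, Iterator, Optional, Tuple


def read_group(n: int) -> Iterator[int]:
    """Hypothetical stand-in for a bag: yields values lazily, one at a time."""
    for i in range(1, n + 1):
        yield i


def count_sum_max(bag: Iterable[int]) -> Tuple[int, int, Optional[int]]:
    """Compute three aggregates in a single pass over the bag.

    Only a constant amount of state is kept, so the bag never has to be
    held in memory as a whole, and it is iterated exactly once.
    """
    count = 0
    total = 0
    largest: Optional[int] = None
    for value in bag:
        count += 1
        total += value
        if largest is None or value > largest:
            largest = value
    return count, total, largest


if __name__ == "__main__":
    print(count_sum_max(read_group(5)))
```

      Whether a given computation can be restructured this way depends on the operation: aggregates such as counts and sums need only running state, while operations that require the whole group at once (sorting, for example) would still need the bag materialized or spilled.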


            People

              Assignee: Unassigned
              Reporter: Olga Natkovich (olgan)
