  1. Pig
  2. PIG-1102

Collect number of spills per job

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None

      Description

      Memory shortage is one of the main performance issues in Pig. Knowing when we spill to disk is useful for understanding query performance and for seeing how particular changes in Pig affect it.

      Other interesting statistics to collect would be average CPU usage and maximum memory usage, but I am not sure whether this information is easily retrievable.

      Using Hadoop counters for this would make sense.
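
      As a rough illustration of the counter idea, here is a minimal standard-library sketch of per-task spill counters rolled up into a job-level total. It is not the attached patch and does not use Hadoop: the class SpillCounters, its methods, and the counter names SPILL_COUNT and SPILLED_RECORDS are all hypothetical. In a real job the increments would presumably go through Hadoop's own counter API (for example Reporter.incrCounter in the old mapred API), and the framework would do the aggregation across tasks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical stand-in for a Hadoop counter group. All names are
 * illustrative assumptions and are not taken from the PIG-1102 patch.
 */
final class SpillCounters {
    // Counter name -> running total; safe to update from several threads.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Adds delta to the named counter, creating the counter on first use. */
    void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    /** Returns the current value, or 0 if the counter was never incremented. */
    long get(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }

    /** Would be called by a spillable data structure each time it spills to disk. */
    void recordSpill(long recordsSpilled) {
        increment("SPILL_COUNT", 1);
        increment("SPILLED_RECORDS", recordsSpilled);
    }

    /** Folds one task's counters into this (job-level) set of counters. */
    void mergeFrom(SpillCounters task) {
        task.counters.forEach((name, adder) -> increment(name, adder.sum()));
    }
}
```

      Each task would call recordSpill whenever one of its bags spills, and the job-level view is obtained by merging the per-task counters. Keeping a record count next to the spill count is a design choice in this sketch: it helps distinguish many small spills from a few large ones.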

        Attachments

        1. PIG_1102.patch.1
          11 kB
          Sriranjan Manjunath
        2. PIG_1102.patch
          10 kB
          Sriranjan Manjunath

            People

            • Assignee: Sriranjan Manjunath
            • Reporter: Olga Natkovich