Pig / PIG-1102

Collect number of spills per job

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None

    Description

      Memory shortage is one of the main performance issues in Pig. Knowing when we spill to disk is useful for understanding query performance and for seeing how particular changes in Pig affect it.

      Other interesting stats to collect would be average CPU usage and maximum memory usage, but I am not sure whether this information is easily retrievable.

      Using Hadoop counters for this would make sense.
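
      As a rough illustration of the counter idea, here is a minimal Java sketch. It is not taken from the attached patch: PigCounter, CounterSink, InMemorySink and SpillingBag are hypothetical names, and CounterSink is only a stand-in for Hadoop's reporter (whose old API exposes incrCounter(Enum, long)) so that the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

class SpillCounterSketch {

    // Hypothetical counter name; not taken from the attached patch.
    enum PigCounter { SPILL_COUNT }

    // Stand-in for the Hadoop reporter so the sketch runs without Hadoop.
    interface CounterSink {
        void incrCounter(Enum<?> key, long amount);
    }

    // Keeps counter totals in memory, keyed by the enum constant's name.
    static final class InMemorySink implements CounterSink {
        private final Map<String, Long> totals = new HashMap<>();

        @Override
        public synchronized void incrCounter(Enum<?> key, long amount) {
            totals.merge(key.name(), amount, Long::sum);
        }

        synchronized long get(Enum<?> key) {
            return totals.getOrDefault(key.name(), 0L);
        }
    }

    // Hypothetical bag that spills once it holds more than maxInMemory items.
    static final class SpillingBag {
        private final int maxInMemory;
        private final CounterSink sink;
        private int inMemory = 0;

        SpillingBag(int maxInMemory, CounterSink sink) {
            this.maxInMemory = maxInMemory;
            this.sink = sink;
        }

        void add(Object item) {
            inMemory++;
            if (inMemory > maxInMemory) {
                spill();
            }
        }

        private void spill() {
            // A real implementation would write the buffered items to disk here.
            inMemory = 0;
            sink.incrCounter(PigCounter.SPILL_COUNT, 1);
        }
    }

    // Adds `items` elements to a fresh bag and returns how many spills occurred.
    static long countSpills(int items, int maxInMemory) {
        InMemorySink sink = new InMemorySink();
        SpillingBag bag = new SpillingBag(maxInMemory, sink);
        for (int i = 0; i < items; i++) {
            bag.add(i);
        }
        return sink.get(PigCounter.SPILL_COUNT);
    }

    public static void main(String[] args) {
        System.out.println("spills: " + countSpills(10, 3));
    }
}
```

      Incrementing the counter inside the spill path itself keeps the bookkeeping in one place, and in a real job the framework would aggregate the per-task values into a per-job total.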

      Attachments

        1. PIG_1102.patch.1
          11 kB
          Sriranjan Manjunath
        2. PIG_1102.patch
          10 kB
          Sriranjan Manjunath

          People

            Assignee: Sriranjan Manjunath
            Reporter: Olga Natkovich