Pig / PIG-1102

Collect number of spills per job

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None

      Description

      Memory shortage is one of the main performance issues in Pig. Knowing when we spill to disk is useful for understanding query performance and for seeing how particular changes in Pig affect it.

      Other interesting statistics to collect would be average CPU usage and maximum memory usage, but I am not sure whether this information is easily retrievable.

      Using Hadoop counters for this would make sense.
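
      As a rough illustration of the counter idea, here is a minimal, dependency-free Java sketch: each task increments an enum-keyed counter whenever it spills, and the per-task values are summed into a per-job total. The names SpillCounterSketch, SpillCounter, TaskCounters and aggregate are hypothetical and are not taken from the attached patch. In a real job the increments would presumably go through Hadoop's own counter API rather than this stand-in class, and the framework would do the aggregation.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Map;

public class SpillCounterSketch {
    /** Hypothetical counter names; not taken from the PIG-1102 patch. */
    enum SpillCounter { SPILL_COUNT, SPILLED_RECORDS }

    /** Stand-in for the counter set of a single task (assumption, not Hadoop's class). */
    static final class TaskCounters {
        private final Map<SpillCounter, Long> values = new EnumMap<>(SpillCounter.class);

        /** Adds amount to the named counter, starting from zero if it is unset. */
        void increment(SpillCounter counter, long amount) {
            values.merge(counter, amount, Long::sum);
        }

        /** Returns the current value, or zero if the counter was never incremented. */
        long get(SpillCounter counter) {
            return values.getOrDefault(counter, 0L);
        }
    }

    /** Sums per-task counters into one per-job counter set. */
    static TaskCounters aggregate(Iterable<TaskCounters> tasks) {
        TaskCounters job = new TaskCounters();
        for (TaskCounters task : tasks) {
            for (SpillCounter counter : SpillCounter.values()) {
                job.increment(counter, task.get(counter));
            }
        }
        return job;
    }

    public static void main(String[] args) {
        // Task 0 spills twice, task 1 spills once.
        TaskCounters task0 = new TaskCounters();
        task0.increment(SpillCounter.SPILL_COUNT, 1);
        task0.increment(SpillCounter.SPILLED_RECORDS, 5000);
        task0.increment(SpillCounter.SPILL_COUNT, 1);
        task0.increment(SpillCounter.SPILLED_RECORDS, 3000);

        TaskCounters task1 = new TaskCounters();
        task1.increment(SpillCounter.SPILL_COUNT, 1);
        task1.increment(SpillCounter.SPILLED_RECORDS, 2000);

        TaskCounters job = aggregate(Arrays.asList(task0, task1));
        System.out.println("spills per job: " + job.get(SpillCounter.SPILL_COUNT));
        System.out.println("records spilled per job: " + job.get(SpillCounter.SPILLED_RECORDS));
    }
}
```

      Keying the counters by an enum keeps the set of statistics fixed and typo-proof, which would also leave room for further entries such as the CPU and memory figures mentioned above if they turn out to be retrievable.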

      1. PIG_1102.patch.1
        11 kB
        Sriranjan Manjunath
      2. PIG_1102.patch
        10 kB
        Sriranjan Manjunath

          People

          • Assignee: Sriranjan Manjunath
          • Reporter: Olga Natkovich
          • Votes: 0
          • Watchers: 0