Hadoop Map/Reduce
  MAPREDUCE-1233

Incorrect Waiting maps/reduces in Jobtracker metrics

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.203.0
    • Component/s: jobtracker
    • Labels:
      None

      Description

      The Waiting Maps/Reduces values in the Jobtracker metrics are incorrect when a job fails. When a map/reduce task fails (during job failure), the waiting maps/reduces count is incremented and is not decremented again, even after job cleanup.

      1. mr-1233-y20s-v2.patch
        1 kB
        Luke Lu
      2. mr-1233-y20s-v1.patch
        0.9 kB
        Luke Lu

        Activity

        Luke Lu added a comment -

        Patch for y20.200 branch. Trunk doesn't seem to need the patch due to MAPREDUCE-1152, where killed tasks don't cause waiting tasks to be incremented.

        Luke Lu added a comment -

        v2 patch uses a more stable method to check for metrics garbage-collectedness.

        Koji Noguchi added a comment -

        This seems to be in 1.X/0.20.2XX already?

        However, I'm not sure how the patch is related to "Waiting" metrics. This only touches failedMap&failedReduce metrics.

        Koji Noguchi added a comment -

        However, I'm not sure how the patch is related to "Waiting" metrics. This only touches failedMap&failedReduce metrics.

        Please ignore this comment. I now see that the metrics.failedMap()/failedReduce() calls update the waiting metrics internally.
        So we can close this as fixed/committed?
        (Although we are still seeing incorrect metrics reported in MAPREDUCE-1238)
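
        The interaction discussed above can be illustrated with a minimal sketch. It assumes that a failed-task call also re-increments the waiting counter, and that the fix amounts to skipping that call once the job's metrics have been cleaned up. The class WaitingMetricsSketch and all of its methods are hypothetical names chosen for illustration; this is not the Hadoop JobTracker metrics code or the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a JobTracker metrics object; not the Hadoop class. */
class WaitingMetricsSketch {
    private int waitingMaps = 0;
    private int failedMaps = 0;
    // Assumption: jobs whose waiting counts were already subtracted at cleanup.
    private final Set<String> cleanedUpJobs = new HashSet<>();

    void addWaitingMaps(String jobId, int n) { waitingMaps += n; }

    void decWaitingMaps(String jobId, int n) { waitingMaps -= n; }

    /** A failed attempt is assumed to be retried, so it counts as waiting again. */
    void failedMap(String jobId) {
        failedMaps++;
        addWaitingMaps(jobId, 1);
    }

    /** Guarded variant: skip the update once the job's metrics were cleaned up. */
    void failedMapGuarded(String jobId) {
        if (!cleanedUpJobs.contains(jobId)) {
            failedMap(jobId);
        }
    }

    /** Job cleanup removes whatever the job still contributes to the waiting count. */
    void jobCleanedUp(String jobId, int stillWaiting) {
        decWaitingMaps(jobId, stillWaiting);
        cleanedUpJobs.add(jobId);
    }

    int waitingMaps() { return waitingMaps; }

    /** Returns {waiting count without guard, waiting count with guard}. */
    static int[] demo() {
        String job = "job_1"; // hypothetical job id

        WaitingMetricsSketch unguarded = new WaitingMetricsSketch();
        unguarded.addWaitingMaps(job, 2);
        unguarded.jobCleanedUp(job, 2);   // job failed, cleanup ran
        unguarded.failedMap(job);         // late task failure leaks one waiting map

        WaitingMetricsSketch guarded = new WaitingMetricsSketch();
        guarded.addWaitingMaps(job, 2);
        guarded.jobCleanedUp(job, 2);
        guarded.failedMapGuarded(job);    // ignored, counter stays balanced

        return new int[] { unguarded.waitingMaps(), guarded.waitingMaps() };
    }
}
```

        In this sketch, demo() leaves the unguarded counter at 1 and the guarded counter at 0, which matches the symptom in the description: a failure reported after cleanup adds a waiting task that nothing ever removes. Note that the guarded variant also skips the failed-task count for such late failures; whether that is acceptable depends on how the real metrics are consumed.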

        Thomas Graves added a comment -

        Yes, Koji is right, this patch was integrated into branch-1/20s in revision 1077665 (branches/branch-1.0/src/mapred/org/apache/hadoop/mapred/JobInProgress.java), which according to changes.txt would have gone into 0.20.202.0.

        Moving this to resolved; the remaining issue can be fixed under MAPREDUCE-1238.


          People

          • Assignee:
            Luke Lu
            Reporter:
            V.Karthikeyan
          • Votes:
            0
            Watchers:
            4
