Hadoop Common / HADOOP-4983

Job counters sometimes go down as tasks run without task failures

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.18.3, 0.19.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      As tasks run, the job counters sometimes decrease and then increase again, even though no tasks have failed. They always seem to be correct when the task completes. I suspect this may have been introduced in HADOOP-2208.

      1. patch-4983.txt
        0.7 kB
        Amareshwari Sriramadasu

        Activity

        Devaraj Das added a comment -

        I just committed this. Thanks, Amareshwari!

        Amareshwari Sriramadasu added a comment -

        Without the patch, I monitored 3 long-running jobs with long-running tasks; their counters oscillated over the run of the job.
        With the patch applied, the counters of these jobs only increased; they never went down.

        test-patch result:

             [exec]
             [exec] -1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]
        

        It is not easy to write a test-case for this.

        All core and contrib unit tests passed on my machine.

        The same patch applies to 0.19, 0.20 and trunk.

        Amareshwari Sriramadasu added a comment -

        Patch with the fix

        Amareshwari Sriramadasu added a comment -

        This happens because TaskInProgress replaces the counters on every status update, and some status updates do not include counters.
        The code responsible is in the method TaskInProgress.recomputeProgress():

                } else if (status.getRunState() == TaskStatus.State.RUNNING) {
                  if (status.getProgress() >= bestProgress) {
                    bestProgress = status.getProgress();
                    bestState = status.getStateString();
                    // Counters are replaced here even when this update carries none.
                    bestCounters = status.getCounters();
                  }
                }
              }
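
        The effect described above can be reproduced in isolation. The following is only a minimal sketch, not the contents of patch-4983.txt (which is not shown on this page): StatusUpdate, TaskView, applyReplacing and applyKeeping are hypothetical stand-ins, and "keep the previous counters when an update carries none" is an assumed way of avoiding the regression, not necessarily what the committed fix does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CounterMergeSketch {

    /** Hypothetical, simplified stand-in for a task attempt's status report. */
    static final class StatusUpdate {
        final float progress;
        final Map<String, Long> counters; // null when the update carries no counters

        StatusUpdate(float progress, Map<String, Long> counters) {
            this.progress = progress;
            this.counters = counters;
        }
    }

    /** Hypothetical holder for the best-known progress and counters of a task. */
    static final class TaskView {
        float bestProgress = 0f;
        Map<String, Long> bestCounters = new HashMap<String, Long>();

        /** Mirrors the reported behaviour: counters are always replaced. */
        void applyReplacing(StatusUpdate status) {
            if (status.progress >= bestProgress) {
                bestProgress = status.progress;
                bestCounters = (status.counters == null)
                        ? new HashMap<String, Long>()
                        : status.counters;
            }
        }

        /** Assumed remedy: keep the previous counters if the update has none. */
        void applyKeeping(StatusUpdate status) {
            if (status.progress >= bestProgress) {
                bestProgress = status.progress;
                if (status.counters != null) {
                    bestCounters = status.counters;
                }
            }
        }
    }

    /** Builds a one-entry counter map; the key is only an illustrative name. */
    static Map<String, Long> counters(long records) {
        Map<String, Long> m = new HashMap<String, Long>();
        m.put("MAP_INPUT_RECORDS", records);
        return m;
    }

    /** Feeds three updates (the middle one without counters) and records what a client would see. */
    public static List<Long> observe(boolean keepPrevious) {
        List<StatusUpdate> updates = new ArrayList<StatusUpdate>();
        updates.add(new StatusUpdate(0.2f, counters(100)));
        updates.add(new StatusUpdate(0.4f, null));
        updates.add(new StatusUpdate(0.6f, counters(300)));

        TaskView view = new TaskView();
        List<Long> seen = new ArrayList<Long>();
        for (StatusUpdate u : updates) {
            if (keepPrevious) {
                view.applyKeeping(u);
            } else {
                view.applyReplacing(u);
            }
            seen.add(view.bestCounters.getOrDefault("MAP_INPUT_RECORDS", 0L));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("replace: " + observe(false)); // replace: [100, 0, 300]
        System.out.println("keep:    " + observe(true));  // keep:    [100, 100, 300]
    }
}
```

        With replacement, the observed value drops to 0 on the update that has no counters and recovers on the next one, which matches the oscillation reported in this issue. With the assumed keep-previous rule the sequence never decreases.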
        

          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Owen O'Malley
          • Votes:
            0
            Watchers:
            0
