
Key: HADOOP-4983
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Amareshwari Sriramadasu
Reporter: Owen O'Malley
Votes: 0
Watchers: 0

Hadoop Common

Job counters sometimes go down as tasks run without task failures

Created: 06/Jan/09 12:36 AM   Updated: 08/Jul/09 04:53 PM
Component/s: None
Affects Version/s: 0.19.0
Fix Version/s: 0.18.3, 0.19.1

Time Tracking:
Not Specified

File Attachments:
patch-4983.txt (0.7 kB, licensed for inclusion in ASF works), attached 2009-01-15 08:57 AM by Amareshwari Sriramadasu

Hadoop Flags: Reviewed
Resolution Date: 21/Jan/09 06:55 AM


Description
As tasks run, the counters seem to back up and move forward again. They always seem to be right when the task completes. I suspect this may have been introduced in HADOOP-2208.

Comments
Amareshwari Sriramadasu added a comment - 07/Jan/09 06:19 AM
This happens because TaskInProgress replaces the counters on every status update, and some status updates do not include counters.
The code responsible is in the method TaskInProgress.recomputeProgress():
        } else if (status.getRunState() == TaskStatus.State.RUNNING) {
          if (status.getProgress() >= bestProgress) {
            bestProgress = status.getProgress();
            bestState = status.getStateString();
            bestCounters = status.getCounters();
          }
        }
      }
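
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the committed patch and does not use Hadoop's classes: CounterUpdateSketch, StatusUpdate, applyReplacing and applyKeepingLast are hypothetical names, and a null counter map is only an assumed way to model "this status update has no counters". It contrasts unconditional replacement with one possible remedy, keeping the last known counters when an update carries none.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CounterUpdateSketch {

    /** Hypothetical stand-in for one status update from a running task. */
    static final class StatusUpdate {
        final double progress;
        final Map<String, Long> counters; // assumption: null means "no counters in this update"

        StatusUpdate(double progress, Map<String, Long> counters) {
            this.progress = progress;
            this.counters = counters;
        }
    }

    private double bestProgress = 0.0;
    private Map<String, Long> bestCounters = new HashMap<String, Long>();

    /** Mirrors the reported behavior: counters are replaced on every update. */
    void applyReplacing(StatusUpdate status) {
        if (status.progress >= bestProgress) {
            bestProgress = status.progress;
            bestCounters = (status.counters != null)
                    ? status.counters
                    : new HashMap<String, Long>();
        }
    }

    /** One possible remedy: keep the last known counters when an update has none. */
    void applyKeepingLast(StatusUpdate status) {
        if (status.progress >= bestProgress) {
            bestProgress = status.progress;
            if (status.counters != null) {
                bestCounters = status.counters;
            }
        }
    }

    /** Returns the current value of a counter, or 0 if it is unknown. */
    long counter(String name) {
        Long value = bestCounters.get(name);
        return value == null ? 0L : value;
    }

    public static void main(String[] args) {
        StatusUpdate withCounters =
                new StatusUpdate(0.2, Collections.singletonMap("records", 100L));
        StatusUpdate withoutCounters = new StatusUpdate(0.3, null);

        CounterUpdateSketch replacing = new CounterUpdateSketch();
        replacing.applyReplacing(withCounters);
        replacing.applyReplacing(withoutCounters);
        System.out.println("replacing: " + replacing.counter("records"));   // counter drops to 0

        CounterUpdateSketch keeping = new CounterUpdateSketch();
        keeping.applyKeepingLast(withCounters);
        keeping.applyKeepingLast(withoutCounters);
        System.out.println("keeping last: " + keeping.counter("records")); // counter stays at 100
    }
}
```

In the replacing variant the visible counter falls back to zero as soon as a counter-less update arrives and recovers on the next full update, which matches the "go down and move forward again" symptom in the description.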

Amareshwari Sriramadasu added a comment - 15/Jan/09 08:57 AM
Patch with the fix

Amareshwari Sriramadasu added a comment - 15/Jan/09 09:49 AM

Without the patch, I monitored 3 long-running jobs with long-running tasks; their counters oscillated over the run of the job.
With the patch applied, the jobs' counters only increased; they never went down.

test-patch result:

     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]

It is not easy to write a test-case for this.

All core and contrib unit tests passed on my machine.

The same patch applies to 0.19, 0.20, and trunk.


Devaraj Das added a comment - 21/Jan/09 06:55 AM
I just committed this. Thanks, Amareshwari!