Hadoop Map/Reduce / MAPREDUCE-683

TestJobTrackerRestart fails with Map task completion events ordering mismatch

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: jobtracker
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      TestJobTrackerRestart failed because of a stale file manager cache, which was created once per JVM. This patch makes sure that the file manager is initialized on every JobHistory.init() call, and hence on every restart. Note that this failure won't occur in production: on a restart, the new jobtracker starts in a new JVM, so a new cache is created.
    • Tags: ygridqa
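
The release note describes a cache that is created once per JVM and therefore survives an in-process restart. A minimal sketch of that pattern follows. All class and method names here (StaleCacheDemo, FileManager, HistoryCreatedOnce, HistoryResetOnInit) are hypothetical stand-ins for illustration, not the actual Hadoop classes or the committed patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the Hadoop classes.
final class StaleCacheDemo {

    /** A tiny cache mapping job ids to history file names. */
    static final class FileManager {
        private final Map<String, String> jobToFile = new HashMap<>();

        void register(String jobId, String file) {
            jobToFile.put(jobId, file);
        }

        String lookup(String jobId) {
            return jobToFile.get(jobId);
        }
    }

    /** Problem variant: the cache is created once per JVM and init() never resets it. */
    static final class HistoryCreatedOnce {
        static final FileManager fileManager = new FileManager();

        static void init() {
            // Nothing here touches the cache, so entries outlive a restart.
        }
    }

    /** Fixed variant: every init() installs a fresh cache. */
    static final class HistoryResetOnInit {
        static FileManager fileManager;

        static void init() {
            fileManager = new FileManager();
        }
    }

    public static void main(String[] args) {
        // First run, then a simulated in-process restart (as a test would do).
        HistoryCreatedOnce.init();
        HistoryCreatedOnce.fileManager.register("job_1", "history-from-run-1");
        HistoryCreatedOnce.init();
        // prints "created once: history-from-run-1" (stale entry still visible)
        System.out.println("created once: " + HistoryCreatedOnce.fileManager.lookup("job_1"));

        HistoryResetOnInit.init();
        HistoryResetOnInit.fileManager.register("job_1", "history-from-run-1");
        HistoryResetOnInit.init();
        // prints "reset on init: null" (the restart started with a clean cache)
        System.out.println("reset on init: " + HistoryResetOnInit.fileManager.lookup("job_1"));
    }
}
```

In a real deployment each restart runs in a new JVM, so even the first variant would start with an empty cache. Only a test that restarts the component inside one JVM sees the stale entry.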

      Description

      TestJobTrackerRestart fails consistently with a "Map task completion events ordering mismatch" error.

      1. TEST-org.apache.hadoop.mapred.TestJobTrackerRestart.txt (255 kB, Sreekanth Ramakrishnan)
      2. MAPREDUCE-683-v1.0.patch (0.8 kB, Amar Kamat)
      3. MAPREDUCE-683-v1.2.patch (2 kB, Amar Kamat)
      4. MAPREDUCE-683-v1.2-branch-0.20.patch (2 kB, Amar Kamat)
      5. MAPREDUCE-683-v1.2.1-branch-0.20.patch (2 kB, Amar Kamat)

          Activity

          Sreekanth Ramakrishnan added a comment -

          Attaching test log from trunk on the local machine.

          Amar Kamat added a comment -

          Attaching a patch that fixes the test case issue. The bug was caused by stale cache entries in the FileManager.

          Amar Kamat added a comment -

          This issue occurs only in test cases that involve a restart. It will not happen in production/real clusters.

          Hemanth Yamijala added a comment -

          I looked at the patch. I think it is more natural to simply reset the JobHistoryFileManager to a new instance in JobHistory.init(), instead of defining a new API now. But this apparently leads to failures in unrelated test cases - because JobHistory is enabled by default. My suggestion would be to disable it for test cases and enable only in init.

          Amar Kamat added a comment -

          Attaching a new patch incorporating Hemanth's comments. Result of test-patch:
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          Only TestLostTracker and TestMiniMRMapRedDebugScript failed on mapred. Both are known issues. TestStreamingExitStatus, TestStreamingStderr and TestQueueCapacities failed on contrib. These are also known issues.

          Devaraj Das added a comment -

          I just committed this. Thanks, Amar!

          Amar Kamat added a comment -

          Attaching an example patch for branch 0.20. It is not to be committed.

          Amar Kamat added a comment -

          Sorry, I attached the wrong patch. MAPREDUCE-683-v1.2.1-branch-0.20.patch is the correct version for branch 0.20. This is an example patch, not to be committed.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #15 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/15/).
          Fixes an initialization problem in the JobHistory. The initialization of JobHistoryFilesManager is now done in the JobHistory.init call. Contributed by Amar Kamat.


            People

            • Assignee: Amar Kamat
            • Reporter: Sreekanth Ramakrishnan