Hadoop Map/Reduce / MAPREDUCE-683

TestJobTrackerRestart fails with Map task completion events ordering mismatch

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: jobtracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      TestJobTrackerRestart failed because of a stale file manager cache, which was created once per JVM. This patch makes sure that the file manager is initialized on every JobHistory.init() call, and hence on every restart. Note that the failure does not occur in production: on a restart, the new JobTracker starts in a new JVM, so a new cache is created.
    • Tags:
      ygridqa
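
The release note above describes a cache that is created once per JVM and therefore survives an in-process restart, as happens in a test. The following is a minimal sketch of that failure mode and of the fix, not the actual Hadoop code: FileCache, History, initWithoutReset() and StaleCacheDemo are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the history file manager; not the Hadoop class.
class FileCache {
    private final Map<String, String> files = new HashMap<>();

    void put(String jobId, String path) { files.put(jobId, path); }

    int size() { return files.size(); }
}

// Hypothetical stand-in for JobHistory, holding a JVM-wide static cache.
class History {
    // Created once when the class is loaded, so it outlives an in-process restart.
    static FileCache cache = new FileCache();

    // Behaviour before the fix: init leaves the existing cache in place.
    static void initWithoutReset() { }

    // Behaviour after the fix: every init (and hence every restart) gets a fresh cache.
    static void init() { cache = new FileCache(); }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        History.init();
        History.cache.put("job_1", "/history/job_1");

        // Simulated restart inside the same JVM, as in a test case.
        History.initWithoutReset();
        System.out.println("entries after restart without reset: " + History.cache.size());

        History.init();
        System.out.println("entries after restart with reset: " + History.cache.size());
    }
}
```

In a real cluster the restarted JobTracker runs in a new JVM, so the static field starts out empty and the stale entry never appears.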

      Description

      TestJobTrackerRestart fails consistently with a "Map task completion events ordering mismatch" error.

      1. MAPREDUCE-683-v1.0.patch
        0.8 kB
        Amar Kamat
      2. MAPREDUCE-683-v1.2.1-branch-0.20.patch
        2 kB
        Amar Kamat
      3. MAPREDUCE-683-v1.2.patch
        2 kB
        Amar Kamat
      4. MAPREDUCE-683-v1.2-branch-0.20.patch
        2 kB
        Amar Kamat
      5. TEST-org.apache.hadoop.mapred.TestJobTrackerRestart.txt
        255 kB
        Sreekanth Ramakrishnan

        Issue Links

          Activity

          Sreekanth Ramakrishnan created issue -
          Sreekanth Ramakrishnan added a comment -

          Attaching test log from trunk on the local machine.

          Sreekanth Ramakrishnan made changes -
          Field Original Value New Value
          Attachment TEST-org.apache.hadoop.mapred.TestJobTrackerRestart.txt [ 12412152 ]
          Amar Kamat made changes -
          Assignee Amar Kamat [ amar_kamat ]
          Amar Kamat added a comment -

          Attaching a patch that fixes the test case issue. The bug was caused by stale cache entries in the FileManager.

          Amar Kamat made changes -
          Attachment MAPREDUCE-683-v1.0.patch [ 12412196 ]
          Amar Kamat made changes -
          Link This issue blocks MAPREDUCE-237 [ MAPREDUCE-237 ]
          Amar Kamat added a comment -

          This issue occurs only in test cases that involve a restart. It will not happen in production/real clusters.

          Hemanth Yamijala added a comment -

          I looked at the patch. I think it is more natural to simply reset the JobHistoryFileManager to a new instance in JobHistory.init(), instead of defining a new API now. But this apparently leads to failures in unrelated test cases - because JobHistory is enabled by default. My suggestion would be to disable it for test cases and enable only in init.

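
The suggestion in the comment above (keep history disabled until init is called, and create a new manager instance in init) could look roughly like the sketch below. It is an illustration under stated assumptions, not the committed patch: HistoryLog, its disabled flag, logEvent() and eventCount() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JobHistory; names and fields are assumptions.
class HistoryLog {
    // Disabled by default, so test cases that never call init() are unaffected.
    private static boolean disabled = true;

    // Stand-in for the file manager state that must not leak across restarts.
    private static List<String> events = new ArrayList<>();

    // Called on every (re)start: replace the state with a new instance, then enable.
    static void init() {
        events = new ArrayList<>();
        disabled = false;
    }

    // No-op until init() has enabled history.
    static void logEvent(String event) {
        if (disabled) {
            return;
        }
        events.add(event);
    }

    static int eventCount() { return events.size(); }
}

class DisabledByDefaultDemo {
    public static void main(String[] args) {
        HistoryLog.logEvent("ignored: history is not enabled yet");
        System.out.println("before init: " + HistoryLog.eventCount());

        HistoryLog.init();
        HistoryLog.logEvent("map task completed");
        System.out.println("after init: " + HistoryLog.eventCount());

        HistoryLog.init(); // simulated restart in the same JVM
        System.out.println("after restart: " + HistoryLog.eventCount());
    }
}
```

Resetting the state inside init() avoids adding a separate reset API, which is the trade-off the comment argues for.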
          Amar Kamat added a comment -

          Attaching a new patch incorporating Hemanth's comments. Result of test-patch
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          Only TestLostTracker and TestMiniMRMapRedDebugScript failed on mapred. Both are known issues. TestStreamingExitStatus, TestStreamingStderr and TestQueueCapacities failed on contrib. These are also known issues.

          Amar Kamat made changes -
          Attachment MAPREDUCE-683-v1.2.patch [ 12412256 ]
          Nigel Daley made changes -
          Tags ygridqa
          Devaraj Das added a comment -

          I just committed this. Thanks, Amar!

          Devaraj Das made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Devaraj Das made changes -
          Fix Version/s 0.21.0 [ 12314045 ]
          Affects Version/s 0.21.0 [ 12314045 ]
          Amar Kamat added a comment -

          Attaching an example patch for branch 0.20. It is not to be committed.

          Amar Kamat made changes -
          Attachment MAPREDUCE-683-v1.2-branch-0.20.patch [ 12412742 ]
          Amar Kamat added a comment -

          Sorry, I attached the wrong patch. MAPREDUCE-683-v1.2.1-branch-0.20.patch is the correct version for branch 0.20. This is an example patch, not to be committed.

          Amar Kamat made changes -
          Attachment MAPREDUCE-683-v1.2.1-branch-0.20.patch [ 12412743 ]
          Amar Kamat made changes -
          Release Note TestJobTrackerRestart failed because of a stale file manager cache, which was created once per JVM. This patch makes sure that the file manager is initialized on every JobHistory.init() call, and hence on every restart. Note that the failure does not occur in production: on a restart, the new JobTracker starts in a new JVM, so a new cache is created.
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #15 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/15/).
          Fixes an initialization problem in the JobHistory. The initialization of JobHistoryFilesManager is now done in the JobHistory.init call. Contributed by Amar Kamat.

          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Link This issue blocks MAPREDUCE-237 [ MAPREDUCE-237 ]
          Gavin made changes -
          Link This issue is depended upon by MAPREDUCE-237 [ MAPREDUCE-237 ]

            People

            • Assignee:
              Amar Kamat
              Reporter:
              Sreekanth Ramakrishnan
            • Votes:
              0
              Watchers:
              1
