Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: jobtracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      JobTracker was changed to take an identifier as an argument. This helps in test cases where the jobtracker/mapred-cluster is (re)started within a short span of time and the chance of jobtracker identifiers clashing is high. RecoveryManager was also modified to throw an exception if a job fails in init during the recovery process, because this event triggers a job failure in the recovery process and removes the failed job from further initialization and processing.
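The identifier clash described in the release note can be illustrated with a minimal sketch. It assumes the default identifier is derived from the start time at minute granularity (the comments in this issue mention instances "coming up in same minute"); the class and method names here are hypothetical and are not the actual JobTracker API.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical illustration only; this is not the actual JobTracker code.
class TrackerIdentifierSketch {

    // Assumption: the default identifier is the start time truncated to the minute.
    static String timestampIdentifier(Date startTime) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmm");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone keeps the demo deterministic
        return format.format(startTime);
    }

    // Hypothetical test helper: the caller supplies a distinct identifier per restart.
    static String explicitIdentifier(String prefix, int restartCount) {
        return prefix + "-" + restartCount;
    }

    public static void main(String[] args) {
        Date firstStart = new Date(0L);
        Date restart = new Date(5_000L); // five seconds later, within the same minute

        // Two starts inside one minute produce the same timestamp-based identifier.
        System.out.println("timestamp ids clash: "
            + timestampIdentifier(firstStart).equals(timestampIdentifier(restart)));

        // Caller-supplied identifiers stay distinct no matter how fast the restart is.
        System.out.println("explicit ids clash: "
            + explicitIdentifier("test-jt", 0).equals(explicitIdentifier("test-jt", 1)));
    }
}
```

A test that restarts the jobtracker several times in quick succession could pass a counter-based identifier like this instead of relying on the clock.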
    1. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      564 kB
      Amareshwari Sriramadasu
    2. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      543 kB
      Amareshwari Sriramadasu
    3. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      618 kB
      Amareshwari Sriramadasu
    4. MAPREDUCE-745-v1.8-branch-0.20.patch
      5 kB
      Amar Kamat
    5. MAPREDUCE-745-v1.8.patch
      8 kB
      Amar Kamat
    6. MAPREDUCE-745-v1.8.1-branch-0.20.patch
      5 kB
      Amar Kamat
    7. MAPREDUCE-745-v1.7.patch
      9 kB
      Amar Kamat
    8. MAPREDUCE-745-v1.3.patch
      3 kB
      Amar Kamat
    9. MAPREDUCE-745-v1.2.patch
      3 kB
      Amar Kamat
    10. MAPREDUCE-745-v1.0.patch
      2 kB
      Amar Kamat
    11. mapred-745-yahoo-internal.patch
      1 kB
      Amar Kamat

      Activity

      Amareshwari Sriramadasu added a comment -

      Attaching test failure log.

      Amar Kamat added a comment -

      Attaching an example patch. The test fails because MAPREDUCE-463 changed the job initialization code in RecoveryManager but did not throw an exception upon init failure.
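The behaviour described above, where an init failure must surface as an exception so that recovery can drop the job, can be sketched as follows. All names are hypothetical and do not mirror the real RecoveryManager code; the sketch only shows why propagating the exception matters.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the types and methods are illustrative, not Hadoop's API.
class RecoveryInitSketch {

    interface Job {
        String id();
        void init() throws IOException;
    }

    // Propagates the init failure to the caller instead of swallowing it.
    static void initJob(Job job) throws IOException {
        job.init();
    }

    // Recovers jobs one by one; a job whose init throws is recorded as failed
    // and excluded from further processing.
    static List<String> recover(List<Job> jobs, List<String> failed) {
        List<String> recovered = new ArrayList<>();
        for (Job job : jobs) {
            try {
                initJob(job);
                recovered.add(job.id());
            } catch (IOException e) {
                failed.add(job.id());
            }
        }
        return recovered;
    }

    // Builds a stand-in job that optionally fails during init.
    static Job job(final String id, final boolean failInit) {
        return new Job() {
            public String id() { return id; }
            public void init() throws IOException {
                if (failInit) {
                    throw new IOException("init failed for " + id);
                }
            }
        };
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        List<String> recovered = recover(Arrays.asList(job("job_1", false), job("job_2", true)), failed);
        System.out.println("recovered=" + recovered + " failed=" + failed);
    }
}
```

If `initJob` caught and ignored the exception, the failed job would be treated as recovered, which matches the symptom this comment attributes to the earlier change.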

      Amareshwari Sriramadasu added a comment -

      Saw a different test failure in TestRecoveryManager.

      Amar Kamat added a comment -

      It looks like the issue involves one jobtracker going down while the other comes up. The jobtracker that is shutting down tries to stop the eager-task-initializer from initializing a job, which in turn disables history. The new jobtracker sees the disabled history and disables recovery, and the test case fails. The only thing that doesn't make sense is that minimr.stopJobTracker() calls jobtracker.close(), which makes sure that all the threads are closed/stopped/terminated and does a join. How can two jobtracker instances be active at the same time?

      Amar Kamat added a comment -

      Looked at the logs carefully. We are hitting HDFS-53. RecoveryManager and EagerTaskInitializer compete with each other to init jobs, and while one is busy moving files to the done folder, the other encounters an exception.
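The competition between two initializers described above can be pictured with a small sketch in which only one caller is allowed to run a job's init. This is an illustration of the race only, under the assumption that a single atomic flag guards init; it is not the fix adopted in this issue, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of guarding job init against two competing initializers.
class InitOnceSketch {
    private final AtomicBoolean initStarted = new AtomicBoolean(false);
    private final AtomicInteger initCount = new AtomicInteger();

    // Returns true only for the caller that wins the right to initialize.
    boolean tryInit() {
        if (!initStarted.compareAndSet(false, true)) {
            return false; // another initializer already claimed this job
        }
        initCount.incrementAndGet(); // stands in for the real init work
        return true;
    }

    int initCount() {
        return initCount.get();
    }

    public static void main(String[] args) throws InterruptedException {
        InitOnceSketch job = new InitOnceSketch();
        // Two threads stand in for the recovery path and the eager initializer.
        Thread recovery = new Thread(job::tryInit, "recovery-manager");
        Thread eager = new Thread(job::tryInit, "eager-task-initializer");
        recovery.start();
        eager.start();
        recovery.join();
        eager.join();
        System.out.println("init ran " + job.initCount() + " time(s)");
    }
}
```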

      Amar Kamat added a comment -

      Attaching a patch that fixes the common issue of jobtracker instances coming up in the same minute. Result of test-patch:
      [exec] -1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 3 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      The findbugs warning is due to refactoring.

      Amar Kamat added a comment -

      Attaching a patch that makes the jobtracker identifier unique for test cases. Result of test-patch:
      [exec] -1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 3 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      The findbugs warning is due to refactoring.

      Amareshwari Sriramadasu added a comment -

      Observed a test timeout in one of the Hudson builds: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/460/testReport/org.apache.hadoop.mapred/TestRecoveryManager/testJobTracker

      Amareshwari Sriramadasu added a comment -

      Saw a different assertion failure in TestRecoveryManager.

      Amar Kamat added a comment -

      Attaching a patch that provides a way to pass an identifier to the jobtracker. Result of test-patch:
      [exec] +1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 9 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      Running ant tests.

      Amar Kamat added a comment -

      ant tests (mapred+contrib) passed.

      Amar Kamat added a comment -

      Attaching a patch for trunk with a minor change. Changes to JobTracker.getDateFormat() are reverted.

      Amar Kamat added a comment -

      Attaching patches for trunk and branch-0.20

      Amar Kamat added a comment -

      Patch for branch 0.20 with removed imports.

      Devaraj Das added a comment -

      Yesterday there was a problem with the Apache machine that I use for commits, so I wasn't sure whether the patch had been committed. It actually did get committed.
      Thanks, Amar!

      Devaraj Das added a comment -

      I committed this. Thanks, Amar!

      Amar Kamat added a comment -

      Attaching a patch for resolving conflicts when the yahoo-hadoop-distribution is rolled forward to MAPREDUCE-745 (aka SVN r806173).


        People

        • Assignee:
          Amar Kamat
          Reporter:
          Amareshwari Sriramadasu
        • Votes:
          0
          Watchers:
          1
