Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: jobtracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      JobTracker was changed to take an identifier as an argument. This helps in test cases where the jobtracker/mapred-cluster is (re)started within a short span of time and the chance of jobtracker identifiers clashing is high. The RecoveryManager was also modified to throw an exception if a job fails in init during the recovery process. This exception triggers a job failure in the recovery process and removes the failed job from further initialization and processing.
    1. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      564 kB
      Amareshwari Sriramadasu
    2. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      543 kB
      Amareshwari Sriramadasu
    3. TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt
      618 kB
      Amareshwari Sriramadasu
    4. MAPREDUCE-745-v1.8-branch-0.20.patch
      5 kB
      Amar Kamat
    5. MAPREDUCE-745-v1.8.patch
      8 kB
      Amar Kamat
    6. MAPREDUCE-745-v1.8.1-branch-0.20.patch
      5 kB
      Amar Kamat
    7. MAPREDUCE-745-v1.7.patch
      9 kB
      Amar Kamat
    8. MAPREDUCE-745-v1.3.patch
      3 kB
      Amar Kamat
    9. MAPREDUCE-745-v1.2.patch
      3 kB
      Amar Kamat
    10. MAPREDUCE-745-v1.0.patch
      2 kB
      Amar Kamat
    11. mapred-745-yahoo-internal.patch
      1 kB
      Amar Kamat

      Activity

      Amareshwari Sriramadasu created issue -
      Amareshwari Sriramadasu added a comment -

      attaching test failure log.

      Amareshwari Sriramadasu made changes -
      Field Original Value New Value
      Attachment TEST-org.apache.hadoop.mapred.TestRecoveryManager.txt [ 12413089 ]
      Amar Kamat added a comment -

      Attaching an example patch. The reason for the failure is that MAPREDUCE-463 changed the job initialization code in RecoveryManager but fails to throw an exception upon init failure.
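
Illustration of the point above (not taken from the attached patch): a minimal Java sketch, assuming a recovery loop that initializes jobs one by one. The class and method names (`RecoverySketch`, `Job`, `recover`) are hypothetical and are not Hadoop APIs. The idea shown is only that `init()` must propagate its failure so the recovery loop can drop the failed job instead of continuing to process it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoverySketch {

    /** Hypothetical stand-in for a job whose initialization may fail. */
    static class Job {
        final String id;
        final boolean initFails;

        Job(String id, boolean initFails) {
            this.id = id;
            this.initFails = initFails;
        }

        void init() throws Exception {
            if (initFails) {
                // Propagate the failure to the caller instead of swallowing it.
                throw new Exception("init failed for " + id);
            }
        }
    }

    /** Returns the ids of the jobs that initialized and remain in recovery. */
    static List<String> recover(List<Job> jobs) {
        List<String> recovered = new ArrayList<>();
        for (Job job : jobs) {
            try {
                job.init();
                recovered.add(job.id);
            } catch (Exception e) {
                // The failed job is excluded from further initialization and processing.
                System.out.println("Skipping " + job.id + ": " + e.getMessage());
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("job_1", false),
            new Job("job_2", true),
            new Job("job_3", false));
        System.out.println(recover(jobs));
    }
}
```

If `init()` swallowed the failure, the loop could not tell a broken job from a healthy one and would keep it in the recovered set.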

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.0.patch [ 12413110 ]
      Amareshwari Sriramadasu added a comment -

      Saw a different test failure in TestRecoveryManager

      Amareshwari Sriramadasu made changes -
      Amar Kamat added a comment -

      Looks like the issue is that one jobtracker is going down while the other comes up. The jobtracker that is shutting down tries to stop the eager-task-initializer from initializing a job, which in turn disables history. The new jobtracker sees the disabled history and disables recovery, and the testcase fails. The only thing that doesn't make sense is that minimr.stopJobTracker() calls jobtracker.close(), which makes sure that all the threads are closed/stopped/terminated and does a join. How can two jobtracker instances be active at the same time?

      Amar Kamat added a comment -

      Looked at the logs carefully. We are hitting HDFS-53. Both RecoveryManager and EagerTaskInitializer compete with each other to init jobs, and while one is busy moving files to the done folder, the other encounters an exception.

      Amar Kamat added a comment -

      Attaching a patch that fixes the common issue of jobtracker instances coming up in the same minute. Result of test-patch:
      [exec] -1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 3 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      The Findbugs warning is due to refactoring.
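
Illustration of the same-minute clash (not taken from the attached patch): a minimal Java sketch, assuming the default jobtracker identifier is a start-time timestamp with minute granularity. The `yyyyMMddHHmm` pattern and the names `TrackerIdSketch`, `defaultIdentifier`, and `testIdentifier` are assumptions for illustration, not Hadoop APIs. Two trackers started within the same minute get the same default identifier, while an identifier passed in explicitly by the test stays distinct.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TrackerIdSketch {

    /** Assumed default: an identifier derived from the start time, to the minute. */
    static String defaultIdentifier(Date startTime) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone keeps the sketch deterministic
        return fmt.format(startTime);
    }

    /** Hypothetical test helper: an explicit identifier that changes on every restart. */
    static String testIdentifier(String testName, int restartCount) {
        return testName + "-" + restartCount;
    }

    public static void main(String[] args) {
        Date first = new Date(0L);        // 1970-01-01 00:00:00 UTC
        Date second = new Date(20_000L);  // 20 seconds later, still the same minute

        // Same minute, so the timestamp-based identifiers clash.
        System.out.println(defaultIdentifier(first).equals(defaultIdentifier(second)));

        // Explicit identifiers differ across restarts.
        System.out.println(testIdentifier("recovery", 1).equals(testIdentifier("recovery", 2)));
    }
}
```

A test that restarts the jobtracker several times in quick succession would pass a fresh identifier on each restart rather than rely on the clock.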

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.2.patch [ 12414894 ]
      Amar Kamat added a comment -

      Attaching a patch that makes the jobtracker identifier unique for testcases. Result of test-patch:
      [exec] -1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 3 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      The findbugs warning is due to refactoring.

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.3.patch [ 12415311 ]
      Amareshwari Sriramadasu added a comment -

      Observed a test timeout in one of the Hudson builds: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/460/testReport/org.apache.hadoop.mapred/TestRecoveryManager/testJobTracker

      Amareshwari Sriramadasu added a comment -

      Saw a different assertion failure for TestRecoveryManager.

      Amareshwari Sriramadasu made changes -
      Amar Kamat added a comment -

      Attaching a patch that provides a way to pass an identifier to the jobtracker. Result of test-patch:
      [exec] +1 overall.
      [exec]
      [exec] +1 @author. The patch does not contain any @author tags.
      [exec]
      [exec] +1 tests included. The patch appears to include 9 new or modified tests.
      [exec]
      [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
      [exec]
      [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
      [exec]
      [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
      [exec]
      [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

      Running ant tests.

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.7.patch [ 12416543 ]
      Amar Kamat added a comment -

      ant tests (mapred+contrib) passed.

      Amar Kamat added a comment -

      Attaching a patch for trunk with a minor change. Changes to JobTracker.getDateFormat() are reverted.

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.8.patch [ 12417124 ]
      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.8.patch [ 12417124 ]
      Amar Kamat added a comment -

      Attaching patches for trunk and branch-0.20

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.8.patch [ 12417125 ]
      Attachment MAPREDUCE-745-v1.8-branch-0.20.patch [ 12417126 ]
      Amar Kamat added a comment -

      Patch for branch 0.20 with removed imports.

      Amar Kamat made changes -
      Attachment MAPREDUCE-745-v1.8.1-branch-0.20.patch [ 12417136 ]
      Devaraj Das added a comment -

      Yesterday, there was a problem with the Apache machine that I use for commits, and I wasn't sure whether this got committed. It actually did get committed.
      Thanks, Amar!

      Devaraj Das added a comment -

      I committed this. Thanks, Amar!

      Devaraj Das made changes -
      Status Open [ 1 ] Resolved [ 5 ]
      Hadoop Flags [Reviewed]
      Assignee Amar Kamat [ amar_kamat ]
      Fix Version/s 0.20.1 [ 12314047 ]
      Fix Version/s 0.21.0 [ 12314045 ]
      Resolution Fixed [ 1 ]
      Amar Kamat added a comment -

      Attaching a patch for resolving conflicts when the yahoo-hadoop-distribution is rolled forward to MAPREDUCE-745 (aka SVN r806173).

      Amar Kamat made changes -
      Attachment mapred-745-yahoo-internal.patch [ 12417231 ]
      Amar Kamat made changes -
      Release Note JobTracker was changed to take an identifier as an argument. This helps in test cases where the jobtracker/mapred-cluster is (re)started within a short span of time and the chance of jobtracker identifiers clashing is high. The RecoveryManager was also modified to throw an exception if a job fails in init during the recovery process. This exception triggers a job failure in the recovery process and removes the failed job from further initialization and processing.
      Transition: Open → Resolved
      Time In Source Status: 41d 21h 18m
      Execution Times: 1
      Last Executer: Devaraj Das
      Last Execution Date: 21/Aug/09 06:00

        People

        • Assignee:
          Amar Kamat
          Reporter:
          Amareshwari Sriramadasu
        • Votes:
          0
          Watchers:
          1

          Dates

          • Created:
            Updated:
            Resolved: