Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The following is the scenario:
      -> TT got a LaunchTaskAction. And It found the slot free. It started the TaskRunner
      -> TT received KillJobActtion. It did a runner.kill().
      But, runner.kill() didn't remove JVM Runner from jvmIdToRunner , as it is not polulated yet.
      It released the slots.
      -> TaskRunner thread spawned jvm and populated jvmIdToRunner.
      -> The next task getting launched on the slot finds it as busy.

      1. 4967.patch
        2 kB
        Devaraj Das

        Activity

        Hide
        Amareshwari Sriramadasu added a comment -

        The logs for the above scenario:

        2008-12-31 06:12:11,255 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-700127208
        2008-12-31 06:12:11,266 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0001_m_001155_0
        2008-12-31 06:12:11,286 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-700127208 spawned.
        2008-12-31 06:12:11,305 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0001_m_001155_0
        2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200812310557_0001
        2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000317_0 done; removing files.
        2008-12-31 06:12:20,681 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000317_0 not found in cache
        2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 1
        2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001040_0 done; removing files.
        2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001040_0 not found in cache
        2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
        2008-12-31 06:12:20,686 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001155_0 done; removing files.
        2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001155_0 not found in cache
        2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000136_0 done; removing files.
        2008-12-31 06:12:20,691 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000136_0 not found in cache
        2008-12-31 06:12:20,692 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001040_0. Ignored.
        2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-1030395660
        2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-1030395660 spawned.
        2008-12-31 06:12:20,694 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001155_0. Ignored.
        2008-12-31 06:12:28,532 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000525_0 task's state:UNASSIGNED
        2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000562_0 task's state:UNASSIGNED
        2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000525_0
        2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_200812310557_0002_m_0
        00525_0
        2008-12-31 06:12:29,576 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-700127208 exited. Number of tasks it ran: 0
        2008-12-31 06:12:29,790 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_540025796 exited. Number of tasks it ran: 0
        2008-12-31 06:12:48,271 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000562_0
        2008-12-31 06:12:48,272 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0002_m_000562_0
        2008-12-31 06:12:48,290 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0002_m_-612394729
        2008-12-31 06:12:48,292 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0002_m_-612394729 spawned.
        2008-12-31 06:12:52,223 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0002_m_000562_0 Child Error
        java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an unstable state while reaping a JVM for task: attempt_200812310557_0002_m_000562_0 Number of active JVMs:2
        JVMId jvm_200812310557_0001_m_-1030395660 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0001_m_001155_0
        JVMId jvm_200812310557_0002_m_-612394729 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0002_m_000525_0
        at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:253)
        at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:114)
        at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:77)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:411)
        2008-12-31 06:12:53,518 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-1030395660 exited. Number of tasks it ran: 0

        Show
        Amareshwari Sriramadasu added a comment - The logs for the above scenario: 2008-12-31 06:12:11,255 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-700127208 2008-12-31 06:12:11,266 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0001_m_001155_0 2008-12-31 06:12:11,286 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-700127208 spawned. 2008-12-31 06:12:11,305 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0001_m_001155_0 2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200812310557_0001 2008-12-31 06:12:20,678 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000317_0 done; removing files. 2008-12-31 06:12:20,681 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000317_0 not found in cache 2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 1 2008-12-31 06:12:20,682 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001040_0 done; removing files. 2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001040_0 not found in cache 2008-12-31 06:12:20,685 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2 2008-12-31 06:12:20,686 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_001155_0 done; removing files. 2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_001155_0 not found in cache 2008-12-31 06:12:20,688 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0001_m_000136_0 done; removing files. 2008-12-31 06:12:20,691 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200812310557_0001_m_000136_0 not found in cache 2008-12-31 06:12:20,692 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001040_0. Ignored. 2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0001_m_-1030395660 2008-12-31 06:12:20,693 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0001_m_-1030395660 spawned. 2008-12-31 06:12:20,694 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child task finshed: attempt_200812310557_0001_m_001155_0. Ignored. 2008-12-31 06:12:28,532 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000525_0 task's state:UNASSIGNED 2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200812310557_0002_m_000562_0 task's state:UNASSIGNED 2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000525_0 2008-12-31 06:12:28,533 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_200812310557_0002_m_0 00525_0 2008-12-31 06:12:29,576 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-700127208 exited. Number of tasks it ran: 0 2008-12-31 06:12:29,790 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_540025796 exited. Number of tasks it ran: 0 2008-12-31 06:12:48,271 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200812310557_0002_m_000562_0 2008-12-31 06:12:48,272 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200812310557_0002_m_000562_0 2008-12-31 06:12:48,290 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_200812310557_0002_m_-612394729 2008-12-31 06:12:48,292 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200812310557_0002_m_-612394729 spawned. 2008-12-31 06:12:52,223 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200812310557_0002_m_000562_0 Child Error java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an unstable state while reaping a JVM for task: attempt_200812310557_0002_m_000562_0 Number of active JVMs:2 JVMId jvm_200812310557_0001_m_-1030395660 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0001_m_001155_0 JVMId jvm_200812310557_0002_m_-612394729 #Tasks ran: 0 Currently busy? true Currently running: attempt_200812310557_0002_m_000525_0 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:253) at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:114) at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:77) at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:411) 2008-12-31 06:12:53,518 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200812310557_0001_m_-1030395660 exited. Number of tasks it ran: 0
        Hide
        Devaraj Das added a comment -

        Will upload a patch soon

        Show
        Devaraj Das added a comment - Will upload a patch soon
        Hide
        Devaraj Das added a comment -

        Straightforward patch..

        Show
        Devaraj Das added a comment - Straightforward patch..
        Hide
        Amareshwari Sriramadasu added a comment -

        Patch looks fine.

        Show
        Amareshwari Sriramadasu added a comment - Patch looks fine.
        Hide
        Devaraj Das added a comment -

        I just committed this.

        Show
        Devaraj Das added a comment - I just committed this.

          People

          • Assignee:
            Devaraj Das
            Reporter:
            Amareshwari Sriramadasu
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development