Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      The job initialization process was changed so that it no longer changes the job's run state. There are two reasons:
      - Changing the state during initialization can lead to deadlock, because state changes acquire locks in a circular order (i.e., JobInProgress requires the JobTracker lock).
      - Events were not raised, because these state changes were not propagated back to the JobTracker.

      The JobTracker now takes care of initializing, failing, and killing the job and of raising the appropriate events. The rule now enforced is: "The JobTracker lock *must* be acquired before changing the run-state of a job".
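The rule in the release note can be illustrated with a minimal sketch. The classes `Tracker` and `Job` and the method `initTasks` are hypothetical stand-ins, not the actual JobTracker or JobInProgress code from the patch. The sketch assumes that initialization only reports success or failure, and that every run-state change goes through a tracker method that takes the tracker monitor first and the job monitor second.

```java
import java.util.ArrayList;
import java.util.List;

class LockOrderSketch {
    enum RunState { PREP, RUNNING, FAILED, KILLED }

    /** Hypothetical stand-in for JobInProgress. */
    static class Job {
        private RunState state = RunState.PREP;

        synchronized RunState getState() { return state; }

        // By convention in this sketch, only the tracker calls this method.
        synchronized void setState(RunState next) { state = next; }
    }

    /** Hypothetical stand-in for JobTracker. */
    static class Tracker {
        final List<String> events = new ArrayList<>();

        // Lock order: tracker monitor first (this method), then the job monitor.
        synchronized void failJob(Job job) {
            RunState previous = job.getState();
            job.setState(RunState.FAILED);
            // The tracker made the change, so it can always record the event.
            events.add(previous + "->" + RunState.FAILED);
        }
    }

    /** Initialization reports the outcome; it never changes the run state itself. */
    static boolean initTasks(boolean simulateFailure) {
        return !simulateFailure;
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        Job job = new Job();
        if (!initTasks(true)) {
            tracker.failJob(job); // the state change is driven by the tracker
        }
        System.out.println(job.getState() + " " + tracker.events);
    }
}
```

Because no code path in this sketch holds the job monitor while waiting for the tracker monitor, the circular wait described above cannot form, and the event is recorded in the same place the state changes.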

      Description

      We are running a Hadoop cluster (version 0.20.0) and have detected the following deadlock on our JobTracker:

      "IPC Server handler 51 on 9001":
      	at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:943)
      	- waiting to lock <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
      	at org.apache.hadoop.mapred.JobTracker.getJobCounters(JobTracker.java:3102)
      	- locked <0x00007f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
      	at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
      "pool-1-thread-2":
      	at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2017)
      	- waiting to lock <0x00007f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
      	at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2483)
      	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
      	at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2152)
      	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
      	at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:2169)
      	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
      	at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2245)
      	- locked <0x00007f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
      	at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:86)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:619)
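The two traces above show a lock-order inversion: the IPC handler holds the JobTracker monitor and waits for the JobInProgress monitor, while the initialization thread holds the JobInProgress monitor and waits for the JobTracker monitor. The following sketch reproduces that ordering with hypothetical names (`trackerLock`, `jobLock`, `worker`). It deliberately swaps the `synchronized` monitors of the real code for `ReentrantLock.tryLock()`, so each thread observes the failed acquisition and the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockInversionDemo {
    // Hypothetical stand-ins for the JobTracker and JobInProgress monitors.
    static final ReentrantLock trackerLock = new ReentrantLock();
    static final ReentrantLock jobLock = new ReentrantLock();

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Takes `first`, waits until the other thread also holds its first lock,
    // then tries `second` without blocking and records the result.
    static Thread worker(ReentrantLock first, ReentrantLock second,
                         boolean[] gotSecond, int slot,
                         CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        return new Thread(() -> {
            first.lock();
            try {
                bothHoldFirst.countDown();
                await(bothHoldFirst);
                gotSecond[slot] = second.tryLock();
                if (gotSecond[slot]) {
                    second.unlock();
                }
                // Keep holding `first` until both threads have tried.
                bothTried.countDown();
                await(bothTried);
            } finally {
                first.unlock();
            }
        });
    }

    /** Returns {rpcGotJobLock, initGotTrackerLock}. */
    static boolean[] run() throws InterruptedException {
        boolean[] gotSecond = new boolean[2];
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        // Like "IPC Server handler 51": tracker lock, then job lock.
        Thread rpc = worker(trackerLock, jobLock, gotSecond, 0, bothHoldFirst, bothTried);
        // Like "pool-1-thread-2": job lock, then tracker lock.
        Thread init = worker(jobLock, trackerLock, gotSecond, 1, bothHoldFirst, bothTried);
        rpc.start();
        init.start();
        rpc.join();
        init.join();
        return gotSecond;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = run();
        System.out.println("RPC handler got job lock: " + result[0]);
        System.out.println("init thread got tracker lock: " + result[1]);
    }
}
```

Neither thread obtains its second lock in this sketch. With blocking `synchronized` acquisition, as in the traces, the same interleaving leaves both threads waiting indefinitely.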
      

        Attachments

        1. MAPREDUCE-805-v1.7.patch
          23 kB
          Amar Kamat
        2. MAPREDUCE-805-v1.6.patch
          23 kB
          Amar Kamat
        3. MAPREDUCE-805-v1.3.patch
          10 kB
          Amar Kamat
        4. MAPREDUCE-805-v1.2.patch
          10 kB
          Amar Kamat
        5. MAPREDUCE-805-v1.12-branch-0.20.patch
          22 kB
          Amar Kamat
        6. MAPREDUCE-805-v1.12.patch
          19 kB
          Amar Kamat
        7. MAPREDUCE-805-v1.11-branch-0.20.patch
          22 kB
          Amar Kamat
        8. MAPREDUCE-805-v1.11.patch
          19 kB
          Amar Kamat
        9. MAPREDUCE-805-v1.1.patch
          8 kB
          Amar Kamat

              People

              • Assignee: Amar Kamat (amar_kamat)
              • Reporter: Michael Tamm (michael.tamm)
              • Votes: 0
              • Watchers: 5