Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Hadoop 0.19.1 in a test environment with 2 Task Trackers

      Description

      Found one Java-level deadlock:
      =============================
      "SocketListener0-26":
      waiting to lock monitor 0x08ed5ce4 (object 0x567924c0, a org.apache.hadoop.mapred.JobTracker),
      which is held by "IPC Server handler 1 on 9001"
      "IPC Server handler 1 on 9001":
      waiting to lock monitor 0x08f7da88 (object 0x5744f5b8, a org.apache.hadoop.mapred.JobInProgress),
      which is held by "initJobs"
      "initJobs":
      waiting to lock monitor 0x08ed5ce4 (object 0x567924c0, a org.apache.hadoop.mapred.JobTracker),
      which is held by "IPC Server handler 1 on 9001"

      Java stack information for the threads listed above:
      ===================================================
      "SocketListener0-26":
        at org.apache.hadoop.mapred.JobTracker.getClusterStatus(JobTracker.java:2313)
        - waiting to lock <0x567924c0> (a org.apache.hadoop.mapred.JobTracker)
        at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:104)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
      "IPC Server handler 1 on 9001":
        at org.apache.hadoop.mapred.JobInProgress.obtainTaskCleanupTask(JobInProgress.java:935)
        - waiting to lock <0x5744f5b8> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobTracker.getSetupAndCleanupTasks(JobTracker.java:2167)
        - locked <0x56795708> (a java.util.TreeMap)
        - locked <0x567924c0> (a org.apache.hadoop.mapred.JobTracker)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1902)
        - locked <0x567924c0> (a org.apache.hadoop.mapred.JobTracker)
        at sun.reflect.GeneratedMethodAccessor3278.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
      "initJobs":
        at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:1539)
        - waiting to lock <0x567924c0> (a org.apache.hadoop.mapred.JobTracker)
        at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2320)
        - locked <0x5744f5b8> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2004)
        - locked <0x5744f5b8> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:472)
        - locked <0x575b7ec8> (a org.apache.hadoop.mapred.JobInProgress$JobInitKillStatus)
        - locked <0x5744f5b8> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.EagerTaskInitializationListener$JobInitThread.run(EagerTaskInitializationListener.java:55)

      Found 1 deadlock.
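
The cycle in the dump comes down to two monitors taken in opposite orders: the heartbeat handler holds the JobTracker lock and waits for the JobInProgress lock, while initJobs holds the JobInProgress lock and waits for the JobTracker lock. The following is a minimal, self-contained sketch of that pattern, not Hadoop code. The class LockOrderSketch, the method reproduceDeadlock, the thread names, and the two plain Object locks are hypothetical stand-ins chosen for illustration. The sketch uses the standard ThreadMXBean.findMonitorDeadlockedThreads() call to confirm that the JVM sees the cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderSketch {

    // Starts a daemon thread that locks `first`, waits until both threads
    // hold their first lock, then blocks trying to lock `second`.
    private static Thread lockInOrder(String name, Object first, Object second,
                                      CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Not reached when the two threads use opposite orders.
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though the threads stay stuck
        t.start();
        return t;
    }

    private static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) {
                return true;
            }
        }
        return false;
    }

    // Reproduces the inversion with two hypothetical locks and returns true
    // if the JVM reports both threads as deadlocked within about five seconds.
    public static boolean reproduceDeadlock() {
        Object trackerLock = new Object(); // stand-in for the JobTracker monitor
        Object jobLock = new Object();     // stand-in for the JobInProgress monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread a = lockInOrder("heartbeat-like", trackerLock, jobLock, bothHoldFirst);
        Thread b = lockInOrder("initJobs-like", jobLock, trackerLock, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 50; i++) {
                long[] ids = mx.findMonitorDeadlockedThreads(); // null if no cycle
                if (ids != null && contains(ids, a.getId()) && contains(ids, b.getId())) {
                    return true;
                }
                Thread.sleep(100);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduceDeadlock());
    }
}
```

One common way to avoid this kind of inversion is to have every code path acquire the two monitors in the same order. Whether the fixes referenced in the comments take that approach is not stated in this report.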

        Activity

        vinodkv Vinod Kumar Vavilapalli added a comment -

        I think this is fixed as part of HADOOP-5285 in 0.19.2 and above. Can you please confirm?

        amareshwari Amareshwari Sriramadasu added a comment -

        This will be fixed in trunk by MAPREDUCE-805.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        As Amareshwari notes, this is fixed on trunk. Please reopen if the deadlock still occurs on 0.19.2 or later and you want the fix backported.


          People

          • Assignee: Unassigned
          • Reporter: davelatham Dave Latham
          • Votes: 0
          • Watchers: 7
