  1. Hadoop Map/Reduce
  2. MAPREDUCE-5198

Race condition in cleanup during task tracker reinit with LinuxTaskController

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: tasktracker
    • Labels:
      None

      Description

      This was noticed when the job tracker was restarted while jobs were running and then asked the task tracker to reinitialize.

      The task tracker would then fail with an error like:

      2013-04-27 20:19:09,627 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /grid/0/hdp/mapred/local,/grid/1/hdp/mapred/local,/grid/2/hdp/mapred/local,/grid/3/hdp/mapred/local,/grid/4/hdp/mapred/local,/grid/5/hdp/mapred/local
      2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075 caught: java.nio.channels.ClosedChannelException
      	at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
      	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
      	at org.apache.hadoop.ipc.Server.channelWrite(Server.java:1717)
      	at org.apache.hadoop.ipc.Server.access$2000(Server.java:98)
      	at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:744)
      	at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:808)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1433)
      
      2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075: exiting
      2013-04-27 20:19:10,414 ERROR org.apache.hadoop.mapred.TaskTracker: Got fatal exception while reinitializing TaskTracker: org.apache.hadoop.util.Shell$ExitCodeException: 
      	at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
      	at org.apache.hadoop.util.Shell.run(Shell.java:182)
      	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
      	at org.apache.hadoop.mapred.LinuxTaskController.deleteAsUser(LinuxTaskController.java:281)
      	at org.apache.hadoop.mapred.TaskTracker.deleteUserDirectories(TaskTracker.java:779)
      	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:816)
      	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2704)
      	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3934)
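
      The report does not state the root cause or the fix. As a generic illustration of why a cleanup that races with another cleaner can fail, and how a delete can be made tolerant of entries that vanish underneath it, here is a minimal sketch. It assumes two cleaners may remove the same local directory tree at once; `TolerantCleanup` and `deleteTree` are hypothetical names and are not part of Hadoop or of `LinuxTaskController`.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, NOT Hadoop code: a recursive delete that treats
// "already removed by someone else" as success instead of a fatal error.
class TolerantCleanup {

    /** Deletes the tree rooted at root; a missing root or entry is not an error. */
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // deleteIfExists does not throw if a concurrent cleaner got here first.
                Files.deleteIfExists(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc)
                    throws IOException {
                if (exc instanceof NoSuchFileException) {
                    return FileVisitResult.CONTINUE; // entry vanished mid-walk
                }
                throw exc; // any other failure is still reported
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null && !(exc instanceof NoSuchFileException)) {
                    throw exc;
                }
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

      Calling `TolerantCleanup.deleteTree(dir)` a second time on an already removed directory returns normally, whereas a strict delete would surface an error to the caller, much as the failing `deleteAsUser` call does during `initialize` in the trace above.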
      

            People

            • Assignee:
              arpitgupta Arpit Gupta
              Reporter:
              arpitgupta Arpit Gupta
            • Votes:
              0
              Watchers:
              6