  1. Hadoop Common
  2. HADOOP-3027

JobTracker shuts down during initialization if the NameNode is down

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.2
    • Component/s: None
    • Labels: None

    Description

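      While initializing, the JobTracker tries to connect to the NameNode. If the NameNode stays unreachable for more than one iteration of the connect loop, the JobTracker shuts itself down instead of continuing to retry. To reproduce, start the JobTracker before the NameNode: in the log below, the JobTracker exits about 20 seconds after startup. The cause appears to be the shutdown hook in FileSystem. The first attempt registers the hook and then fails with a ConnectException; on the next attempt, FileSystem$Cache.get registers the same hook again, and Runtime.addShutdownHook throws IllegalArgumentException ("Hook previously registered"), which the JobTracker logs as FATAL before shutting down.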

      2008-03-17 09:45:20,979 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9101
      2008-03-17 09:45:20,979 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
      2008-03-17 09:45:21,374 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
      2008-03-17 09:45:22,377 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
      2008-03-17 09:45:23,380 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
      2008-03-17 09:45:24,383 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
      2008-03-17 09:45:25,385 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
      2008-03-17 09:45:26,388 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
      2008-03-17 09:45:27,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
      2008-03-17 09:45:28,394 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
      2008-03-17 09:45:29,397 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
      2008-03-17 09:45:30,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 10 time(s).
      2008-03-17 09:45:31,406 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: /tmp/hadoop/mapred/system
      java.net.ConnectException: Connection refused
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
      at sun.nio.ch.SocketAdaptor.connect(Unknown Source)
      at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:174)
      at org.apache.hadoop.ipc.Client.getConnection(Client.java:623)
      at org.apache.hadoop.ipc.Client.call(Client.java:546)
      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:211)
      at org.apache.hadoop.dfs.$Proxy4.getProtocolVersion(Unknown Source)
      at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:312)
      at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:94)
      at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:158)
      at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:69)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1255)
      at org.apache.hadoop.fs.FileSystem.access$400(FileSystem.java:53)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1272)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:191)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:96)
      at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:702)
      at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:135)
      at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2266)
      2008-03-17 09:45:41,410 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException: Hook previously registered
      at java.lang.ApplicationShutdownHooks.add(Unknown Source)
      at java.lang.Runtime.addShutdownHook(Unknown Source)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1269)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:191)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:96)
      at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:702)
      at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:135)
      at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2266)

      2008-03-17 09:45:41,412 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
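
      The FATAL error in the log above can be reproduced outside Hadoop, since Runtime.addShutdownHook rejects a Thread that is already registered. The sketch below first shows that behavior, then one possible guard that registers a hook at most once even when the surrounding initialization is retried. It is not the content of patch-3027.txt; the class and method names (ShutdownHookDemo, registerTwice, OnceOnlyHook, ensureRegistered) are hypothetical and used only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookDemo {

    /**
     * Registers the same Thread as a shutdown hook twice and returns the
     * message of the resulting IllegalArgumentException (null if none).
     */
    public static String registerTwice(Thread hook) {
        Runtime rt = Runtime.getRuntime();
        rt.addShutdownHook(hook);          // first registration succeeds
        try {
            rt.addShutdownHook(hook);      // same Thread again: rejected by the JVM
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        } finally {
            rt.removeShutdownHook(hook);   // clean up so the demo leaves no hook behind
        }
    }

    /** Hypothetical guard: registers its hook at most once, however often it is asked. */
    public static final class OnceOnlyHook {
        private final Thread hook;
        private final AtomicBoolean registered = new AtomicBoolean(false);

        public OnceOnlyHook(Runnable action) {
            this.hook = new Thread(action);
        }

        /** Returns true only on the call that actually registered the hook. */
        public boolean ensureRegistered() {
            if (registered.compareAndSet(false, true)) {
                Runtime.getRuntime().addShutdownHook(hook);
                return true;
            }
            return false;                  // already registered: a retry is a no-op
        }
    }

    public static void main(String[] args) {
        // Prints the same message as the FATAL log entry.
        System.out.println(registerTwice(new Thread(() -> { })));

        // Simulates two initialization attempts sharing one guard.
        OnceOnlyHook guard = new OnceOnlyHook(() -> { });
        System.out.println(guard.ensureRegistered()); // first attempt registers
        System.out.println(guard.ensureRegistered()); // retry does not throw
    }
}
```

      The guard keeps its flag set even if a later step of the same attempt fails, so a retry after a failed connection does not try to add the hook a second time.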

      Attachments

        1. patch-3027.txt
          0.7 kB
          Amareshwari Sriramadasu

          People

            Assignee: Amareshwari Sriramadasu
            Reporter: Amareshwari Sriramadasu
            Votes: 0
            Watchers: 1
