Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-5606

JobTracker blocked for DFSClient: Failed recovery attempt

    Details

    • Type: Bug Bug
    • Status: Reopened
    • Priority: Critical Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.0.3
    • Fix Version/s: None
    • Component/s: jobtracker
    • Labels:
      None
    • Environment:

      centos 5.8 jdk 1.7

    • Release Note:
      Hide
      jobTracker 's logs :

      2013-11-04 12:29:24,651 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
              at org.apache.hadoop.ipc.Client.call(Client.java:1075)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
              at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
              at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
              at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
              at java.io.DataInputStream.read(DataInputStream.java:132)
              at org.apache.hadoop.security.SaslInputStream.read(SaslInputStream.java:239)
              at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
      2013-11-04 12:29:25,661 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #1 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:26,669 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #2 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:27,676 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #3 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.11
      5:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx
      .23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this req
      uest to recover it.
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery
       from primary datanode xx.xx.23.48:1004 failed 5 times. Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry...
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.
                                                                                                                                     
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery f
      rom primary datanode xx.xx.23.48:1004 failed 5 times. Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry...
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
                                                                                                                                      
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery from primary datanode xx.xx.23.48:1004 failed 5 times.
      Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry...
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004
      2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004
      org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.
      2013-11-04 12:29:24,651 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
              at org.apache.hadoop.ipc.Client.call(Client.java:1075)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
              at java.io.DataInputStream.read(DataInputStream.java:132)
              at org.apache.hadoop.security.SaslInputStream.read(SaslInputStream.java:239)
              at java.io.FilterInputStream.read(FilterInputStream.java:116)
              at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:342)
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
              at java.io.DataInputStream.readInt(DataInputStream.java:370)
              at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:804)
              at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
      2013-11-04 12:29:25,654 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:25,661 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #1 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:26,663 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:26,669 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #2 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:27,671 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:27,676 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #3 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
      2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:601)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
              at $Proxy8.recoverBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
      2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
              at org.apache.hadoop.ipc.Client.call(Client.java:1070)
              at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)
      2013-11-04 12:30:22,883 INFO org.mortbay.log: index:34
      2013-11-04 12:30:59,778 INFO org.mortbay.log: index:34
      2013-11-04 12:31:16,161 INFO org.mortbay.log: index:34
      2013-11-04 12:30:30,692 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be
      ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30348 remote=/xx.xx.23.82:50020] while trying to establish SASL connecti
      on to the server. Will retry SASL connection to server with principal hadoop/a082@DATA
      2013-11-04 12:30:33,375 INFO org.mortbay.log: index:34
      2013-11-04 12:30:33,771 INFO org.mortbay.log: index:34
      2013-11-04 12:30:59,778 INFO org.mortbay.log: index:34
      2013-11-04 12:31:09,851 INFO org.mortbay.log: index:40
      2013-11-04 12:31:15,345 INFO org.mortbay.log: index:34
      2013-11-04 12:31:16,161 INFO org.mortbay.log: index:34
      2013-11-04 12:31:16,635 INFO org.mortbay.log: index:34
      2013-11-04 12:31:26,504 INFO org.mortbay.log: index:40
      2013-11-04 12:31:26,521 INFO org.mortbay.log: index:40
      2013-11-04 12:31:34,674 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:blk_4347580689772316942_201043980 cause:java.net
      .SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23
      .162:30438 remote=/xx.xx.23.82:50020]
      2013-11-04 12:31:34,674 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be
      ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30438 remote=/xx.xx.23.82:50020] while trying to establish SASL connection to the server. Will retry SASL connection to server with principal hadoop/a082@DATA
      2013-11-04 12:32:37,666 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:blk_4347580689772316942_201043980 cause:java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30529 remote=/xx.xx.23.82:50020]
      2013-11-04 12:32:37,666 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to beready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30529 remote=/xx.xx.23.82:50020] while trying to establish SASL connection to the server. Will retry SASL connection to server with principal hadoop/082@DATA
      Show
      jobTracker 's logs : 2013-11-04 12:29:24,651 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)         at org.apache.hadoop.ipc.Client.call(Client.java:1075)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)         at java.io.DataInputStream.read(DataInputStream.java:132)         at org.apache.hadoop.security.SaslInputStream.read(SaslInputStream.java:239)         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749) 2013-11-04 12:29:25,661 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #1 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:26,669 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #2 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:27,676 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #3 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.11 5:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx .23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this req uest to recover it.         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery  from primary datanode xx.xx.23.48:1004 failed 5 times. Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry... 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.                                                                                                                                 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery f rom primary datanode xx.xx.23.48:1004 failed 5 times. Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry... 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)                                                                                                                                  2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it.         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 failed because recovery from primary datanode xx.xx.23.48:1004 failed 5 times. Pipeline was xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004. Will retry... 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 in pipeline xx.xx.24.115:1004, xx.xx.23.48:1004, xx.xx.23.82:1004: bad datanode xx.xx.24.115:1004 2013-11-04 12:29:29,689 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode xx.xx.23.48:1004 org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block blk_4347580689772316942_201043980 is already being recovered, ignoring this request to recover it. 2013-11-04 12:29:24,651 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #0 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)         at org.apache.hadoop.ipc.Client.call(Client.java:1075)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)         at java.io.DataInputStream.read(DataInputStream.java:132)         at org.apache.hadoop.security.SaslInputStream.read(SaslInputStream.java:239)         at java.io.FilterInputStream.read(FilterInputStream.java:116)         at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:342)         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)         at java.io.DataInputStream.readInt(DataInputStream.java:370)         at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:804)         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749) 2013-11-04 12:29:25,654 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:25,661 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #1 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:26,663 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:26,669 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #2 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:27,671 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:27,676 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #3 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:28,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004 2013-11-04 12:29:28,683 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #4 from primary datanode xx.xx.23.48:1004         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1884)         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2047)         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)         at java.lang.reflect.Method.invoke(Method.java:601)         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)         at $Proxy8.recoverBlock(Unknown Source)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793) 2013-11-04 12:29:29,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_4347580689772316942_201043980 bad datanode[0] xx.xx.24.115:1004         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)         at org.apache.hadoop.ipc.Client.call(Client.java:1070)         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3121) 2013-11-04 12:30:22,883 INFO org.mortbay.log: index:34 2013-11-04 12:30:59,778 INFO org.mortbay.log: index:34 2013-11-04 12:31:16,161 INFO org.mortbay.log: index:34 2013-11-04 12:30:30,692 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30348 remote=/xx.xx.23.82:50020] while trying to establish SASL connecti on to the server. Will retry SASL connection to server with principal hadoop/a082@DATA 2013-11-04 12:30:33,375 INFO org.mortbay.log: index:34 2013-11-04 12:30:33,771 INFO org.mortbay.log: index:34 2013-11-04 12:30:59,778 INFO org.mortbay.log: index:34 2013-11-04 12:31:09,851 INFO org.mortbay.log: index:40 2013-11-04 12:31:15,345 INFO org.mortbay.log: index:34 2013-11-04 12:31:16,161 INFO org.mortbay.log: index:34 2013-11-04 12:31:16,635 INFO org.mortbay.log: index:34 2013-11-04 12:31:26,504 INFO org.mortbay.log: index:40 2013-11-04 12:31:26,521 INFO org.mortbay.log: index:40 2013-11-04 12:31:34,674 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:blk_4347580689772316942_201043980 cause:java.net .SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23 .162:30438 remote=/xx.xx.23.82:50020] 2013-11-04 12:31:34,674 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30438 remote=/xx.xx.23.82:50020] while trying to establish SASL connection to the server. Will retry SASL connection to server with principal hadoop/a082@DATA 2013-11-04 12:32:37,666 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:blk_4347580689772316942_201043980 cause:java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30529 remote=/xx.xx.23.82:50020] 2013-11-04 12:32:37,666 WARN org.apache.hadoop.ipc.Client: Encountered java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to beready for read. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.23.162:30529 remote=/xx.xx.23.82:50020] while trying to establish SASL connection to the server. Will retry SASL connection to server with principal hadoop/082@DATA

      Description

      when a datanode was crash,the server can ping ok,but can not call rpc ,and also can not ssh login. and then jobTracker may be request a block on this datanode.
      it will happened ,the JobTracker can not work,the webUI is also unwork,hadoop job -list also unwork,the jobTracker logs no other info .

      and then we need to restart the datanode.
      then jobTraker can work too,but the taskTracker num come to zero,
      we need run : hadoop mradmin -refreshNodes
      then the JobTracker begin to add taskTraker ,but is very slowly.

      this problem occur 5time in 2weeks.

        Activity

        firegun created issue -
        firegun made changes -
        Field Original Value New Value
        Assignee firegun [ firegun ]
        firegun made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Chris Nauroth made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Assignee firegun [ firegun ]

          People

          • Assignee:
            Unassigned
            Reporter:
            firegun
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Due:
              Created:
              Updated:

              Development