Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- 0.2.0
- None
- None
Description
Eight hours after a job failed (it had run for about 14 hours before failing), many task trackers were still throwing the exceptions below:
060523 121732 Server handler 0 on 50040 caught: java.io.FileNotFoundException: LocalFS
java.io.FileNotFoundException: LocalFS
at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040
java.net.SocketTimeoutException: timed out waiting for rpc response
at org.apache.hadoop.ipc.Client.call(Client.java:305)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
060523 121814 task_0006_r_000123_0 0.13023989% reduce > copy > task_0006_m_046105_0@node5:50040
060523 121814 task_0006_r_000123_0 Copying task_0006_m_048815_0 output from node6
060523 121817 SEVERE Can't open map output:/hadoop/mapred/local/task_0006_m_031921_0/part-152.out
java.io.FileNotFoundException: LocalFS
at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121817 Unknown child with bad map output: task_0006_m_031921_0. Ignored.
060523 121817 Server handler 1 on 50040 caught: java.io.FileNotFoundException: LocalFS
java.io.FileNotFoundException: LocalFS
at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040
java.net.SocketTimeoutException: timed out waiting for rpc response
at org.apache.hadoop.ipc.Client.call(Client.java:305)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
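A quick way to read the timing in the log above is to compare the leading timestamps of related lines. The sketch below is only an illustration: it assumes the prefix is laid out as `yyMMdd HHmmss` (so `060523 121814` would be 2006-05-23 12:18:14), and the class and method names (`LogGap`, `stampOf`, `secondsBetween`) are hypothetical helpers, not part of Hadoop.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class LogGap {
    // Assumed layout of the 13-character prefix: two-digit year, month, day,
    // a space, then hour, minute, second.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyMMdd HHmmss");

    // Parse the timestamp at the start of a log line.
    static LocalDateTime stampOf(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 13), STAMP);
    }

    // Seconds elapsed between two log lines, based only on their prefixes.
    static long secondsBetween(String earlier, String later) {
        return Duration.between(stampOf(earlier), stampOf(later)).getSeconds();
    }

    public static void main(String[] args) {
        // Both lines are copied from the log in this report.
        String started = "060523 121814 task_0006_r_000123_0 "
                + "Copying task_0006_m_048815_0 output from node6";
        String failed = "060523 121914 task_0006_r_000123_0 "
                + "copy failed: task_0006_m_048815_0 from node6:50040";
        System.out.println(secondsBetween(started, failed) + " s");
    }
}
```

Under that assumption, the copy from node6 is reported as failed 60 seconds after it started. That would be consistent with a fixed client-side RPC timeout, although the log alone does not confirm the configured value.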
Issue Links
- is duplicated by HADOOP-225 tasks are left over when a job fails (Closed)