Hadoop Common / HADOOP-244

very long cleanup after a job fails

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

    Description

      Eight hours after a job failed (it had run for about 14 hours before failing), many task trackers are still throwing the exceptions below:

      060523 121732 Server handler 0 on 50040 caught: java.io.FileNotFoundException: LocalFS
      java.io.FileNotFoundException: LocalFS
      at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
      at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
      at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
      at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
      at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
      at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
      at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
      060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040
      java.net.SocketTimeoutException: timed out waiting for rpc response
      at org.apache.hadoop.ipc.Client.call(Client.java:305)
      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
      at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
      at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
      060523 121814 task_0006_r_000123_0 0.13023989% reduce > copy > task_0006_m_046105_0@node5:50040
      060523 121814 task_0006_r_000123_0 Copying task_0006_m_048815_0 output from node6
      060523 121817 SEVERE Can't open map output:/hadoop/mapred/local/task_0006_m_031921_0/part-152.out
      java.io.FileNotFoundException: LocalFS
      at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
      at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
      at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
      at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
      at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
      at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
      at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
      060523 121817 Unknown child with bad map output: task_0006_m_031921_0. Ignored.
      060523 121817 Server handler 1 on 50040 caught: java.io.FileNotFoundException: LocalFS
      java.io.FileNotFoundException: LocalFS
      at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
      at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
      at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
      at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
      at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
      at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
      at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
      060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040
      java.net.SocketTimeoutException: timed out waiting for rpc response
      at org.apache.hadoop.ipc.Client.call(Client.java:305)
      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
      at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
      at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
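
One way to triage a log like the excerpt above is to tally the failed copies per source host and list the map outputs that could no longer be opened. The following is a minimal sketch in Python, not part of Hadoop: the function name `summarize` and both regular expressions are assumptions derived only from the line shapes shown in this excerpt, so other TaskTracker log lines may not match.

```python
import re
from collections import Counter

# Hypothetical patterns, written against the line shapes in the excerpt only.
COPY_FAILED = re.compile(
    r"^\s*\d{6} \d{6} (\S+) copy failed: (\S+) from (\S+):(\d+)"
)
CANT_OPEN = re.compile(r"^\s*\d{6} \d{6} SEVERE Can't open map output:(\S+)")


def summarize(lines):
    """Tally failed copies per source host and collect missing map output paths."""
    failed_by_host = Counter()
    missing_outputs = []
    for line in lines:
        m = COPY_FAILED.match(line)
        if m:
            # group(3) is the host the reduce task tried to copy from
            failed_by_host[m.group(3)] += 1
            continue
        m = CANT_OPEN.match(line)
        if m:
            missing_outputs.append(m.group(1))
    return failed_by_host, missing_outputs


# Sample lines copied from the excerpt above.
sample = [
    "060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040",
    "java.net.SocketTimeoutException: timed out waiting for rpc response",
    "060523 121817 SEVERE Can't open map output:/hadoop/mapred/local/task_0006_m_031921_0/part-152.out",
    "060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040",
]
failed_by_host, missing_outputs = summarize(sample)
print(dict(failed_by_host))
print(missing_outputs)
```

Run against a full task tracker log, a tally like this would show whether the retries are concentrated on a few hosts or spread across the cluster.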

            People

              Assignee: Sameer Paranjpye (sameerp)
              Reporter: Yoram Arnon (yarnon)