Hadoop Common / HADOOP-1152

Reduce task hangs, failing in MapOutputCopier.copyOutput


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      We had a couple of reduce tasks hang, repeating the output below.

      2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: Path /hadoop/mapred/local/task_0026_r_000307_0/map_7854.out already exists
      at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.rename(InMemoryFileSystem.java:246)
      at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:471)
      at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:336)
      at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)

      2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: task_0026_r_000307_0 adding host ______ to penalty box, next contact in 192 seconds

      ===============================
      Before the above output, the log showed this error:

      2007-03-22 18:15:24,274 ERROR org.apache.hadoop.mapred.TaskRunner: Map output copy failure: java.lang.NullPointerException
      at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:416)
      at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getLength(InMemoryFileSystem.java:286)
      at org.apache.hadoop.fs.FilterFileSystem.getLength(FilterFileSystem.java:178)
      at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:340)
      at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)
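
One possible reading of the two traces is that the first attempt got past rename (ReduceTaskRunner.java:336) and then failed in getLength (line 340), leaving the renamed file in place, so every later retry hit "already exists". The sketch below only illustrates that failure pattern under those assumptions. It is not the attached patch and does not use Hadoop APIs: TinyInMemoryFs, copyOnce, attemptsUntilSuccess and the path are hypothetical names invented for the example, and the post-rename failure is simulated with a plain runtime exception.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class RenameRetrySketch {
    /** Hypothetical stand-in for an in-memory file system: path -> contents. */
    static class TinyInMemoryFs {
        private final Map<String, byte[]> files = new HashMap<>();

        void create(String path, byte[] data) { files.put(path, data); }

        void delete(String path) { files.remove(path); }

        /** Refuses to overwrite an existing destination, like the rename in the log. */
        void rename(String src, String dst) throws IOException {
            if (files.containsKey(dst)) {
                throw new IOException("Path " + dst + " already exists");
            }
            byte[] data = files.remove(src);
            if (data == null) {
                throw new IOException("Path " + src + " does not exist");
            }
            files.put(dst, data);
        }
    }

    /** One copy attempt: write a temp file, rename it, then run a post-rename step. */
    static void copyOnce(TinyInMemoryFs fs, String dst, boolean failAfterRename,
                         boolean cleanUp) throws IOException {
        String tmp = dst + "-tmp";
        fs.create(tmp, new byte[] {1, 2, 3});
        boolean renamed = false;
        try {
            fs.rename(tmp, dst);
            renamed = true;
            if (failAfterRename) {
                // Assumption: stands in for a failure after rename, such as the NPE.
                throw new IllegalStateException("simulated failure after rename");
            }
        } catch (IOException | RuntimeException e) {
            if (cleanUp) {
                fs.delete(tmp);
                if (renamed) {
                    fs.delete(dst); // remove the half-finished output so a retry can succeed
                }
            }
            throw e;
        }
    }

    /** Returns the attempt number that succeeded, or -1 if every attempt failed. */
    static int attemptsUntilSuccess(boolean cleanUp, int maxAttempts) {
        TinyInMemoryFs fs = new TinyInMemoryFs();
        String dst = "/local/map_0.out"; // hypothetical path
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                copyOnce(fs, dst, attempt == 1, cleanUp); // only the first attempt fails
                return attempt;
            } catch (IOException | RuntimeException e) {
                // swallow and retry, as a copier loop would
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + attemptsUntilSuccess(false, 5)); // -1
        System.out.println("with cleanup: " + attemptsUntilSuccess(true, 5));     // 2
    }
}
```

Without cleanup, the stale destination makes every retry fail in rename, which matches the repeating warning. With cleanup, the second attempt succeeds. Whether the real fix took this route is not stated in the report.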

      Attachments

        1. 1152.patch
          1 kB
          Tahir Hashmi


          People

            Assignee: Tahir Hashmi (tahir)
            Reporter: Koji Noguchi (knoguchi)
            Votes: 0
            Watchers: 2
