Hadoop Map/Reduce / MAPREDUCE-6002

MR task should not report errors to the AM while the process is shutting down

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.5.0
    • Component/s: task
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      With MAPREDUCE-5900, a preempted MR task should no longer be treated as failed.
      However, an MR task can still fail and report the error to the AM after preemption takes effect but before the AM has received the completed container from the RM. In that case the task attempt is marked as failed instead of preempted.

      For example, FileSystem registers a shutdown hook that closes all FileSystem instances. If a FileSystem is in use at that moment (for instance, while the task is reading split details from HDFS), the MR task fails and reports the fatal error to the MR AM. The following exception is raised:

      2014-07-22 01:46:19,613 FATAL [IPC Server handler 10 on 56903] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1405985051088_0018_m_000025_0 - exited : java.io.IOException: Filesystem closed
      	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
      	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:776)
      	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
      	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:645)
      	at java.io.DataInputStream.readByte(DataInputStream.java:265)
      	at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
      	at org.apache.hadoop.io.WritableUtils.readVIntInRange(WritableUtils.java:348)
      	at org.apache.hadoop.io.Text.readString(Text.java:464)
      	at org.apache.hadoop.io.Text.readString(Text.java:457)
      	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:357)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
      	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
      	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
      

      We should prevent this. Other exceptions can also occur while the process is shutting down, and none of them should be reported to the AM.
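
The idea above can be sketched in plain Java as a guard around the error-reporting call. This is only an illustration under stated assumptions, not the contents of the attached patch: `ShutdownAwareReporter`, `ErrorSink`, and `markShuttingDown` are hypothetical names, and `ErrorSink` merely stands in for the task-to-AM channel.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: drop error reports once JVM shutdown has begun. */
final class ShutdownAwareReporter {

    /** Stand-in for the task-to-AM reporting channel (not a Hadoop interface). */
    interface ErrorSink {
        void fatalError(String taskId, String message);
    }

    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final ErrorSink sink;

    ShutdownAwareReporter(ErrorSink sink) {
        this.sink = sink;
    }

    /** Registers a JVM shutdown hook that flips the flag when shutdown starts. */
    void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::markShuttingDown));
    }

    /** Marks the process as shutting down; exposed so the guard can be exercised directly. */
    void markShuttingDown() {
        shuttingDown.set(true);
    }

    /**
     * Forwards the error to the sink unless shutdown is in progress.
     * Returns true if the error was reported, false if it was suppressed.
     */
    boolean report(String taskId, Throwable error) {
        if (shuttingDown.get()) {
            // Errors such as "Filesystem closed" are expected here; do not report them.
            return false;
        }
        sink.fatalError(taskId, String.valueOf(error));
        return true;
    }
}
```

Plain JVM shutdown hooks run in an unspecified order, so a flag set from an ordinary hook may flip after another hook has already closed the FileSystem. Hadoop's `ShutdownHookManager` exposes `isShutdownInProgress()`, which could be checked in place of the hand-rolled flag.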

      Attachments

        1. MR-6002.patch
          7 kB
          Wangda Tan

          People

            Assignee: Wangda Tan (leftnoteasy)
            Reporter: Wangda Tan (leftnoteasy)
            Votes: 0
            Watchers: 5
