Hadoop Common · HADOOP-1324

FSError encountered by one running task should not be fatal to other tasks on that node


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.3
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      Currently, if one task encounters an FSError, it reports the error to the TaskTracker, and the TaskTracker reinitializes itself, effectively losing the state of all the other running tasks as well. This can probably be improved, especially after the fix for HADOOP-1252. Rather than reinitializing itself, the TaskTracker should probably just get blacklisted for that job. Other tasks should be allowed to continue as long as they can (completing successfully, or failing, whether due to disk problems or otherwise).
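      The proposed behavior can be sketched as follows. This is a minimal illustrative model, not Hadoop's actual TaskTracker code: all class, method, and field names here (TaskTrackerSketch, onFsError, blacklistedJobs, etc.) are hypothetical, and the real implementation would involve the JobTracker protocol. The point it demonstrates is that an FSError fails only the offending task and blacklists the node for that job, while the other tasks keep running.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-task FSError handling (names are illustrative,
// not Hadoop's real API). Old behavior: an FSError from any task caused the
// whole TaskTracker to reinitialize, losing every running task's state.
// Sketched behavior: fail only the reporting task and blacklist this
// tracker for that task's job; leave other tasks untouched.
public class TaskTrackerSketch {
    enum State { RUNNING, FAILED, SUCCEEDED }

    static class Task {
        final String id;
        State state = State.RUNNING;
        Task(String id) { this.id = id; }
    }

    final Map<String, Task> runningTasks = new LinkedHashMap<>();
    final Set<String> blacklistedJobs = new HashSet<>();

    void launch(String taskId) {
        runningTasks.put(taskId, new Task(taskId));
    }

    // Called when a task reports an FSError for its job.
    void onFsError(String taskId, String jobId) {
        Task t = runningTasks.get(taskId);
        if (t != null) {
            t.state = State.FAILED;       // fail only this task
        }
        blacklistedJobs.add(jobId);       // node skipped for this job from now on
        // Crucially: no reinitialization here, so other tasks keep their state.
    }

    boolean isRunning(String taskId) {
        Task t = runningTasks.get(taskId);
        return t != null && t.state == State.RUNNING;
    }
}
```

      In this sketch, a disk failure surfacing in one task leaves every other task in RUNNING state, which is the improvement the issue asks for.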

      Attachments

        Activity


          People

            Assignee: acmurthy Arun Murthy
            Reporter: ddas Devaraj Das
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
