Hadoop Common / HADOOP-698

When DFS client fails to read from a datanode, the failed datanode is not excluded from target reselection


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      In DFSInputStream's read(byte buf[], int off, int len) method, a failed read calls "blockSeekTo" to reselect a datanode. However, the failed datanode is not passed back to blockSeekTo, so the selection cannot exclude it. The datanode selection algorithm works as follows:

      • If the machine that the client is running on has a local copy, return the local machine;
      • Otherwise, pick one location at random.

      Because the failed datanode is not fed back into target reselection, there are two flaws:
      1. When a client fails to read from the local copy, for example because of a checksum error, the local machine is chosen again on every retry.
      2. Random selection may still return the same failed node.
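
One possible way to address both flaws is to have the read path remember which datanodes have failed and filter them out before applying the local-first, then-random rule. The sketch below is illustrative only and is not the actual DFSClient code: the class DatanodeChooser, the method chooseDatanode, and the deadNodes parameter are hypothetical names, and datanodes are represented as plain host strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class DatanodeChooser {
    /**
     * Hypothetical helper (not part of Hadoop): picks a replica location
     * for a block while skipping every node listed in deadNodes.
     *
     * @param locations hosts that hold a replica of the block
     * @param localHost host the client is running on
     * @param deadNodes hosts that already failed during this read
     * @param rand      source of randomness for the fallback choice
     * @return the chosen host, or null if every replica has failed
     */
    static String chooseDatanode(List<String> locations, String localHost,
                                 Set<String> deadNodes, Random rand) {
        // Drop the nodes that have already failed.
        List<String> candidates = new ArrayList<>();
        for (String node : locations) {
            if (!deadNodes.contains(node)) {
                candidates.add(node);
            }
        }
        if (candidates.isEmpty()) {
            return null; // nothing left to try
        }
        // Prefer the local copy, but only if it has not failed.
        if (candidates.contains(localHost)) {
            return localHost;
        }
        // Otherwise pick one of the remaining locations at random.
        return candidates.get(rand.nextInt(candidates.size()));
    }
}
```

Under this assumption, the caller would add the current datanode to deadNodes whenever a read fails and then call chooseDatanode again, so a local copy with a checksum error is skipped on the retry and a random pick can never land on a node that has already failed.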

          People

            Assignee: Milind Barve (milindb)
            Reporter: Hairong Kuang (hairong)