Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-11164

Mover should avoid unnecessary retries if the block is pinned

VotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0.0-alpha2
    • balancer & mover
    • None

    Description

      When mover is trying to move a pinned block to another datanode, it will internally hits the following IOException and mark the block movement as failure. Since the Mover has dfs.mover.retry.max.attempts configs, it will continue moving this block until it reaches retryMaxAttempts. If the block movement failure(s) are only due to block pinning, then retry is unnecessary. The idea of this jira is to avoid retry attempts of pinned blocks as they won't be able to move to a different node.

      2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 127.0.0.1:19758:ARCHIVE through 127.0.0.1:19501
      java.io.IOException: Got error, status=ERROR, status message opReplaceBlock BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 received exception java.io.IOException: Got error, status=ERROR, status message Not able to copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 from /127.0.0.1:19501, reportedBlock move is failed
      	at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
      	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
      	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
      	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
      	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      

      Attachments

        1. HDFS-11164-00.patch
          23 kB
          Rakesh Radhakrishnan
        2. HDFS-11164-01.patch
          23 kB
          Rakesh Radhakrishnan
        3. HDFS-11164-02.patch
          32 kB
          Rakesh Radhakrishnan
        4. HDFS-11164-03.patch
          32 kB
          Rakesh Radhakrishnan

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            rakeshr Rakesh Radhakrishnan
            rakeshr Rakesh Radhakrishnan
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment