Hadoop HDFS / HDFS-11674

reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered


Details

    • Reviewed

    Description

      Scenario:
      1. A 3-node cluster has "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
      A block is written with x amount of data.
      2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
      3. The client tries to append data to the block and fails because that DN is down.
      4. The client calls recoverLease() on the file.
      5. Recovery completes successfully.

      Issue:
      1. DNs that the client had already connected to before hitting the downed mirror have their reservedSpaceForReplicas incremented, BUT never decremented.
      2. In the long run, all of a DN's space ends up counted in reservedSpaceForReplicas, resulting in OutOfSpace errors.
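
      The accounting problem described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: ToyVolume, openForAppend, recoverAndFinalize and the byte figures are hypothetical names and numbers invented for illustration. They are not the actual DataNode classes and not the change made in the attached patches. The model only shows how a reservation taken at append time grows without bound when the recovery path does not release it.

```java
/** Toy model of per-volume reserved-space accounting; NOT the real HDFS classes. */
public class ReservedSpaceLeakDemo {

    /** Hypothetical stand-in for a DataNode volume. */
    static class ToyVolume {
        private final long capacity;
        private long reservedForReplicas; // bytes held for replicas being written

        ToyVolume(long capacity) {
            this.capacity = capacity;
        }

        long available() {
            return capacity - reservedForReplicas;
        }

        long reservedForReplicas() {
            return reservedForReplicas;
        }

        /** Opening a replica for append reserves space up front (assumed behaviour). */
        void openForAppend(long bytesToReserve) {
            if (bytesToReserve > available()) {
                throw new IllegalStateException("Out of space");
            }
            reservedForReplicas += bytesToReserve;
        }

        /**
         * The append fails because a downstream mirror is down, and the replica
         * is later recovered. The flag models whether recovery gives the
         * reservation back (true) or forgets it (false, the reported leak).
         */
        void recoverAndFinalize(long bytesReserved, boolean releaseReserved) {
            if (releaseReserved) {
                reservedForReplicas -= bytesReserved;
            }
        }
    }

    /** Runs several failed-append-then-recover cycles and reports what stays reserved. */
    static long reservedAfterFailedAppends(int attempts, boolean releaseOnRecovery) {
        ToyVolume volume = new ToyVolume(1000);
        for (int i = 0; i < attempts; i++) {
            volume.openForAppend(100);
            volume.recoverAndFinalize(100, releaseOnRecovery);
        }
        return volume.reservedForReplicas();
    }

    public static void main(String[] args) {
        System.out.println(reservedAfterFailedAppends(5, false)); // prints 500
        System.out.println(reservedAfterFailedAppends(5, true));  // prints 0
    }
}
```

      In this model, an eleventh failed append without release throws "Out of space" even though no data was written, which mirrors the long-run symptom in point 2.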

      Attachments

        1. HDFS-11674-01.patch
          6 kB
          Vinayakumar B
        2. HDFS-11674-02.patch
          5 kB
          Vinayakumar B
        3. HDFS-11674-03.patch
          6 kB
          Vinayakumar B
        4. HDFS-11674-branch-2.7-03.patch
          7 kB
          Vinayakumar B


          People

            Vinayakumar B (vinayakumarb)
            Votes: 0
            Watchers: 11
