  1. Hadoop HDFS
  2. HDFS-11674

reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

Details

    • Reviewed

    Description

      Scenario:
      1. A 3-node cluster with "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
      A block is written with x data.
      2. One of the DataNodes, NOT the first DN, goes down.
      3. The client tries to append data to the block and fails because one DN is down.
      4. The client calls recoverLease() on the file.
      5. Recovery succeeds.

      Issue:
      1. The DNs that the client had connected to before encountering the downed mirror have their reservedSpaceForReplicas incremented, BUT never decremented.
      2. In the long run, all of a DN's space ends up counted in reservedSpaceForReplicas, resulting in OutOfSpace errors.
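
The accounting problem described above can be illustrated with a small toy model. This is only a sketch and not DataNode code: the class `ReservedSpaceModel`, its methods, and the 128 MiB reservation size are all hypothetical names and values chosen for illustration. It assumes that a reservation is recorded per block when the append pipeline is set up, and contrasts a recovery path that forgets the replica without returning its bytes (the reported behaviour) with one that releases them.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-volume reserved-space accounting. Hypothetical, not Hadoop code. */
public class ReservedSpaceModel {
    // Total bytes currently counted as reserved for in-progress replicas.
    private long reservedForReplicas = 0L;
    // Bytes still reserved for each in-progress replica, keyed by block id.
    private final Map<Long, Long> reservedByBlock = new HashMap<>();

    /** Called when a write or append pipeline is set up on this node. */
    public void reserve(long blockId, long bytes) {
        reservedForReplicas += bytes;
        reservedByBlock.merge(blockId, bytes, Long::sum);
    }

    /** Expected behaviour: return the replica's remaining reservation. */
    public void release(long blockId) {
        Long bytes = reservedByBlock.remove(blockId);
        if (bytes != null) {
            reservedForReplicas -= bytes;
        }
    }

    /** Assumed model of the bug: the replica is recovered, but the counter is never decremented. */
    public void recoverWithoutRelease(long blockId) {
        reservedByBlock.remove(blockId);
    }

    public long getReservedForReplicas() {
        return reservedForReplicas;
    }

    public static void main(String[] args) {
        long reservation = 128L * 1024 * 1024; // illustrative size only

        ReservedSpaceModel leaky = new ReservedSpaceModel();
        ReservedSpaceModel fixed = new ReservedSpaceModel();
        for (int blockId = 0; blockId < 3; blockId++) {
            // Failed append followed by successful recovery, repeated per block.
            leaky.reserve(blockId, reservation);
            leaky.recoverWithoutRelease(blockId);

            fixed.reserve(blockId, reservation);
            fixed.release(blockId);
        }
        System.out.println("leaky reserved bytes = " + leaky.getReservedForReplicas());
        System.out.println("fixed reserved bytes = " + fixed.getReservedForReplicas());
    }
}
```

In this model the leaky counter grows by one reservation for every failed append that is later recovered, which matches the long-run exhaustion described in point 2, while the releasing variant returns to zero.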

      Attachments

        1. HDFS-11674-01.patch
          6 kB
          Vinayakumar B
        2. HDFS-11674-02.patch
          5 kB
          Vinayakumar B
        3. HDFS-11674-03.patch
          6 kB
          Vinayakumar B
        4. HDFS-11674-branch-2.7-03.patch
          7 kB
          Vinayakumar B

        Activity

          People

            vinayakumarb Vinayakumar B
            vinayakumarb Vinayakumar B
            Votes: 0
            Watchers: 11

            Dates

              Created:
              Updated:
              Resolved: