Description
Scenario:
1. A 3-node cluster with "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT. A block is written with x bytes of data.
2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
3. The client tries to append data to the block and fails, since one DN is down.
4. The client calls recoverLease() on the file.
5. Recovery succeeds (a reproduction sketch follows these steps).
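The scenario above can be reproduced in a MiniDFSCluster-based test. The sketch below is illustrative, not a committed test: the file path, data sizes, and the stopped DN index are arbitrary, and stopDataNode(1) is assumed to take down a non-first pipeline DN (a real test would locate the block's pipeline and pick a mirror explicitly).

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class AppendPipelineFailureRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Step 1: DEFAULT replace-datanode-on-failure policy, as in the scenario.
    conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "DEFAULT");
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
      DistributedFileSystem fs = cluster.getFileSystem();
      Path file = new Path("/testReservedSpaceLeak");

      // Step 1: write some initial data into the block.
      try (FSDataOutputStream out = fs.create(file, (short) 3)) {
        out.write(new byte[1024]);
      }

      // Step 2: stop a DataNode (assumed here to be a non-first pipeline DN).
      cluster.stopDataNode(1);

      // Step 3: the append fails once the pipeline hits the dead mirror.
      try (FSDataOutputStream out = fs.append(file)) {
        out.write(new byte[1024]);
        out.hflush();
      } catch (Exception e) {
        System.out.println("Append failed as expected: " + e);
      }

      // Steps 4-5: recover the lease; recovery succeeds, but the surviving
      // DNs never release the space they reserved for the failed pipeline.
      while (!fs.recoverLease(file)) {
        Thread.sleep(1000);
      }
    } finally {
      cluster.shutdown();
    }
  }
}
{code}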
Issue:
1. DNs that were still connected to the client when the mirror went down have reservedSpaceForReplicas incremented, BUT it is never decremented.
2. So in the long run, all of a DN's space ends up counted in reservedSpaceForReplicas, resulting in OutOfSpace errors.
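For illustration, here is a simplified model of the reservation accounting, NOT the actual FsVolumeImpl code; the class and field names are hypothetical stand-ins that mirror the DN's reserve/release pattern. When the release step on the failure path is skipped, each failed append pipeline permanently shrinks the space visible to writers:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Simplified, hypothetical model of a DN volume's replica-space reservation.
class VolumeModel {
  private final long capacity;
  private final AtomicLong reservedForReplicas = new AtomicLong();

  VolumeModel(long capacity) { this.capacity = capacity; }

  // Called when a pipeline opens a replica: reserve the remaining block space.
  void reserveSpaceForReplica(long bytes) {
    reservedForReplicas.addAndGet(bytes);
  }

  // Called when the replica is finalized or cleanly invalidated.
  void releaseReservedSpace(long bytes) {
    reservedForReplicas.addAndGet(-bytes);
  }

  // Writers see capacity minus reservations; leaked reservations shrink this.
  long getAvailable() {
    return capacity - reservedForReplicas.get();
  }
}

public class ReservedSpaceLeakDemo {
  public static void main(String[] args) {
    VolumeModel volume = new VolumeModel(10L * 1024 * 1024);
    long blockRemaining = 4L * 1024 * 1024;

    // Append pipeline opens the replica: space is reserved.
    volume.reserveSpaceForReplica(blockRemaining);

    // Pipeline fails (mirror DN down) and lease recovery finalizes the block;
    // on this DN nothing calls releaseReservedSpace(blockRemaining), so the
    // reservation leaks. Repeated failures drain the available space:
    System.out.println("available after leak: " + volume.getAvailable());
  }
}
{code}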