Hadoop HDFS: HDFS-1606

Provide a stronger data guarantee in the write pipeline

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: datanode, hdfs-client, namenode
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Added two configuration properties, dfs.client.block.write.replace-datanode-on-failure.enable and dfs.client.block.write.replace-datanode-on-failure.policy. Added a new feature to replace datanode on failure in DataTransferProtocol. Added getAdditionalDatanode(..) in ClientProtocol.
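      The two new properties are set in hdfs-site.xml. The values below are illustrative (treat the DEFAULT/ALWAYS/NEVER policy value names as an assumption based on the feature's documentation, not as text from this issue):

```xml
<!-- hdfs-site.xml: illustrative settings for the new feature -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <!-- DEFAULT/ALWAYS/NEVER are the value names as I understand them -->
  <value>DEFAULT</value>
</property>
```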

      Description

      In the current design, if there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but leave the failed datanode in the pipeline because failure detection may be inaccurate under erroneous conditions.

      We propose a new mechanism that adds new datanodes to the pipeline in order to provide a stronger data guarantee.
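The client-side recovery described above can be sketched as follows (an illustrative Python sketch, not the actual Java implementation; the function and parameter names are hypothetical):

```python
# Sketch: on a pipeline failure the client drops the suspect datanode,
# obtains a replacement from the namenode (getAdditionalDatanode(..) in
# the real ClientProtocol), transfers the partial block to it, and then
# resumes writing at the original pipeline width.

def rebuild_pipeline(pipeline, failed, live_datanodes):
    """Return a pipeline of the original width after one failure.

    pipeline       -- datanode ids currently in the write pipeline
    failed         -- datanode id reported as failed
    live_datanodes -- ids of all live datanodes (the namenode's view)
    """
    survivors = [dn for dn in pipeline if dn != failed]
    # the namenode picks a node that is not already in the pipeline
    candidates = [dn for dn in live_datanodes if dn not in pipeline]
    if not candidates:
        raise IOError("no additional datanode available")
    replacement = candidates[0]
    # a surviving datanode would transfer the partial block to the
    # replacement here before the client resumes writing
    return survivors + [replacement]
```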

      1. h1606_20110210.patch
        7 kB
        Tsz Wo Nicholas Sze
      2. h1606_20110211.patch
        20 kB
        Tsz Wo Nicholas Sze
      3. h1606_20110217.patch
        25 kB
        Tsz Wo Nicholas Sze
      4. h1606_20110228.patch
        34 kB
        Tsz Wo Nicholas Sze
      5. h1606_20110404.patch
        34 kB
        Tsz Wo Nicholas Sze
      6. h1606_20110405.patch
        35 kB
        Tsz Wo Nicholas Sze
      7. h1606_20110405b.patch
        35 kB
        Tsz Wo Nicholas Sze
      8. h1606_20110406.patch
        41 kB
        Tsz Wo Nicholas Sze
      9. h1606_20110406b.patch
        43 kB
        Tsz Wo Nicholas Sze
      10. h1606_20110407.patch
        48 kB
        Tsz Wo Nicholas Sze
      11. h1606_20110407b.patch
        48 kB
        Tsz Wo Nicholas Sze
      12. h1606_20110407c.patch
        50 kB
        Tsz Wo Nicholas Sze
      13. h1606_20110408.patch
        49 kB
        Tsz Wo Nicholas Sze
      14. h1606_20110408b.patch
        52 kB
        Tsz Wo Nicholas Sze

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Tsz Wo Nicholas Sze added a comment -

          Koji Noguchi has also provided a lot of input on this. Sorry that I failed to mention it in the acknowledgement.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #643 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/)

          Tsz Wo Nicholas Sze added a comment -

          Acknowledgement

          This work could not have been done without the help of, and many discussions with, Hadoop contributors including (in alphabetical order) Dhruba Borthakur, Hairong Kuang, Owen O'Malley, Jitendra Pandey, Sanjay Radia, Sriram Rao, Suresh Srinivas and Kan Zhang. I am simply the programmer who implemented the great ideas they provided.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #588 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/588/)
          HDFS-1606. Provide a stronger data guarantee in the write pipeline by adding a new datanode when an existing datanode failed.

          Tsz Wo Nicholas Sze added a comment -

          Thanks Jitendra for the review.

          I have committed this.

          Jitendra Nath Pandey added a comment -

          +1 for the patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475874/h1606_20110408b.patch
          against trunk revision 1090357.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/338//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/338//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/338//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          Thanks Jitendra for the review.

          h1606_20110408b.patch:

          > 1. The method findNewDatanode should return all new datanodes ...

          Since it is only an internal method, not a protocol or public API, we can easily change it later when we add the multiple-destination feature.

          > 2. The method addDatanode2ExistingPipeline can be split ...

          I only split the actual transfer out. The remaining code is only 20 lines, excluding comments.

          > 3. DataStreamer#hflush : Should we change it to setHflush(boolean val) to clarify it's just setting a flag?

          Changed.

          > 4. Does it make sense to add a unit test for default ReplaceDatanodeOnFailure policy?

          Added testDefaultPolicy().
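The condition behind the DEFAULT ReplaceDatanodeOnFailure policy can be paraphrased as below (a hedged Python sketch; the thresholds are my reading of the policy, not a copy of the Java code):

```python
def should_add_replacement(replication, n_existing, hflushed_or_appended):
    """Paraphrase of the DEFAULT replace-datanode-on-failure policy:
    add a new datanode only when replication >= 3 and either half or
    fewer of the datanodes remain in the pipeline, or the block has
    been hflushed or appended (so durability matters more)."""
    r, n = replication, n_existing
    if r < 3:
        return False
    return (r // 2 >= n) or (r > n and hflushed_or_appended)
```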

          Jitendra Nath Pandey added a comment -

          1. The method findNewDatanode should return all new datanodes in case there are more than one new datanode.

          2. The method addDatanode2ExistingPipeline can be split into following methods
          a) method to check if transfer is needed.
          b) method to get additional datanodes and determine source and destination
          c) method that does actual transfer.

          3. DataStreamer#hflush : Should we change it to setHflush(boolean val) to clarify it's just setting a flag?

          4. Does it make sense to add a unit test for default ReplaceDatanodeOnFailure policy?
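The three-way split suggested in point 2 could look roughly like this (an illustrative Python sketch; every name except addDatanode2ExistingPipeline is hypothetical):

```python
def transfer_needed(policy, replication, n_existing, hflushed):
    """(a) Decide whether a replacement datanode is required."""
    return policy(replication, n_existing, hflushed)

def pick_source_and_target(pipeline, new_nodes):
    """(b) Choose a surviving source and the new destination."""
    return pipeline[0], new_nodes[0]

def do_transfer(source, target, block):
    """(c) Perform the actual transfer of the partial block (stubbed)."""
    return (source, target, block)

def add_datanode_to_existing_pipeline(policy, pipeline, new_nodes,
                                      replication, hflushed, block):
    """Driver corresponding to addDatanode2ExistingPipeline."""
    if not transfer_needed(policy, replication, len(pipeline), hflushed):
        return pipeline
    source, target = pick_source_and_target(pipeline, new_nodes)
    do_transfer(source, target, block)
    return pipeline + [target]
```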

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475833/h1606_20110408.patch
          against trunk revision 1090357.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSShell
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/334//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/334//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/334//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          I suspect that the failures of TestFiDataTransferProtocol were due to resources not being cleaned up completely by MiniDFSCluster. Moved some of the tests around in HDFS-1817. Let's see if it works.

          h1606_20110408.patch: updated with trunk

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475764/h1606_20110407c.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.TestDFSShell
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/329//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/329//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/329//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110407c.patch: fixed some bugs in the fault injection tests.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475747/h1606_20110407b.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 25 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/327//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/327//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/327//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          I should have mentioned earlier that I cannot reproduce the failure locally, so I submitted to Jenkins multiple times.

          h1606_20110407b.patch: do not add a datanode when closing the block.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475735/h1606_20110407.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 25 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/325//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/325//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/325//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110407.patch: delayed opening an output stream.

          Tsz Wo Nicholas Sze added a comment -

          Some FI tests still fail.

          • Too many open files
            2011-04-06 22:27:23,329 WARN  datanode.DataNode (DataXceiverServer.java:run(142)) - DatanodeRegistration(127.0.0.1:41905, storageID=DS-961198735-127.0.1.1-41905-1302128843146, infoPort=35802, ipcPort=59788):DataXceiveServer: java.io.IOException: Too many open files
            	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
            	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:152)
            	at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
            	at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:135)
            	at java.lang.Thread.run(Thread.java:662)
            
          • It also seems that there are some timeouts.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475654/h1606_20110406b.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 22 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileConcurrentReader
          org.apache.hadoop.hdfs.TestLargeBlock

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/324//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/324//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/324//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110406b.patch: the fault injection tests may set lastAckedSeqno to -2, a new state which the previous patch did not handle.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475626/h1606_20110406.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 22 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/321//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/321//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/321//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110406.patch: fixed the tests

          Tsz Wo Nicholas Sze added a comment -
          • In build #318,
            java.lang.NullPointerException
            	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.access$2500(DFSOutputStream.java:283)
            	at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1470)
            	at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:110)
            	at org.apache.hadoop.hdfs.TestMultiThreadedHflush$1.run(TestMultiThreadedHflush.java:156)
            

            There are some existing synchronization problems in DFSOutputStream. It is possible to call hflush() after close() without getting any error.
            I will simply check for null in this patch and think about the synchronization problem after that.

          • For the other failed tests, there were simply not enough datanodes, so addDatanode failed.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475554/h1606_20110405b.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport
          org.apache.hadoop.hdfs.TestFileAppend2
          org.apache.hadoop.hdfs.TestFileAppend4
          org.apache.hadoop.hdfs.TestLargeBlock
          org.apache.hadoop.hdfs.TestMultiThreadedHflush
          org.apache.hadoop.hdfs.TestWriteConfigurationToDFS

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/319//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/319//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/319//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475554/h1606_20110405b.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileAppend2
          org.apache.hadoop.hdfs.TestFileConcurrentReader
          org.apache.hadoop.hdfs.TestMultiThreadedHflush

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/318//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/318//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/318//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110405b.patch: synchronized.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475550/h1606_20110405.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/317//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/317//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/317//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110405.patch: fixed bugs, warnings.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475446/h1606_20110404.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.fs.permission.TestStickyBit
          org.apache.hadoop.hdfs.TestFileAppend2
          org.apache.hadoop.hdfs.TestFileAppend3
          org.apache.hadoop.hdfs.TestFileAppend4
          org.apache.hadoop.hdfs.TestLargeBlock
          org.apache.hadoop.hdfs.TestPipelines
          org.apache.hadoop.hdfs.TestReadWhileWriting
          org.apache.hadoop.hdfs.TestWriteConfigurationToDFS

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/316//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/316//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/316//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110404.patch: updated with trunk and added a unit test.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110228.patch: Seems to have everything except new tests.

          Will separate step (*) into HDFS-1675 to ease the review.

          Tsz Wo Nicholas Sze added a comment -

          h1606_20110217.patch: changed DFSClient to get additional datanodes from the NameNode.

          Remaining items:

          • Transfer an RBW (replica being written) from one datanode to another.
          • Add new unit tests.
          Tsz Wo Nicholas Sze added a comment -

          h1606_20110211.patch: added new configuration properties. Still a long way to go.

          Tsz Wo Nicholas Sze added a comment -

          Below are the proposed new configuration properties.

          <property>
            <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
            <value>true</value>
            <description>
              If there is a datanode/network failure in the write pipeline,
              DFSClient will try to remove the failed datanode from the pipeline
              and then continue writing with the remaining datanodes. As a result,
              the number of datanodes in the pipeline is decreased.  This feature
              adds new datanodes back to the pipeline.
          
              This is a site-wide property to enable/disable the feature.
          
              See also dfs.client.block.write.replace-datanode-on-failure.policy
            </description>
          </property>
          
          <property>
            <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
            <value>DEFAULT</value>
            <description>
              This property is used only if the value of
              dfs.client.block.write.replace-datanode-on-failure.enable is true.
          
              ALWAYS: always add a new datanode when an existing datanode is removed.
              
              NEVER: never add a new datanode.
          
              DEFAULT: add a new datanode only if
                       (1) the number of datanodes in the pipeline drops from 2 to 1; or
                       (2) the block is reopened for append.
            </description>
          </property>
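
          The DEFAULT policy described above can be sketched as a small decision function. This is an illustration only: the class, method, and parameter names below are invented, not the actual DFSClient implementation.

          ```java
          /**
           * Sketch of the replace-datanode-on-failure decision (illustrative
           * names only, not the real DFSClient API).
           */
          public class ReplaceDatanodeOnFailureSketch {
              enum Policy { ALWAYS, NEVER, DEFAULT }

              /**
               * @param policy      configured policy
               * @param replication the file's replication factor
               * @param nPipeline   datanodes left in the pipeline after the failure
               * @param isAppend    true if the block was reopened for append
               * @return whether to ask the NameNode for an additional datanode
               */
              static boolean shouldAddDatanode(Policy policy, int replication,
                                               int nPipeline, boolean isAppend) {
                  switch (policy) {
                      case ALWAYS:
                          return true;
                      case NEVER:
                          return false;
                      default:
                          // DEFAULT: only when replication >= 3, and the pipeline
                          // dropped to a single datanode or the block was
                          // reopened for append.
                          return replication >= 3 && (nPipeline <= 1 || isAppend);
                  }
              }

              public static void main(String[] args) {
                  // 3-replica file whose pipeline dropped from 2 to 1: add one.
                  System.out.println(
                      shouldAddDatanode(Policy.DEFAULT, 3, 1, false)); // prints "true"
              }
          }
          ```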
          
          Tsz Wo Nicholas Sze added a comment -

          h1606_20110210.patch:

          • Adding ClientProtocol.getAdditionalDatanode(..)
          • NameNode side changes

          Remaining work:

          • implement (*)
          • change DFSClient to use the new datanode.
          • add new tests
          Tsz Wo Nicholas Sze added a comment -

          > 1. Find a datanode D by some means.

          I have checked the code. This is easier than I expected since BlockPlacementPolicy is able to find an additional datanode, given a list of already-chosen datanodes. The remaining work for this part is to add a new method to ClientProtocol so that DFSClient can use it.
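
          The idea — ask the placement policy for one more node, excluding those already in the pipeline — can be sketched as follows. This is a simplification with invented names; the real BlockPlacementPolicy also weighs rack locality, node load, and remaining space.

          ```java
          import java.util.HashSet;
          import java.util.List;
          import java.util.Set;

          /** Toy sketch: choose an additional datanode for an existing pipeline. */
          public class AdditionalDatanodeSketch {
              static String chooseAdditional(List<String> clusterNodes,
                                             List<String> alreadyChosen) {
                  Set<String> excluded = new HashSet<>(alreadyChosen);
                  for (String node : clusterNodes) {
                      if (!excluded.contains(node)) {
                          return node;
                      }
                  }
                  return null; // no candidate left; addDatanode would fail
              }

              public static void main(String[] args) {
                  List<String> cluster = List.of("dn1", "dn2", "dn3");
                  List<String> pipeline = List.of("dn1", "dn2");
                  System.out.println(chooseAdditional(cluster, pipeline)); // prints "dn3"
              }
          }
          ```

          Returning null when no candidate exists mirrors the failure mode seen in the test runs above, where small MiniDFSCluster setups did not have enough datanodes for addDatanode to succeed.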

          Tsz Wo Nicholas Sze added a comment -

          > In fact, if we can have a system-wide config ...
          Will do.

          dhruba borthakur added a comment -

          In fact, if we can have a system-wide config on whether to trigger this behaviour or not, that will be great.

          Tsz Wo Nicholas Sze added a comment -
          When to add a datanode?

          Since adding a datanode to an existing pipeline is an expensive operation (see the previous comment), it should not be performed for every pipeline failure. Suppose the replication factor of the file is greater than or equal to 3. When a pipeline fails, the operation will be invoked if

          • the number of datanodes in the pipeline drops from 2 to 1; or
          • the block is reopened for append; or
          • it is specified by the user.

          Note that when the replication factor is less than 3, the operation should not be invoked by default because performance is preferred over the stronger data guarantee.

          Tsz Wo Nicholas Sze added a comment -

          A straightforward approach is to

          (i) start (*) right after #1 and stall #2 until (*) is done.

          If we feel comfortable, we may

          (ii) start (*) right after #1 in a separate thread and start #2 concurrently. Once #3 is done, join the thread and then combine the old data with the new data before #4.

          Depending on the block size, a partial block may hold several hundred megabytes, so (*) is an expensive operation which may take a long time (on the order of seconds). (ii) has lower latency but (i) is a simpler solution. How about implementing (i) first and leaving (ii) as a future improvement?

          Tsz Wo Nicholas Sze added a comment -

          Below are some steps for adding a datanode to a pipeline:

          1. Find a datanode D by some means.
          2. Add D to the existing pipeline.
          3. Continue writing.
          4. Close the pipeline.


          Moreover, we have to:
          (*) Transfer the existing data to D.

          The question is where to put (*)?
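
          The steps above, with the transfer (*) performed before the new datanode joins the pipeline, can be modeled with a toy in-memory sketch. All names are invented for illustration; real replicas are streamed between datanodes over the network.

          ```java
          import java.util.ArrayList;
          import java.util.LinkedHashMap;
          import java.util.List;
          import java.util.Map;

          /** Toy in-memory model of adding a datanode D to a write pipeline. */
          public class PipelineRecoverySketch {
              /** datanode name -> bytes of the partial block it holds */
              final Map<String, List<Byte>> replicas = new LinkedHashMap<>();

              /** Steps #1-#2: D has been found; copy the existing data (*), then add D. */
              void addDatanode(String source, String d) {
                  replicas.put(d, new ArrayList<>(replicas.get(source)));
              }

              /** Step #3: continue writing through every datanode in the pipeline. */
              void write(byte b) {
                  for (List<Byte> replica : replicas.values()) {
                      replica.add(b);
                  }
              }

              public static void main(String[] args) {
                  PipelineRecoverySketch p = new PipelineRecoverySketch();
                  p.replicas.put("dn1", new ArrayList<>(List.of((byte) 1, (byte) 2)));
                  p.addDatanode("dn1", "dn2");
                  p.write((byte) 3);
                  System.out.println(p.replicas); // both replicas now hold [1, 2, 3]
              }
          }
          ```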

          Tsz Wo Nicholas Sze added a comment -

          Below are two important use cases:

          • Long-Lived Pipeline (e.g. HBase logging)

            When a pipeline is short-lived, the failure probability may be negligible. However, when the client writes very slowly, the failure probability becomes significant.

          • File Append

            When a new file is being written, if all the datanodes in a pipeline fail, then the data written will be lost. Although the behavior is not ideal, it is acceptable since DFSClient will fail to close the file and we allow data loss in a never-closed file. Nevertheless, when a closed file is reopened for append, the last block B of the file is reopened and a pipeline is re-created (provided that the pre-append file size is not a multiple of the block size.) B will not be selected for replication until the pipeline is finished. Then, the pre-append data stored in B may be lost if all the datanodes in the pipeline fail and the subsequent block recovery fails. Such behavior is unacceptable since the pre-append data was stored in a closed file.


            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0
              Watchers:
              16
