[HDFS-13103] HDFS Client write acknowledgement timeout should not depend on read timeout - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.8.0, 3.0.0-alpha1
Fix Version/s: None
Component/s: datanode, hdfs-client
Labels: None
Environment: CDH5.7.0 and above + Cloudera Manager. HBase Region Server.

Description

~~HDFS-8311~~ added a timeout for client write acknowledgement for both:

- transferring blocks
- writing to a DataNode.

The timeout shares the same configuration as the client read timeout (dfs.client.socket-timeout).
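To make the coupling concrete, here is a minimal sketch in plain Java. It is a toy model and not HDFS code: the class, the method names and the `Map`-based configuration are hypothetical, and the 60-second default is an assumption used only for illustration. Only the key name dfs.client.socket-timeout comes from this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not HDFS code: one key feeds both the read and the write-ack timeout. */
final class SharedTimeoutSketch {
    // Key name taken from the issue text.
    static final String READ_TIMEOUT_KEY = "dfs.client.socket-timeout";
    // Assumed default (60 s), used here only for illustration.
    static final int ASSUMED_DEFAULT_MS = 60_000;

    /** Reads the client read timeout, falling back to the assumed default. */
    static int readTimeoutMs(Map<String, String> conf) {
        String value = conf.get(READ_TIMEOUT_KEY);
        return value == null ? ASSUMED_DEFAULT_MS : Integer.parseInt(value);
    }

    /** The coupling described above: the ack timeout is derived from the read timeout. */
    static int writeAckTimeoutMs(Map<String, String> conf) {
        return readTimeoutMs(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(READ_TIMEOUT_KEY, "10000"); // admin lowers the read timeout
        System.out.println("read=" + readTimeoutMs(conf) + " ms, writeAck="
                + writeAckTimeoutMs(conf) + " ms");
    }
}
```

In this model, lowering the read timeout silently lowers the write acknowledgement timeout as well, which is the side effect the admin did not expect.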

While I agree that having a timeout is good, it does not make sense for the write acknowledgement timeout to depend on the read timeout. We saw a case where a cluster admin wanted to reduce the HBase RegionServer read timeout in order to detect DataNode crashes quickly, but did not realize that this also affects the write acknowledgement timeout.

In the end, the effective DataNode write timeout is shorter than the effective client write acknowledgement timeout. If the last two DataNodes in the write pipeline crash, the client would think the first DataNode is faulty (the DN appears unresponsive because it is still waiting for the ack from downstream DNs) and drop it. The HBase RS would then crash because it is unable to write to any good DataNode. This scenario is possible during a rack failure.
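The misattribution can be illustrated with a toy model. This is a sketch under stated assumptions and not HDFS code: the class, enum and method names are hypothetical, the timeout values are made up, and the model assumes that whichever side's timer expires first decides what the client observes.

```java
/** Toy model, not HDFS code: which node does the client blame after downstream DNs crash? */
final class PipelineBlameSketch {
    enum Blamed { FIRST_DATANODE, DOWNSTREAM_DATANODE }

    /**
     * @param clientAckTimeoutMs how long the client waits for an ack from the first DataNode
     * @param firstDnTimeoutMs   how long the first DataNode waits for acks from downstream
     */
    static Blamed blamedAfterDownstreamCrash(long clientAckTimeoutMs, long firstDnTimeoutMs) {
        // If the client gives up first, all it observes is a silent first DataNode,
        // even though that node is healthy and merely waiting on its downstream peers.
        if (clientAckTimeoutMs <= firstDnTimeoutMs) {
            return Blamed.FIRST_DATANODE;
        }
        // Assumption: otherwise the first DataNode has time to notice the downstream
        // failure and report it, so the client can blame the right node.
        return Blamed.DOWNSTREAM_DATANODE;
    }

    public static void main(String[] args) {
        // Hypothetical values: client timeout lowered to 10 s, DataNode still waits 60 s.
        System.out.println(blamedAfterDownstreamCrash(10_000, 60_000));
        // Hypothetical values: client waits longer than the DataNode does.
        System.out.println(blamedAfterDownstreamCrash(90_000, 60_000));
    }
}
```

In the first call the healthy first DataNode is blamed and dropped from the pipeline, which matches the failure mode described above.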

This problem is even worse for Cloudera Manager-managed clusters. By default, a CM-managed HBase RegionServer sets dfs.client.block.write.replace-datanode-on-failure.enable = false. Even one unresponsive DataNode could then crash the HBase RegionServer.

I am raising this Jira to discuss two possible solutions:

- add a new config for the write acknowledgement timeout, so that it does not depend on the read timeout
- or, update the description of dfs.client.socket-timeout in hdfs-default.xml so that admins are aware that the write acknowledgement timeout depends on this configuration.
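The first option could look roughly like the sketch below. It is not taken from the attached patch: the property name `dfs.client.write-ack.timeout.hypothetical` is invented for this example and is not an existing HDFS key, the `Map`-based configuration stands in for the real configuration class, and the 60-second defaults are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not HDFS code: the write-ack timeout gets its own key. */
final class SeparateAckTimeoutSketch {
    // Key name taken from the issue text.
    static final String READ_TIMEOUT_KEY = "dfs.client.socket-timeout";
    // HYPOTHETICAL key name, invented for this sketch; not an existing HDFS property.
    static final String WRITE_ACK_TIMEOUT_KEY = "dfs.client.write-ack.timeout.hypothetical";
    // Assumed default (60 s) for both timeouts, for illustration only.
    static final int ASSUMED_DEFAULT_MS = 60_000;

    static int getInt(Map<String, String> conf, String key, int fallback) {
        String value = conf.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    static int readTimeoutMs(Map<String, String> conf) {
        return getInt(conf, READ_TIMEOUT_KEY, ASSUMED_DEFAULT_MS);
    }

    /** Falls back to its own default, never to the read timeout. */
    static int writeAckTimeoutMs(Map<String, String> conf) {
        return getInt(conf, WRITE_ACK_TIMEOUT_KEY, ASSUMED_DEFAULT_MS);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(READ_TIMEOUT_KEY, "10000"); // admin lowers only the read timeout
        System.out.println("read=" + readTimeoutMs(conf) + " ms, writeAck="
                + writeAckTimeoutMs(conf) + " ms");
    }
}
```

The design choice here is that an unset write acknowledgement timeout uses its own default instead of inheriting the read timeout, so lowering the read timeout can no longer shorten it by accident.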

Attachments

- HDFS-13103.001.patch, 03/Feb/18 04:13, 7 kB, Wei-Chiu Chuang

Issue Links

- causes: HDFS-13524 Occasional "All datanodes are bad" error in TestLargeBlock#testLargeBlockSize (Resolved)
- relates to: HDFS-8311 DataStreamer.transfer() should timeout the socket InputStream. (Resolved)

People

Assignee: Wei-Chiu Chuang
Reporter: Wei-Chiu Chuang
Votes: 0
Watchers: 14

Dates

Created: 02/Feb/18 21:13
Updated: 08/Mar/21 09:23