Hadoop HDFS / HDFS-8955

Support 'hedged' write in DFSClient


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      If a write to a block is slow, start up another parallel, 'hedged' write against a different set of replicas. We need to get a different set of replicas (a new data pipeline) from the NameNode, and we then take the result of whichever write returns first (the outstanding write is cancelled). This 'hedged' write feature will help rein in the outliers: the odd write that takes a long time because it hit a bad patch on the disk, etc.

      This feature is off by default. To enable it, set <code>dfs.client.hedged.write.threadpool.size</code> to a positive number. The threadpool size is how many threads to dedicate to running these 'hedged', concurrent writes in your client.

      Then set <code>dfs.client.hedged.write.threshold.millis</code> to the number of milliseconds to wait before starting a 'hedged' write. For example, if you set this property to 10, then if a write has not returned within 10 milliseconds, we start a new write against a different set of replicas.

      This feature emits new metrics:

      + hedgedWriteOps
      + hedgedWriteOpsWin -- how many times the hedged write 'beat' the original write
      + hedgedWriteOpsInCurThread -- how many times we went to do a hedged write but had to run it in the current thread because dfs.client.hedged.write.threadpool.size was at its maximum
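      A minimal sketch of how a client could opt in if this lands, assuming the property names proposed above. The feature is still unresolved, so these keys are not recognized by current releases, and the file path is only illustrative:

        import java.io.IOException;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class HedgedWriteExample {
          public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Proposed keys from this issue; they would only take effect once hedged writes exist.
            conf.setInt("dfs.client.hedged.write.threadpool.size", 10);    // > 0 enables hedged writes
            conf.setLong("dfs.client.hedged.write.threshold.millis", 10L); // hedge after 10 ms without a response

            FileSystem fs = FileSystem.get(conf);
            // Ordinary write path; any hedging would happen inside the DFS client, not in user code.
            try (FSDataOutputStream out = fs.create(new Path("/tmp/hedged-write-demo.txt"))) {
              out.writeBytes("hedged write demo\n");
            }
          }
        }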

Description

      We already have hedged reads, which provide redundancy on read failures due to bad sectors/patches on disk. We need a similar feature for HDFS writes. This feature may come at a cost, but it is a must-have for use cases that need to guarantee write success regardless of degraded disk health. The definition of a degraded disk is highly debatable, but this is how I would define it: "a degraded disk is a disk that fails reads and writes intermittently."
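
      For reference, the existing hedged read feature (which this request mirrors) is enabled through the analogous client-side properties. A minimal sketch, with a hypothetical file path:

        import java.io.IOException;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataInputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class HedgedReadExample {
          public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Existing hedged read knobs; the write-side feature proposed here would mirror them.
            conf.setInt("dfs.client.hedged.read.threadpool.size", 10);     // > 0 enables hedged reads
            conf.setLong("dfs.client.hedged.read.threshold.millis", 500L); // hedge after 500 ms

            FileSystem fs = FileSystem.get(conf);
            byte[] buf = new byte[4096];
            try (FSDataInputStream in = fs.open(new Path("/tmp/some-existing-file"))) {
              // A read that stalls past the threshold triggers a parallel read against another replica.
              int n = in.read(buf);
              System.out.println("read " + n + " bytes");
            }
          }
        }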

Attachments

Issue Links

Activity

People

      Assignee: Unassigned
      Reporter: bijaya
      Votes: 0
      Watchers: 14

Dates

      Created:
      Updated: