Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
On a large cluster, a few datanodes may be under-performing. There have been cases where the network connectivity of these bad datanodes was degraded, resulting in very long transfer times (on the order of two hours) for blocks moving to and from them.
A similar issue arises when a single disk on a datanode fails or becomes read-only: in that case the entire datanode shuts down.
HDFS should detect and handle network and disk performance degradation more gracefully. One option would be to blacklist these datanodes, de-prioritise their use, and alert the administrator; a rough sketch of such a detector follows.
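As a minimal illustration of the de-prioritisation idea (the class, thresholds, and metric below are hypothetical, not existing HDFS code), the namenode could keep a per-datanode moving average of block-transfer throughput and flag nodes that fall far below the cluster-wide mean:

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    /**
     * Hypothetical sketch, not existing HDFS code: tracks an exponentially
     * weighted moving average (EWMA) of block-transfer throughput per
     * datanode and flags nodes far below the cluster-wide mean.
     */
    public class SlowDatanodeTracker {
      private static final double ALPHA = 0.2;          // EWMA smoothing factor
      private static final double SLOW_FRACTION = 0.1;  // "slow" = below 10% of mean

      private final Map<String, Double> ewmaByNode = new ConcurrentHashMap<>();

      /** Record a completed block transfer for the given datanode. */
      public void recordTransfer(String datanodeId, long bytes, long millis) {
        double throughput = (double) bytes / Math.max(millis, 1L);
        ewmaByNode.merge(datanodeId, throughput,
            (old, sample) -> (1 - ALPHA) * old + ALPHA * sample);
      }

      /** Datanodes to de-prioritise and report to the administrator. */
      public Set<String> slowDatanodes() {
        double clusterMean = ewmaByNode.values().stream()
            .mapToDouble(Double::doubleValue).average().orElse(0.0);
        Set<String> slow = ConcurrentHashMap.newKeySet();
        ewmaByNode.forEach((node, ewma) -> {
          if (ewma < SLOW_FRACTION * clusterMean) {
            slow.add(node);
          }
        });
        return slow;
      }
    }

A moving average is used here so that a single slow transfer does not immediately blacklist a node, while sustained degradation is still caught; the flagged set could feed both replica-placement decisions and administrator alerts.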
Attachments
Issue Links
- duplicates
  - HADOOP-2830 Need to instrument Hadoop to get comprehensive network traffic metrics (Resolved)
  - HDFS-324 Generate a network infrastructure map (Resolved)
- relates to
  - HDFS-97 DFS should detect slow links(nodes) and avoid them (Resolved)