[HADOOP-5034] NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat - ASF JIRA

XML

Word

Printable

JSON

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.18.0
Fix Version/s: 0.19.1
Component/s: None
Labels:
None

Hadoop Flags:

Reviewed
Release Note:
This patch changes the DatanodeProtocoal version number from 18 to 19. The patch allows NameNode to send both block replication and deletion request to a DataNode in response to a heartbeat.

Description

Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request.

This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.

I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

blockTransferInvalidate.patch
21/Jan/09 21:12
12 kB
Hairong Kuang
blockTransferInvalidate1.patch
22/Jan/09 01:12
14 kB
Hairong Kuang
blockTransferInvalidate2.patch
26/Jan/09 20:04
9 kB
Hairong Kuang
blockTransferInvalidate3.patch
29/Jan/09 01:27
14 kB
Hairong Kuang

Activity

People

Assignee:: Hairong Kuang

Reporter:: Hairong Kuang

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 14/Jan/09 19:30

Updated:: 08/Jul/09 16:43

Resolved:: 02/Feb/09 19:16