
HDFS-350: DFSClient more robust if the namenode is busy doing GC


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem

    Description

      In the current code, if the client (writer) encounters an RPC error while fetching a new block id from the namenode, it does not retry; it throws an exception to the application. This is especially bad if the namenode is in the middle of a GC pause and does not respond in time. The client throws the exception because it does not know whether the namenode successfully allocated a block for the file.

      One possible enhancement would be to make the client retry the addBlock RPC when it fails. The client would send the block list it currently has; the namenode would match that list against its own metadata and send back a new block id (or a previously allocated block id that the client had not yet received because the earlier RPC timed out). This would make the client more robust.
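
      A minimal sketch of what the client-side retry could look like. The NamenodeRpc interface, the Block/LocatedBlock types, and the retry/backoff values are illustrative assumptions, not the real ClientProtocol API; only the retry structure with the client's block list is the point.

{code:java}
import java.io.IOException;
import java.util.List;

class BlockAllocator {
  // Hypothetical stand-in for the namenode RPC; the client passes the
  // blocks it already holds so the namenode can reconcile them against
  // its own metadata for the file.
  interface NamenodeRpc {
    LocatedBlock addBlock(String src, String clientName, List<Block> clientBlocks)
        throws IOException;
  }
  static class Block { long id; }
  static class LocatedBlock { Block block; }

  private final NamenodeRpc namenode;
  private final int maxRetries = 3;        // assumed retry policy
  private final long backoffMs = 10_000L;  // long enough to outlast a GC pause

  BlockAllocator(NamenodeRpc namenode) { this.namenode = namenode; }

  LocatedBlock allocateWithRetry(String src, String clientName,
                                 List<Block> blocksSoFar) throws IOException {
    IOException lastError = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return namenode.addBlock(src, clientName, blocksSoFar);
      } catch (IOException e) {
        // The RPC may have succeeded on the namenode even though the reply
        // was lost; retrying with the client's block list lets the namenode
        // return the already-allocated block instead of allocating another.
        lastError = e;
        sleepQuietly(backoffMs);
      }
    }
    throw lastError;
  }

  private static void sleepQuietly(long ms) {
    try { Thread.sleep(ms); } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}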

      This works even when we support appends, because the namenode will always verify that the client holds the lease for the file in question.
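
      A corresponding namenode-side sketch, again with placeholder types (FileMeta, Block) rather than FSNamesystem internals: the lease check is what keeps the retry safe once appends exist, and a block allocated on an earlier, timed-out attempt is re-sent rather than duplicated.

{code:java}
import java.io.IOException;
import java.util.List;

class AddBlockReconciler {
  static class Block { long id; Block(long id) { this.id = id; } }
  static class FileMeta {
    List<Block> blocks;   // blocks the namenode has recorded for the file
    String leaseHolder;   // current lease owner
    FileMeta(List<Block> blocks, String leaseHolder) {
      this.blocks = blocks; this.leaseHolder = leaseHolder;
    }
  }

  private long nextBlockId = 1000;  // toy id generator for the sketch

  Block addBlock(FileMeta file, String clientName, List<Block> clientBlocks)
      throws IOException {
    // Only the lease holder may add blocks, so a retry from a stale
    // writer (relevant once appends are supported) is rejected.
    if (!clientName.equals(file.leaseHolder)) {
      throw new IOException("Client " + clientName + " does not hold the lease");
    }
    if (file.blocks.size() == clientBlocks.size() + 1) {
      // A block was already allocated on an earlier attempt whose reply
      // was lost; resend it instead of allocating a duplicate.
      return file.blocks.get(file.blocks.size() - 1);
    }
    Block b = new Block(nextBlockId++);
    file.blocks.add(b);
    return b;
  }
}
{code}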


People

Assignee: Dhruba Borthakur (dhruba)
Reporter: Dhruba Borthakur (dhruba)
Votes: 0
Watchers: 4

Dates

Created:
Updated:
Resolved: