[HDFS-1526] Dfs client name for a map/reduce task should have some randomness - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.23.0
Component/s: hdfs-client
Labels: None
Hadoop Flags: Incompatible change, Reviewed
Release Note:
The client name now has the format DFSClient_applicationid_randomint_threadid, where applicationid is the value of mapred.task.id if it is set and "NONMAPREDUCE" otherwise.
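As a rough illustration of the naming scheme in the release note, the sketch below assembles a name of that shape. It is not the DFSClient code from the patch: the function name `make_client_name`, the `rng` parameter, and the choice of a 32-bit signed range for the random integer are all assumptions made for this example.

```python
import random
import threading


def make_client_name(task_id=None, rng=random):
    """Build a name shaped like DFSClient_<applicationid>_<randomint>_<threadid>.

    task_id stands in for the value of mapred.task.id; when it is missing,
    the application id falls back to "NONMAPREDUCE" as the release note says.
    """
    application_id = task_id if task_id else "NONMAPREDUCE"
    # Assumption: a 32-bit signed range, chosen only for illustration.
    random_int = rng.randint(-2**31, 2**31 - 1)
    # The id of the calling thread, so two threads never share a suffix.
    thread_id = threading.get_ident()
    return f"DFSClient_{application_id}_{random_int}_{thread_id}"


if __name__ == "__main__":
    print(make_client_name("attempt_201012030713_0001_m_000000_0"))
    print(make_client_name())
```

Two threads of the same task that each build their own name this way end up with different names, which is the property the fix relies on.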

Description

Fsck shows one of the files in our dfs cluster is corrupt.

/bin/hadoop fsck aFile -files -blocks -locations
aFile: 4633 bytes, 2 block(s):
aFile: CORRUPT block blk_-4597378336099313975
OK
0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]

Status: CORRUPT

On disk, the two blocks have the same size and the same content. It turns out that the file was written by a multithreaded map task in which each thread may write to the same file. One possible interleaving of two threads, T1 and T2, that could cause this:
[T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to aFile] [T2: addBlock 1 to aFile] ...

Because T1 and T2 share the same client name, which is the map task id, the NameNode sees a single lease holder, so the sequence above completes without any lease exception and eventually leaves the file corrupt. To fix this, a map/reduce task's client name could be formed from its task id followed by a random number.
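The interleaving described above can be replayed against a toy model of lease checking. This is a minimal sketch under stated assumptions, not the NameNode's implementation: `ToyNameNode`, `LeaseMismatch`, and `replay` are hypothetical names, and the only rule modelled is that addBlock is rejected unless the caller's client name matches the current lease holder.

```python
class LeaseMismatch(Exception):
    """Hypothetical error: the caller does not hold the lease on the file."""


class ToyNameNode:
    """Toy model that tracks, per path, the lease holder and the added blocks."""

    def __init__(self):
        self.files = {}  # path -> {"holder": client name, "blocks": [writer labels]}

    def create(self, path, client):
        # Creating a file grants the lease to the creating client name.
        self.files[path] = {"holder": client, "blocks": []}

    def delete(self, path):
        self.files.pop(path, None)

    def add_block(self, path, client, writer):
        entry = self.files.get(path)
        # The lease check compares client names only, not threads.
        if entry is None or entry["holder"] != client:
            raise LeaseMismatch(f"{client} does not hold the lease on {path}")
        entry["blocks"].append(writer)


def replay(t1_name, t2_name):
    """Replay the T1/T2 interleaving and return which thread added each block."""
    nn = ToyNameNode()
    nn.create("aFile", t1_name)
    nn.delete("aFile")
    nn.create("aFile", t2_name)
    nn.add_block("aFile", t1_name, writer="T1")  # stale writer from the first create
    nn.add_block("aFile", t2_name, writer="T2")
    return nn.files["aFile"]["blocks"]


if __name__ == "__main__":
    # Shared name: both addBlock calls pass the check and the file mixes writers.
    print(replay("DFSClient_task", "DFSClient_task"))
    # Distinct names: T1's stale addBlock is rejected.
    try:
        replay("DFSClient_task_1", "DFSClient_task_2")
    except LeaseMismatch as err:
        print("rejected:", err)
```

With a shared name, the stale writer's block is accepted alongside the new writer's block. With distinct names, the stale addBlock fails at the lease check, which is the behaviour the random suffix is meant to bring about.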

Attachments

randClientId3.patch, 13/Dec/10 23:15, 1 kB, Hairong Kuang
randClientId2.patch, 11/Dec/10 08:15, 0.9 kB, Hairong Kuang
randClientId1.patch, 10/Dec/10 23:54, 13 kB, Hairong Kuang
clientName.patch, 03/Dec/10 08:20, 0.5 kB, Hairong Kuang

People

Assignee: Hairong Kuang

Reporter: Hairong Kuang

Votes: 0

Watchers: 6

Dates

Created: 03/Dec/10 07:13

Updated: 18/Dec/11 13:34

Resolved: 14/Dec/10 17:47