Details
Type: Bug
Status: Patch Available
Priority: Major
Resolution: Unresolved
Description
In one of my tests, setting mapreduce.task.io.sort.factor to 1 caused all the map tasks to hang and never finish. CPU usage on every node stayed very high until the app master killed the tasks on timeout, and the job failed.
I traced the problem and found that all the maps hang in the final merge phase.
The while loop in computeBytesInMerges never terminates when the factor is 1:
int f = 1;  // in my case
int n = 16; // in my case
while (n > f || considerFinalMerge) {
    ...
    n -= (f-1);
    f = factor;
}
Since f-1 equals 0, n is never reduced and stays at 16, so the condition n > f always holds and the while loop runs forever.
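The behaviour can be reproduced outside Hadoop with a minimal sketch. It is a simplified model, not the actual computeBytesInMerges code: the elided loop body and the considerFinalMerge flag are left out, and the class name MergeLoopSketch, the method countPasses, and the maxIterations cap are hypothetical names added only so the demonstration terminates.

```java
class MergeLoopSketch {
    /**
     * Counts iterations of a simplified version of the merge-pass loop.
     * Assumption: only the "n > f" part of the loop condition is modelled.
     * Returns -1 if maxIterations is reached without the loop ending,
     * which stands in for "runs forever" in the real code.
     */
    static int countPasses(int n, int factor, int maxIterations) {
        int f = factor;
        int passes = 0;
        while (n > f) {
            if (passes >= maxIterations) {
                return -1; // n is not shrinking; give up instead of spinning
            }
            n -= (f - 1); // with f == 1 this subtracts 0
            f = factor;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(countPasses(16, 1, 1000));  // prints -1
        System.out.println(countPasses(16, 2, 1000));  // prints 14
        System.out.println(countPasses(16, 10, 1000)); // prints 1
    }
}
```

With a factor of 2 or more, each pass reduces n by at least 1, so the loop ends. With a factor of 1 it never makes progress. One possible guard, which is not necessarily what the attached patch does, would be to reject or clamp a factor below 2 before entering the loop.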