Issue Details

Key: HADOOP-3297
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Devaraj Das
Reporter: Devaraj Das
Votes: 0
Watchers: 4

Hadoop Common

The way in which ReduceTask/TaskTracker gets completion events during shuffle can be improved

Created: 22/Apr/08 10:24 AM   Updated: 08/Jul/09 04:52 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.18.0

Time Tracking:
Not Specified

File Attachments:
3297.patch (10 kB), 2008-05-06 11:03 AM, Devaraj Das. Licensed for inclusion in ASF works.
3297.patch (10 kB), 2008-05-02 01:13 PM, Devaraj Das. Licensed for inclusion in ASF works.

Hadoop Flags: Reviewed
Resolution Date: 07/May/08 11:02 PM


Description
Parameters such as the poll frequency and the number of events fetched per call can probably be tuned to improve shuffle performance. This would affect the shuffle-related task->tasktracker and tasktracker->jobtracker RPCs.

Comments
Devaraj Das added a comment - 28/Apr/08 01:32 PM
Here is a proposal after a discussion with Sameer:
1) The TaskTracker polls the JobTracker for 500 task completion events. If it gets the full payload, it immediately asks for another batch of 500, and so on. When it gets fewer than 500, it switches to the current behavior: sleep for a fixed amount of time (the heartbeat interval). A small number of events per RPC ensures that each RPC takes less time, although the number of RPCs will be higher.
2) The Task asks the TaskTracker for 10000 events at a time, once every second.
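
The TaskTracker side of the proposal can be sketched roughly as below. This is only an illustration of the "keep fetching while batches come back full" idea, not the code in the attached patch; `EventSource`, `EventPoller`, and `drain` are hypothetical names, and events are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the JobTracker side of the completion-event RPC. */
interface EventSource {
    /** Returns at most maxEvents events, starting at index fromIndex. */
    List<String> getEvents(int fromIndex, int maxEvents);
}

/** Hypothetical poller illustrating point 1) of the proposal; not Hadoop code. */
final class EventPoller {
    static final int BATCH_SIZE = 500; // events requested per RPC, as proposed

    private final EventSource source;
    private final List<String> received = new ArrayList<>();

    EventPoller(EventSource source) {
        this.source = source;
    }

    /**
     * Issues RPCs back to back while each batch comes back full.
     * Returns the number of RPCs made; the caller is expected to sleep
     * for the heartbeat interval before calling drain() again.
     */
    int drain() {
        int rpcs = 0;
        while (true) {
            List<String> batch = source.getEvents(received.size(), BATCH_SIZE);
            rpcs++;
            received.addAll(batch);
            if (batch.size() < BATCH_SIZE) {
                return rpcs; // short batch: nothing more to fetch right now
            }
        }
    }

    int receivedCount() {
        return received.size();
    }

    public static void main(String[] args) {
        // Fake JobTracker holding 1200 events (an arbitrary number for the demo).
        List<String> all = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            all.add("event-" + i);
        }
        EventPoller poller = new EventPoller(
            (from, max) -> all.subList(from, Math.min(all.size(), from + max)));
        int rpcs = poller.drain();
        System.out.println(rpcs + " RPCs fetched " + poller.receivedCount() + " events");
    }
}
```

Note that when the backlog is an exact multiple of the batch size, this loop pays for one extra RPC that returns an empty batch before it falls back to sleeping.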

Devaraj Das added a comment - 28/Apr/08 02:21 PM
An interesting observation regarding the ramfs. I guess I should raise a separate JIRA, but let me put it here anyway:
I had a job (loadgen from hadoop-test) consisting of 2500 maps and 1 reducer. The ramfs size was 300MB and io.sort.factor was 100. The cluster had 20 nodes. Each map generated 5 MB of data. The job took 45 minutes to complete (with the above changes). About 2000 files missed the ramfs and ended up on disk.
I ran the same job (with exactly the same config) with the reducer throttled: if a ramfs merge is in progress, the reducer waits for it to complete before fetching anything new. This basically results in all files ending up in the ramfs. The job ran in 30 minutes.

So, although I didn't notice any significant performance gain for this job from the shuffle protocol changes proposed in my last comment, in general it looks like the following holds: for a given job, a faster shuffle creates more files on disk, and depending on the number and size of the job's map outputs, this might adversely affect the final merge and thereby the overall runtime of the job.

I will see if the above behavior can be modelled.
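
The throttling used in the second run above (do not fetch anything new while a ramfs merge is in progress) could look roughly like the following. This is a minimal sketch of the idea using a plain Java monitor, not the code that was actually run; `MergeGate` and its methods are hypothetical names.

```java
/**
 * Hypothetical gate, not Hadoop code: fetcher threads call awaitNoMerge()
 * before each fetch, and the merger brackets its work with
 * beginMerge() / endMerge().
 */
final class MergeGate {
    private boolean mergeInProgress = false;

    synchronized void beginMerge() {
        mergeInProgress = true;
    }

    synchronized void endMerge() {
        mergeInProgress = false;
        notifyAll(); // wake every fetcher parked on the gate
    }

    synchronized void awaitNoMerge() throws InterruptedException {
        while (mergeInProgress) { // loop guards against spurious wakeups
            wait();
        }
    }

    synchronized boolean isMergeInProgress() {
        return mergeInProgress;
    }

    public static void main(String[] args) throws InterruptedException {
        MergeGate gate = new MergeGate();
        gate.beginMerge();
        Thread fetcher = new Thread(() -> {
            try {
                gate.awaitNoMerge(); // blocks until the merge has ended
                System.out.println("merge finished, fetching next map output");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();
        gate.endMerge();
        fetcher.join();
    }
}
```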


Runping Qi added a comment - 28/Apr/08 04:37 PM

Under what conditions do fetched map outputs end up on disk directly?
If a segment is very large, it makes sense to write it out to disk directly.
If it is one of the last few, it makes sense too. Otherwise, a fetched segment
should go into the in-memory file system. If the in-memory file system is full, the fetcher should wait.

This is related to HADOOP-2095. They should be considered together.
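
The placement rules suggested in this comment can be written down as a small decision function. This is only a sketch of the suggestion, not existing Hadoop behavior; `SegmentTarget`, `SegmentPlacement`, and every threshold parameter are hypothetical, and "one of the last few" is modeled as a simple count of segments still to be fetched.

```java
/** Hypothetical outcome of the placement decision; not a Hadoop type. */
enum SegmentTarget { DISK, MEMORY, WAIT }

/** Hypothetical helper that encodes the rules from the comment above. */
final class SegmentPlacement {
    private SegmentPlacement() {}

    /**
     * @param segmentBytes      size of the fetched map output segment
     * @param maxInMemSegment   assumed cutoff above which a segment is "very large"
     * @param freeInMemBytes    space currently free in the in-memory file system
     * @param segmentsRemaining segments still to fetch, including this one
     * @param lastFew           assumed cutoff for "one of the last few"
     */
    static SegmentTarget place(long segmentBytes, long maxInMemSegment,
                               long freeInMemBytes, int segmentsRemaining,
                               int lastFew) {
        if (segmentBytes > maxInMemSegment) {
            return SegmentTarget.DISK;   // very large: write straight to disk
        }
        if (segmentsRemaining <= lastFew) {
            return SegmentTarget.DISK;   // one of the last few segments
        }
        if (segmentBytes > freeInMemBytes) {
            return SegmentTarget.WAIT;   // in-memory fs is full: fetcher waits
        }
        return SegmentTarget.MEMORY;     // normal case: keep it in memory
    }

    public static void main(String[] args) {
        // A 5 MB segment, 300 MB in-memory fs with 2 MB free, many segments left.
        System.out.println(place(5L << 20, 75L << 20, 2L << 20, 1000, 3));
    }
}
```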


Devaraj Das added a comment - 02/May/08 01:13 PM
I ran a benchmark (loadgen) with the attached patch. Here are the details:
1) Num maps - 10000
2) Size of each map output - 1KB
3) Size of cluster - 80 nodes
4) Num reducers - 1

With the patch, the run took ~7 minutes. On trunk, the same job took ~11 minutes.


Runping Qi added a comment - 02/May/08 01:23 PM
How long did the map phase take?

Devaraj Das added a comment - 02/May/08 01:32 PM
The map phase took roughly 3 minutes.

Mahadev konar added a comment - 06/May/08 05:11 AM
This patch does not make the reducer wait on fetches when the memory fs is full, or does it?

Devaraj Das added a comment - 06/May/08 05:13 AM
No, it doesn't. That should be outside the scope of this one.

Mahadev konar added a comment - 06/May/08 05:18 AM
Agreed.

Hadoop QA added a comment - 06/May/08 07:50 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381308/3297.patch
against trunk revision 653638.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2406/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2406/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2406/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2406/console

This message is automatically generated.


Devaraj Das added a comment - 06/May/08 11:03 AM
Fixed findbugs warnings. The test failure is not related to the patch.

Devaraj Das added a comment - 06/May/08 11:03 AM
Retrying hudson

Hadoop QA added a comment - 06/May/08 05:06 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381496/3297.patch
against trunk revision 653749.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2412/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2412/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2412/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2412/console

This message is automatically generated.


Mahadev konar added a comment - 07/May/08 07:25 AM
The patch looks good. My only concern is that we should check that it does not degrade performance or lead to more problems at the jobtracker or elsewhere.
We could try running sort with this patch, or some map reduce job with a huge number of mappers, say 100,000, and 500 reducers or so.

This is just to verify that the task tracker being aggressive in fetching the map outputs does not degrade performance. The maps are short lived, so it's not hard to imagine a situation where all the reduces start bombarding the jobtracker at the same time with requests for more maps. We should check that the jobtracker can handle the load and that performance does not degrade in such a situation.


Devaraj Das added a comment - 07/May/08 12:06 PM
Yes, I ran large jobs with 250000 maps and 400 reducers on 250 nodes and saw no performance issues.

Mahadev konar added a comment - 07/May/08 04:53 PM
That's great.

+1 for commit.


Owen O'Malley added a comment - 07/May/08 11:02 PM
I just committed this. Thanks, Devaraj!

Hudson added a comment - 08/May/08 12:23 PM

Runping Qi added a comment - 10/May/08 06:04 AM

This patch should be in release 17.
Without it, the shuffling phase will be painfully slow.


Runping Qi added a comment - 10/May/08 06:08 AM - edited
This JIRA was created because of HADOOP-3327.