Key: HADOOP-3136
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Arun C Murthy
Reporter: Devaraj Das
Votes: 0
Watchers: 18

Hadoop Common

Assign multiple tasks per TaskTracker heartbeat

Created: 31/Mar/08 01:06 PM   Updated: 08/Jul/09 04:52 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.20.0

Time Tracking:
Not Specified

File Attachments (text files, licensed for inclusion in ASF works):
HADOOP-3136_0_20080805.patch   2008-08-05 09:47 AM   Arun C Murthy   8 kB
HADOOP-3136_1_20080809.patch   2008-08-09 07:30 PM   Arun C Murthy   7 kB
HADOOP-3136_2_20080911.patch   2008-09-12 05:25 AM   Arun C Murthy   11 kB
HADOOP-3136_3_20081211.patch   2008-12-12 09:44 AM   Arun C Murthy   24 kB
HADOOP-3136_4_20081212.patch   2008-12-13 12:48 AM   Arun C Murthy   41 kB
HADOOP-3136_5_20081215.patch   2008-12-15 11:36 PM   Arun C Murthy   45 kB

Resolution Date: 16/Dec/08 09:56 AM


Description:
The current logic for finding a new task assigns only one task per TaskTracker heartbeat.

We could probably give the TaskTracker multiple tasks per heartbeat, up to the number of free slots it has. For maps, we could assign it data-local tasks first. If we run out of data-local tasks, we could run some logic to decide what to give it instead (e.g., tasks from overloaded racks, tasks that have the least locality, etc.).

In addition to maps, if it has free reduce slots, we could give it reduce task(s) as well. For reduces, too, we could probably run some logic to give more tasks to nodes that are closer to the nodes running the most maps (assuming the data generated is proportional to the number of maps). For example, if rack1 has 70% of the input splits, and we know that most maps are data/rack local, we would try to schedule ~70% of the reducers there.
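
To make the proposal concrete, here is a minimal Python sketch of the idea. It is not the attached patch and not Hadoop's scheduler code. The names `Tracker`, `MapTask`, `locality`, `assign_on_heartbeat`, and `reducers_per_rack` are all hypothetical, and the ranking (data local, then rack local, then off rack) is only one possible policy for the "some logic" mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Tracker:
    """Hypothetical view of a TaskTracker at heartbeat time."""
    host: str
    rack: str
    free_map_slots: int
    free_reduce_slots: int


@dataclass
class MapTask:
    """Hypothetical pending map task and where its input split lives."""
    task_id: str
    hosts: Set[str]
    racks: Set[str]


def locality(task: MapTask, tracker: Tracker) -> int:
    """Return 0 for data local, 1 for rack local, 2 for off rack."""
    if tracker.host in task.hosts:
        return 0
    if tracker.rack in task.racks:
        return 1
    return 2


def assign_on_heartbeat(
    tracker: Tracker, pending_maps: List[MapTask], pending_reduces: List[str]
) -> Tuple[List[MapTask], List[str]]:
    """Fill the tracker's free slots in one heartbeat instead of one task."""
    # sorted() is stable, so tasks with equal locality keep their queue order.
    ranked = sorted(pending_maps, key=lambda t: locality(t, tracker))
    maps = ranked[: tracker.free_map_slots]
    reduces = pending_reduces[: tracker.free_reduce_slots]
    # Remove what was handed out from the pending queues.
    for task in maps:
        pending_maps.remove(task)
    del pending_reduces[: len(reduces)]
    return maps, reduces


def reducers_per_rack(splits_per_rack: Dict[str, int], num_reducers: int) -> Dict[str, int]:
    """Split reducers across racks in proportion to their input splits."""
    total = sum(splits_per_rack.values())
    if total == 0:
        return {rack: 0 for rack in splits_per_rack}
    shares = {rack: num_reducers * n // total for rack, n in splits_per_rack.items()}
    # Give any rounding remainder to the racks holding the most splits.
    leftover = num_reducers - sum(shares.values())
    by_size = sorted(splits_per_rack, key=splits_per_rack.get, reverse=True)
    for rack in by_size[:leftover]:
        shares[rack] += 1
    return shares
```

With a tracker that has two free map slots and one free reduce slot, a single call to `assign_on_heartbeat` returns up to two maps (most local first) and one reduce. `reducers_per_rack({"rack1": 70, "rack2": 30}, 10)` places 7 of 10 reducers on rack1, matching the ~70% example.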

Thoughts?



Change History:
Devaraj Das made changes - 31/Mar/08 01:08 PM
Field Original Value New Value
Description [original text, identical to the Description above except that the rack1 example lacked the clause "and we know that most maps are data/rack local"] [current text, as shown under Description above]
Mukund Madhugiri made changes - 07/Jun/08 01:26 AM
Fix Version/s 0.18.0 [ 12312972 ]
Arun C Murthy made changes - 29/Jul/08 05:00 AM
Assignee Arun C Murthy [ acmurthy ]
Arun C Murthy made changes - 29/Jul/08 05:00 AM
Fix Version/s 0.19.0 [ 12313211 ]
Arun C Murthy made changes - 05/Aug/08 09:47 AM
Attachment HADOOP-3136_0_20080805.patch [ 12387548 ]
Arun C Murthy made changes - 09/Aug/08 07:30 PM
Attachment HADOOP-3136_1_20080809.patch [ 12387886 ]
Arun C Murthy made changes - 09/Aug/08 07:30 PM
Status Open [ 1 ] Patch Available [ 10002 ]
Arun C Murthy made changes - 25/Aug/08 05:05 PM
Status Patch Available [ 10002 ] Open [ 1 ]
Arun C Murthy made changes - 12/Sep/08 05:25 AM
Attachment HADOOP-3136_2_20080911.patch [ 12389984 ]
Arun C Murthy made changes - 12/Sep/08 05:25 AM
Status Open [ 1 ] Patch Available [ 10002 ]
Arun C Murthy made changes - 15/Sep/08 08:51 AM
Status Patch Available [ 10002 ] Open [ 1 ]
Robert Chansler made changes - 22/Sep/08 08:03 PM
Fix Version/s 0.19.0 [ 12313211 ]
Arun C Murthy made changes - 23/Sep/08 09:18 PM
Fix Version/s 0.20.0 [ 12313438 ]
Arun C Murthy made changes - 12/Dec/08 09:44 AM
Attachment HADOOP-3136_3_20081211.patch [ 12395925 ]
Arun C Murthy made changes - 13/Dec/08 12:48 AM
Attachment HADOOP-3136_4_20081212.patch [ 12395984 ]
Arun C Murthy made changes - 13/Dec/08 12:48 AM
Status Open [ 1 ] Patch Available [ 10002 ]
Arun C Murthy made changes - 15/Dec/08 10:40 AM
Status Patch Available [ 10002 ] Open [ 1 ]
Nigel Daley made changes - 15/Dec/08 07:51 PM
Fix Version/s 0.20.0 [ 12313438 ]
Arun C Murthy made changes - 15/Dec/08 11:36 PM
Attachment HADOOP-3136_5_20081215.patch [ 12396139 ]
Arun C Murthy made changes - 16/Dec/08 09:32 AM
Fix Version/s 0.20.0 [ 12313438 ]
Status Open [ 1 ] Patch Available [ 10002 ]
Arun C Murthy made changes - 16/Dec/08 09:56 AM
Status Patch Available [ 10002 ] Resolved [ 5 ]
Resolution Fixed [ 1 ]
Nigel Daley made changes - 23/Apr/09 07:17 PM
Status Resolved [ 5 ] Closed [ 6 ]
Owen O'Malley made changes - 08/Jul/09 04:52 PM
Component/s mapred [ 12310690 ]