  1. Hadoop Map/Reduce
  2. MAPREDUCE-5019

Fair scheduler should allow preemption of reducers only



    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Not A Problem
    • 2.0.2-alpha
    • None
    • mrv1, scheduler
    • CDH4.1.2


      The fair scheduler is very good.
      But consider a big MR job running many mappers and reducers (10M + 10R),
      followed by a small MR job in the same pool (1M + 1R),
      on a cluster with slots for 10 mappers and 10 reducers:

      • The big job takes all the map slots.
      • The small job waits for a map slot.
      • The first map task of the big job finishes.
      • The small job takes the map slot it needs.
      • Meanwhile, the reducers of the big job take all the reducer slots to copy and sort.
      • The small job finishes its map, then has to wait for all the big job's maps to finish and for one of its reducers to finish before it can get a reducer slot.
      • All the reducers stall after sorting, waiting for the mappers to finish one by one...
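
To make the delay concrete, here is a rough sketch of the timeline above. It is only a toy model: the function name, the durations, and the assumption that a preempted reducer slot is available immediately are all hypothetical, not Hadoop behaviour.

```python
def small_job_reduce_start(big_map_end_times, small_map_duration,
                           big_reduce_tail, reducer_preemption):
    """Return the time at which the small job's single reducer can start.

    big_map_end_times: finish time of each map task of the big job (assumed).
    small_map_duration: run time of the small job's single map task (assumed).
    big_reduce_tail: time a big-job reducer still needs after the last big
        map has finished (assumed).
    reducer_preemption: True if the small job may kill one big-job reducer.
    """
    # The small job's map starts when the first big-job map frees a slot.
    small_map_done = min(big_map_end_times) + small_map_duration
    if reducer_preemption:
        # Proposed behaviour: take a reducer slot as soon as the map is done.
        return small_map_done
    # Current behaviour: every reducer slot is held by the big job until
    # all of its maps are done and one of its reducers completes.
    first_big_reducer_done = max(big_map_end_times) + big_reduce_tail
    return max(small_map_done, first_big_reducer_done)


# Hypothetical numbers (minutes): big maps finish at 10, 20, ..., 100.
big_maps = [10 * i for i in range(1, 11)]
without = small_job_reduce_start(big_maps, 5, 15, reducer_preemption=False)
with_preemption = small_job_reduce_start(big_maps, 5, 15, reducer_preemption=True)
```

With these made-up numbers the small job's reducer starts at minute 115 without reducer preemption and at minute 15 with it.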

      If I have a big job and a lot of small ones, I don't want newly arriving small jobs to kill running map tasks of the big job in order to get a slot.

      I think it would be useful if the small job could kill a reducer task (and only a reducer task), so that it can finish before the big job has finished all its map tasks and one reducer.

      The rule could be: a job that has finished all its maps and is waiting for a reducer slot can kill reducer tasks of a job that still has map tasks running (assuming those reducers are just waiting to copy and sort).
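
The rule could be expressed as a simple predicate. This is only a sketch of the idea: `JobState`, its fields, and `may_preempt_reducer` are hypothetical names, not part of the fair scheduler's API.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    """Hypothetical snapshot of a job's tasks (not a Hadoop class)."""
    name: str
    pending_maps: int      # map tasks not yet started
    running_maps: int      # map tasks currently holding a map slot
    running_reduces: int   # reduce tasks currently holding a reducer slot
    pending_reduces: int   # reduce tasks waiting for a reducer slot


def maps_finished(job: JobState) -> bool:
    """True when the job has no map task left to start or to finish."""
    return job.pending_maps == 0 and job.running_maps == 0


def may_preempt_reducer(starved: JobState, victim: JobState) -> bool:
    """Proposed rule: only reducers are ever killed, never map tasks."""
    # The starved job has finished all its maps and waits for a reducer slot.
    starved_ready = maps_finished(starved) and starved.pending_reduces > 0
    # The victim still has maps running, so its reducers are assumed to be
    # only copying and sorting, and it holds at least one reducer slot.
    victim_blocking = victim.running_maps > 0 and victim.running_reduces > 0
    return starved_ready and victim_blocking


# The scenario from this issue, after the small job's map has finished.
small = JobState("small", pending_maps=0, running_maps=0,
                 running_reduces=0, pending_reduces=1)
big = JobState("big", pending_maps=0, running_maps=9,
               running_reduces=10, pending_reduces=0)
```

Here `may_preempt_reducer(small, big)` is true, while the reverse call is false, so the big job could never use this rule to kill the small job's tasks.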




            Assignee: Unassigned
            Reporter: Damien Hardy (dam_ned)