Slow-start of the bloom filter vertex is a scheduling problem: it causes more pre-emption than is useful.
When the bloom filters are arranged as follows:
Map 1 (10 tasks) -> Reducer 2 (1 task) -> Map 3 (100 tasks)
Map 1 and Map 3 are both active immediately, since Reducer 2 -> Map 3 is a broadcast edge.
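Once 3 of the 10 tasks in Map 1 finish, slow-start schedules Reducer 2, and the engine kills one active task from Map 3 to make room for it, even though Reducer 2 cannot complete until the remaining 7 Map 1 tasks finish.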
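
The sequence above can be sketched as a toy model. This is not the engine's scheduler, only an illustration under stated assumptions: the slow-start point is taken to be ceil(fraction x source tasks), the fraction 0.25 is a hypothetical value chosen because it yields the 3 tasks mentioned above, every Map 1 task gets a slot up front, and no Map 3 task finishes during the window. The names `slow_start_threshold`, `simulate` and `total_slots` are made up for this example.

```python
import math


def slow_start_threshold(num_source_tasks, min_src_fraction):
    # Assumption: the consumer vertex is scheduled once this many
    # source tasks have finished (ceil of fraction * task count).
    return math.ceil(num_source_tasks * min_src_fraction)


def simulate(total_slots, map1_tasks=10, map3_tasks=100, min_src_fraction=0.25):
    # Toy model only: the sketch assumes all Map 1 tasks start at once.
    if total_slots < map1_tasks:
        raise ValueError("sketch assumes every Map 1 task gets a slot up front")

    threshold = slow_start_threshold(map1_tasks, min_src_fraction)

    # Map 1 and Map 3 start together; Map 3 fills whatever slots are left.
    map1_running = map1_tasks
    map3_running = min(map3_tasks, total_slots - map1_running)
    map3_pending = map3_tasks - map3_running

    # Map 1 tasks finish one by one until the slow-start point is reached.
    for _ in range(threshold):
        map1_running -= 1
        if map3_pending > 0:
            # A waiting Map 3 task backfills the freed slot.
            map3_pending -= 1
            map3_running += 1

    # Reducer 2 now asks for a slot; with none free, a Map 3 task is killed.
    free_slots = total_slots - map1_running - map3_running
    preempted = 0
    if free_slots == 0:
        map3_running -= 1
        map3_pending += 1
        preempted = 1

    return {
        "map1_finished": threshold,
        "map1_still_running": map1_running,
        "map3_preempted": preempted,
    }


if __name__ == "__main__":
    print(simulate(total_slots=50))
    print(simulate(total_slots=200))
```

With a hypothetical 50-slot cluster the model reports one Map 3 task killed while 7 Map 1 tasks are still running, so Reducer 2 holds its slot without being able to finish. With 200 slots there is spare capacity and nothing is pre-empted.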