  Hadoop Map/Reduce · MAPREDUCE-6302

Preempt reducers after a configurable timeout irrespective of headroom

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels: None

      Description

      I submitted a big job with 500 maps and 350 reduces to a queue (FairScheduler) with a maximum of 300 cores. Once the maps reached 100%, 300 of the reduces had occupied all 300 cores of the queue. Then a map failed and was retried, waiting for a core, while the 300 running reduces were waiting for the failed map to finish, so a deadlock occurred. As a result, the job was blocked, and later jobs in the queue could not run because no cores were available in the queue.
      I think there is a similar issue for the memory limit of a queue.
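
      For illustration, a minimal sketch (using the numbers from the description and assuming one vcore per task; not code from the job itself) of why no core is left for the retried map attempt:

          // Minimal illustration of the reported scheduling deadlock.
          public class QueueDeadlockSketch {
            public static void main(String[] args) {
              int queueMaxCores = 300;    // FairScheduler max vcores for the queue
              int runningReducers = 300;  // reducers already holding containers
              int pendingMaps = 1;        // the failed map attempt waiting to be rescheduled

              int freeCores = queueMaxCores - runningReducers;  // 0
              boolean mapCanBeScheduled = freeCores >= 1;       // false
              boolean reducersCanFinish = pendingMaps == 0;     // false: they wait on the map's output

              // Neither side can make progress: the deadlock described above.
              System.out.println("map schedulable: " + mapCanBeScheduled
                  + ", reducers can finish: " + reducersCanFinish);
            }
          }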

      Attachments

      1. AM_log_head100000.txt.gz
        830 kB
        mai shurong
      2. AM_log_tail100000.txt.gz
        946 kB
        mai shurong
      3. log.txt
        13 kB
        Benjamin Tortorelli
      4. MAPREDUCE-6302.branch-2.6.0001.patch
        22 kB
        Wangda Tan
      5. MAPREDUCE-6302.branch-2.7.0001.patch
        22 kB
        Wangda Tan
      6. mr-6302_branch-2.patch
        24 kB
        Karthik Kambatla
      7. mr-6302-1.patch
        20 kB
        Karthik Kambatla
      8. mr-6302-2.patch
        22 kB
        Karthik Kambatla
      9. mr-6302-3.patch
        22 kB
        Karthik Kambatla
      10. mr-6302-4.patch
        23 kB
        Karthik Kambatla
      11. mr-6302-5.patch
        23 kB
        Karthik Kambatla
      12. mr-6302-6.patch
        23 kB
        Karthik Kambatla
      13. mr-6302-7.patch
        23 kB
        Karthik Kambatla
      14. mr-6302-prelim.patch
        8 kB
        Karthik Kambatla
      15. queue_with_max163cores.png
        14 kB
        mai shurong
      16. queue_with_max263cores.png
        15 kB
        mai shurong
      17. queue_with_max333cores.png
        13 kB
        mai shurong

        Issue Links

          Activity

          rohithsharma Rohith Sharma K S added a comment -

          Would you mind attaching AM logs?

          rohithsharma Rohith Sharma K S added a comment -

          And then, a map fails and retry, waiting for a core, while the 300 reduces are waiting for failed map to finish

          When there are any failed maps and the reducers have occupied all the resources, then ideally reducer preemption should be triggered. The AM logs would give some information about the problem.

          shurong.mai mai shurong added a comment -

          There are ways to work around it: temporarily increase the max cores of the queue, or kill the job. But the problem is a bug.

          rohithsharma Rohith Sharma K S added a comment -

          I suspect this scenario is the same as YARN-1680. Could you please confirm?

          shurong.mai mai shurong added a comment -

          In YARN-1680, there are only 4 NodeManagers in the cluster, so it is possible that all 4 NodeManagers are in the blacklist. But in my case, there are more than 50 NodeManagers and over 1000 vcores in the cluster, so it is very unlikely that all the NodeManagers are blacklisted.

          shurong.mai mai shurong added a comment -

          The job ran two days ago and the AM logs have since been deleted. I will try to reproduce the problem and attach the AM log.

          shurong.mai mai shurong added a comment -

          The AM log repeatedly prints entries such as the following:

          2015-03-30 15:06:47,404 INFO [IPC Server handler 21 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000128_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,404 INFO [IPC Server handler 8 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000230_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,409 INFO [IPC Server handler 17 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1427451307913_0005_r_000162_0 is : 0.33234128
          2015-03-30 15:06:47,409 INFO [IPC Server handler 7 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000057_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,409 INFO [IPC Server handler 7 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1427451307913_0005_r_000190_0 is : 0.33234128
          2015-03-30 15:06:47,409 INFO [IPC Server handler 18 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000200_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,413 INFO [IPC Server handler 27 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000195_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,415 INFO [IPC Server handler 16 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1427451307913_0005_r_000134_0 is : 0.33234128
          2015-03-30 15:06:47,424 INFO [IPC Server handler 29 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000159_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,432 INFO [IPC Server handler 4 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000062_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,432 INFO [IPC Server handler 14 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000172_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,438 INFO [IPC Server handler 5 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000229_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,442 INFO [IPC Server handler 3 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000184_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,442 INFO [IPC Server handler 6 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000153_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,450 INFO [IPC Server handler 26 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1427451307913_0005_r_000214_0 is : 0.33234128
          2015-03-30 15:06:47,457 INFO [IPC Server handler 12 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000058_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,466 INFO [IPC Server handler 0 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000160_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,466 INFO [IPC Server handler 19 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000156_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,466 INFO [IPC Server handler 2 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1427451307913_0005_r_000149_0. startIndex 338 maxEvents 5172
          2015-03-30 15:06:47,466 INFO [IPC Server handler 23 on 38002] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1427451307913_0005_r_000185_0 is : 0.33234128
          ...
          ...

          rchiang Ray Chiang added a comment -

          There probably is a bug here, but what value do you have for the property:

          mapreduce.job.reduce.slowstart.completedmaps

          in mapred-default.xml? If it's close to 0.0, I'd possibly suggest increasing it closer to 1.0 in order to keep the number of pending reducers down. This will likely have a performance hit, but should at least allow your job to complete.
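
          For reference, this property can also be set per job; below is a minimal sketch (assuming a standard MapReduce driver; the job name is a placeholder) of raising it so reducers only launch once the maps are done:

            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.mapreduce.Job;

            public class SlowstartConfigSketch {
              public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                // Delay reducer launch until all maps have finished, which keeps
                // pending reducers from occupying the queue's cores too early.
                conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 1.0f);
                Job job = Job.getInstance(conf, "example-job");
                // ... set mapper/reducer classes and input/output paths, then submit.
              }
            }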

          shurong.mai mai shurong added a comment -

          mapreduce.job.reduce.slowstart.completedmaps is 0.5

          rohithsharma Rohith Sharma K S added a comment -

          there are only 4 NodeManagers in cluster, so it is possible all 4 NodeManagers are in the blacklist

          In YARN-1680, not all the NMs were blacklisted; only one NM was. This scenario can happen in a larger cluster as well. I have observed a similar issue on a 25-node cluster too.
          The reason I suspect this is the same as YARN-1680: in your cluster, 300 reducers are running and occupy 300 cores, which means there is no room to run mappers. At this moment, if any reducer does not get a mapper's output (for any reason), the mapper is marked as failed and the node is blacklisted. Blacklisted nodes still have some resources that could run containers. In MR, reducer preemption is decided on several factors, one of which is headroom, but the RM sends a headroom that includes blacklisted nodes, which causes MR not to trigger reducer preemption. I am only suspecting; there could also be a real hidden bug. If you provide the full AM logs, I can help analyze whether it is the same as YARN-1680 or not.
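
          An illustrative sketch of the suspected effect (made-up numbers, not values from this job): the headroom reported by the RM still counts blacklisted nodes, so the AM believes a map can fit and never triggers reducer preemption:

            import org.apache.hadoop.yarn.api.records.Resource;

            public class StaleHeadroomSketch {
              public static void main(String[] args) {
                // Headroom as reported by the RM, including capacity on blacklisted nodes.
                Resource headroom = Resource.newInstance(8192, 8);
                // Resources a single map container asks for.
                Resource mapAsk = Resource.newInstance(2048, 1);

                boolean amThinksMapFits = mapAsk.getMemory() <= headroom.getMemory()
                    && mapAsk.getVirtualCores() <= headroom.getVirtualCores();   // true
                boolean usableNodesHaveRoom = false;  // non-blacklisted nodes are full of reducers

                // The AM sees headroom and skips preemption, yet the map can never be placed.
                System.out.println("AM thinks map fits: " + amThinksMapFits
                    + ", cluster can actually place it: " + usableNodesHaveRoom);
              }
            }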

          shurong.mai mai shurong added a comment -

          attachment

          shurong.mai mai shurong added a comment -

          The head 100000 lines and the tail 100000 lines of the AM log of a deadlocked job.

          shurong.mai mai shurong added a comment -

          queue_with_max163cores.png : submit a job to a queue with max 163 cores
          queue_with_max263cores.png : submit a job to a queue with max 263 cores
          queue_with_max333cores.png : submit a job to a queue with max 333 cores

          shurong.mai mai shurong added a comment -

          I found a new case today. I submitted an even larger job, with 5800 maps and 380 reduces, to a queue with a max of 263 cores. Even though no map failed, a deadlock between map and reduce core allocation occurred every time over several attempts. I also tried submitting to other queues; as long as the number of reduces of a job exceeds the max cores of the queue, the deadlock always happened.
          I attach screenshots of the deadlocked jobs, and the head 100000 lines (AM_log_head100000.txt.gz) and tail 100000 lines (AM_log_tail100000.txt.gz) of the AM log of one deadlocked job.

          The parameter mapreduce.job.reduce.slowstart.completedmaps is 0.5.

          leftnoteasy Wangda Tan added a comment -

          Moved to mapreduce. And mai shurong, could you confirm the Hadoop version you're currently using?

          shurong.mai mai shurong added a comment -

          The version is hadoop-2.6.0

          shurong.mai mai shurong added a comment -

          When I set the parameter mapreduce.job.reduce.slowstart.completedmaps to 0.5, jobs always ran successfully and the deadlock did not happen any more.

          shurong.mai mai shurong added a comment -

          Sorry, I made a clerical mistake in the previous comment: the mapreduce.job.reduce.slowstart.completedmaps value is 1.0, not 0.5.
          When I set the parameter mapreduce.job.reduce.slowstart.completedmaps to 1.0, jobs always ran successfully and the deadlock did not happen any more.

          kasha Karthik Kambatla added a comment -

          We have seen this issue too.

          kasha Karthik Kambatla added a comment -

          I suspect this is a FairScheduler issue with the Fifo or Fairshare policies. mai shurong - what policy are you using for your queues? Can you try using DRF and verify whether the problem persists?

          kasha Karthik Kambatla added a comment -

          Filed YARN-3485 to fix the FairScheduler issue. In addition to that fix, I wonder if we should improve the MapReduce side behavior as well.

          MAPREDUCE-5844 adds the notion of "hanging" requests to kickstart preemption, but it appears that it kicks in only when the headroom doesn't show enough resources to run containers. How about generalizing this to preempt containers in cases where there appears to be headroom, but the scheduler is unable to hand them to the app for some reason? In other words, I guess I am proposing MR use the headroom from YARN more as a heuristic than an absolute guarantee. MR should use the resources given to it in the best possible way it can.
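
          A minimal sketch of the proposed behavior (hypothetical names, not the actual RMContainerAllocator code): once a map request has gone unsatisfied longer than a configurable timeout, preempt reducers regardless of what the reported headroom says:

            // Hypothetical sketch of "preempt reducers after a timeout irrespective of headroom".
            public class UnconditionalPreemptionSketch {
              private final long preemptDelayMs;        // e.g. 5 * 60 * 1000; <= 0 could mean "disabled"
              private long earliestUnsatisfiedMapMs = -1;

              public UnconditionalPreemptionSketch(long preemptDelayMs) {
                this.preemptDelayMs = preemptDelayMs;
              }

              /** Called on every allocator heartbeat with the current scheduling state. */
              public boolean shouldPreemptReducers(int pendingMaps, int runningReducers, long nowMs) {
                if (pendingMaps == 0 || runningReducers == 0 || preemptDelayMs <= 0) {
                  earliestUnsatisfiedMapMs = -1;         // nothing is starving, reset the clock
                  return false;
                }
                if (earliestUnsatisfiedMapMs < 0) {
                  earliestUnsatisfiedMapMs = nowMs;      // start timing the unsatisfied map request
                }
                // Headroom is deliberately ignored: after the timeout, free space by force.
                return nowMs - earliestUnsatisfiedMapMs >= preemptDelayMs;
              }
            }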

          leftnoteasy Wangda Tan added a comment -

          In other words, I guess I am proposing MR use the headroom from YARN more as a heuristic than an absolute guarantee. MR should use the resources given to it in the best possible way it can.

          +1 to making the headroom more of a heuristic; actually it can only be a "heuristic", since the YARN RM cannot precisely know an app's headroom in most cases. Treating it as a heuristic can avoid a lot of deadlocks between mappers and reducers like this one.

          peng.zhang Peng Zhang added a comment -

          The head log shows:
          Number of reduces for job job_1427451307913_0012 = 380

          The tail log shows:
          headroom=<memory:333824, vCores:661> and totalResourceLimit:<memory:667648, vCores:824>

          This seems to conflict with the statement "as long as reduces of a job is more than max cores of the queue, deadlock always happened."

          kasha Karthik Kambatla added a comment -

          Uploading preliminary patch that captures my proposal here.

          TestRMContainerAllocator fails today. Will fix that and add tests to verify the fix here.

          bent Benjamin Tortorelli added a comment -

          Excerpt from log file.

          bent Benjamin Tortorelli added a comment -

          We're seeing this issue as well, although our job is map-only. Some runs seem to hang and have to be killed; others just take a very long time to complete. This occurs with varying numbers of workers and amounts of memory. The YARN logs for the job always show one worker with an extremely large log file compared to the other workers (50 MB vs 500 KB).

          shurong.mai mai shurong added a comment -

          I used "fair" policy for my queues.

          leftnoteasy Wangda Tan added a comment -

          Karthik Kambatla,
          Thanks for working on this.

          I just took a look at your patch; the overall approach looks good. Some comments about the configuration:
          MR_JOB_REDUCER_FORCE_PREEMPT_DELAY_SEC
          It is actually not a REDUCER_FORCE_PREEMPT_DELAY; it is the timeout on mapper allocation after which reducer preemption starts. I suggest renaming it to mapreduce.job.mapper.timeout-to-start-reducer-preemption-ms. I think it's better to use ms instead of sec for finer control.

          In addition, do you think we should add a value that lets the user disable this? For example, -1.

          And could you add some tests?
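
          A small sketch of how the disable value could work when the setting is read (the key and default below are placeholders chosen only to illustrate the -1 convention, not the final names):

            import org.apache.hadoop.conf.Configuration;

            public class PreemptDelayConfigSketch {
              // Placeholder key/default for illustration; the real name is settled later in this thread.
              static final String KEY = "mapreduce.job.reducer.preempt.delay.sec";
              static final long DEFAULT_DELAY_SEC = 300;   // 5 minutes

              /** Returns the delay in seconds, or -1 if forced preemption is disabled. */
              static long readDelaySec(Configuration conf) {
                long delay = conf.getLong(KEY, DEFAULT_DELAY_SEC);
                return delay > 0 ? delay : -1;   // any non-positive value disables it
              }
            }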

          leftnoteasy Wangda Tan added a comment -

          Linked to YARN-1680, one is for more accurate calculation, one is to prevent inaccurate calculation.

          zxu zhihai xu added a comment -

          Also linked YARN-3446 for discussion: YARN-1680 is for the CapacityScheduler and YARN-3446 is for the FairScheduler. It would be good to merge the common functionality into AbstractYarnScheduler.

          leftnoteasy Wangda Tan added a comment -

          Marked its target version as 2.8.0.

          cwelch Craig Welch added a comment -

          Karthik Kambatla, this looks good to me overall; it would be great to have this to avoid such deadlocks, at least as a last resort. I agree with Tan, Wangda that it would be good to be able to disable this with a value like -1, but I think your setting the default to not be disabled (looks like you have 5 minutes) is the way to go - by default I think it is best to have this active. I actually think seconds is the right granularity here; the time scale of relevance for this sort of activity is really not in the millisecond range. I also think the naming you have for the configuration option is fine; especially in contrast to the other (existing) delay, it makes sense.

          Do you think you could add a test or two?

          kasha Karthik Kambatla added a comment -

          I haven't had a chance to follow up on this. If anyone else wants to follow up, please feel free. I'll try to pick this up late next week if no one else gets to it.

          rohithsharma Rohith Sharma K S added a comment -

          Any update on this issue?

          kasha Karthik Kambatla added a comment -

          Have a couple of hours this morning. Will take a stab at the test. Anubhav Dhoot - hope you haven't started work on it yet.

          kasha Karthik Kambatla added a comment -

          Updated patch includes a test to verify the fix. I have made a few other changes to make the related surrounding code a little simpler.

          kasha Karthik Kambatla added a comment -

          It has been so long since I posted patches that I forgot to "submit patch".

          kasha Karthik Kambatla added a comment -

          Forgot to update mapred-default. The updated patch adds documentation for the new property and updates the documentation for the previous property.

          adhoot Anubhav Dhoot added a comment -

          The patch looks mostly good.

          Why does availableResourceForMap not consider assignedRequests.maps after the patch?

          The earlier comments had some more description that would be useful to preserve, maybe as a heading for both sets of values describing when preemption kicks in. For example, the earlier description: "The threshold in terms of seconds after which an unsatisfied mapper request triggers reducer preemption to free space."

          Would UNCONDITIONAL be better than FORCE, since it's not as if the other one is an optional preemption when it kicks in?
          Consider:
          reverting duration -> allocationDelayThresholdMs
          forcePreemptThreshold -> forcePreemptThresholdSec
          reducerPreemptionHoldMs -> reducerNoHeadroomPreemptionMs

          resourceLimit in Allocation is a weird name for the headroom. Consider another JIRA for fixing that.

          kasha Karthik Kambatla added a comment -

          In the previous version, availableResourceForMap was calculated as resourceLimit (= headroom + map_resources + reduce_resources) - map_resources - reduce_resources + reduce_resources_being_preempted. That was unnecessary; this patch updates it to headroom + reduce_resources_being_preempted.
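
          Expressed as a sketch with YARN's Resources helpers (illustrative method and variable names only), the map and reduce terms simply cancel out:

            import org.apache.hadoop.yarn.api.records.Resource;
            import org.apache.hadoop.yarn.util.resource.Resources;

            public class AvailableForMapsSketch {
              // Old: (headroom + maps + reduces) - maps - reduces + preemptingReduces
              static Resource oldAvailableForMaps(Resource headroom, Resource maps,
                  Resource reduces, Resource preemptingReduces) {
                Resource limit = Resources.add(Resources.add(headroom, maps), reduces);
                return Resources.add(
                    Resources.subtract(Resources.subtract(limit, maps), reduces),
                    preemptingReduces);
              }

              // New: simply headroom + preemptingReduces.
              static Resource newAvailableForMaps(Resource headroom, Resource preemptingReduces) {
                return Resources.add(headroom, preemptingReduces);
              }
            }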

          Would UNCONDITIONAL be better than FORCE, because its not like the other one is an optional preemption when it kicks in?

          Yep, unconditional is more descriptive. Will update the patch.

          kasha Karthik Kambatla added a comment -

          Thanks Anubhav Dhoot for the review. Addressed most of the feedback, except:

          1. Earlier comments - intentionally left them out. I checked them again, and they seem to cause more confusion than they add information.
          2. resourceLimit in Allocation is indeed weird, but changing it would be an incompatible change, so I am not creating a JIRA yet.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 23m 22s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 11m 52s There were no new javac warning messages.
          +1 javadoc 10m 53s There were no new javadoc warning messages.
          -1 release audit 0m 15s The applied patch generated 1 release audit warnings.
          -1 checkstyle 2m 0s The applied patch generated 2 new checkstyle issues (total was 517, now 518).
          -1 checkstyle 2m 24s The applied patch generated 1 new checkstyle issues (total was 12, now 12).
          -1 whitespace 0m 2s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 30s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 4m 5s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 42s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 1m 48s Tests passed in hadoop-mapreduce-client-core.
          +1 yarn tests 58m 43s Tests passed in hadoop-yarn-server-resourcemanager.
              125m 14s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12764478/mr-6302-3.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 6c17d31
          Release Audit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6044/console

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          Discussed this with Anubhav Dhoot offline. My latest patch (v3) would lead to preempting reducers even if there are running mappers. The surrounding code looks more complicated than it should be.

          kasha Karthik Kambatla added a comment -

          Here is a patch that greatly simplifies preemptReducesIfNecessary.

          Anubhav Dhoot, Jason Lowe - would like to hear your thoughts on whether an extensive change like this is reasonable.

          jlowe Jason Lowe added a comment -

          I think it's reasonable. There were a number of separate bugs in this area because it was complicated; it would be nice to see it simplified and easier to understand.

          Do we really want to avoid any kind of preemption if there's a map running? Think of a case where a node failure causes 20 maps to line up for scheduling due to fetch failures and we only have one map running. Do we really want to feed those 20 maps through that one map slot and hope they don't run very long? I haven't studied what the original code did in this case, but I noticed it did not early-out if maps were running, hence the question. I think the preemption logic could also benefit from knowing whether reducers have reported being past the SHUFFLE phase, and exempting those from preemption. It seems we would want to preempt as many reducers in the SHUFFLE phase as necessary to run most or all pending maps in parallel if possible, to minimize job latency in most cases.
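
          A sketch of the shuffle-phase exemption idea (hypothetical types standing in for the AM's bookkeeping, not the actual RMContainerAllocator structures):

            import java.util.ArrayList;
            import java.util.List;

            public class ShuffleAwarePreemptionSketch {
              enum ReducePhase { SHUFFLE, MERGE, REDUCE }

              static class RunningReducer {
                final String attemptId;
                final ReducePhase phase;
                RunningReducer(String attemptId, ReducePhase phase) {
                  this.attemptId = attemptId;
                  this.phase = phase;
                }
              }

              /** Pick only reducers still shuffling as victims, at most one per pending map. */
              static List<RunningReducer> chooseVictims(List<RunningReducer> running, int pendingMaps) {
                List<RunningReducer> victims = new ArrayList<>();
                for (RunningReducer r : running) {
                  if (victims.size() >= pendingMaps) {
                    break;
                  }
                  if (r.phase == ReducePhase.SHUFFLE) {  // reducers past SHUFFLE can finish on their own
                    victims.add(r);
                  }
                }
                return victims;
              }
            }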

          Other minor comments on the patch:

          • docs for mapreduce.job.reducer.unconditional-preempt.delay.sec should be clear on how to disable the functionality if desired, since setting it to zero does some pretty bad things.
          • preemtping s/b preempting
          kasha Karthik Kambatla added a comment -

          Thanks for taking a look so quickly, Jason.

          Do we really want to avoid any kind of preemption if there's a map running?

          Fair question. Anubhav had the same comment as well. The other thing to consider here is slowstart: with slowstart set to a low value (say 0.5), reducers shouldn't be preempted unless more than half the mappers are pending to run. We could factor slowstart into the calculations here. We need to decide whether it is worth the additional complication, given we are just trying to avoid a deadlock here. Maybe file a follow-up and work on it there? When looking at this code, I noticed a few other things that could be simplified/fixed, e.g. preemptReducer in my patches.
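
          A one-method sketch of how slowstart could be factored in (illustrative only; it assumes slowstart is the fraction of maps that must complete before reducers may start):

            public class SlowstartAwarePreemptionSketch {
              // With slowstart = 0.5, reducers were allowed to start once half the maps finished,
              // so preemption is arguably justified only when more than half the maps are pending again.
              static boolean preemptionJustified(int pendingMaps, int totalMaps, float slowstart) {
                return pendingMaps > (1.0f - slowstart) * totalMaps;
              }
            }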

          Will address your comments on the patch once we decide on how to proceed on the above discussion.

          kasha Karthik Kambatla added a comment -

          I might have forgotten the specifics. Aren't all reducers in SHUFFLE phase until all the mappers are done?

          jlowe Jason Lowe added a comment -

          Aren't all reducers in SHUFFLE phase until all the mappers are done?

          No, here's an example scenario:

          1. All maps complete, all reducers scheduled and some (or all) started
          2. Some of the reducers, but not all, finish shuffling and proceed to the MERGE or REDUCE phases
          3. Node with some map outputs goes down
          4. Remaining reducers in the SHUFFLE phase or not assigned cannot complete, maps get retroactively failed for fetch failures, need to launch new map attempts
          5. At this point we do not want to kill any reducers past the SHUFFLE phase as they can progress and complete without further interactions for map outputs
          adhoot Anubhav Dhoot added a comment -

          In the old code we do not preempt if either the headroom or the already-assigned maps are enough to run a mapper, so the early-out is consistent with the old preemption. But the new preemption does not have to have the same conditions.
          Since we are using it as a way to get out of deadlocks, I would think preempting irrespective of how many mappers are running is
          (a) safer and simpler to reason about, since it is purely time based - we do not have to second-guess whether we are missing some other cause of deadlock apart from incorrect headroom, and
          (b) better in terms of overall throughput for cases like the one Jason mentioned.
          Having a large timeout is the safety lever for controlling the aggressiveness of the preemption.
          Factoring in slowstart in a subsequent JIRA seems like a good idea to me. I can think of reasons not to factor it in and to leave it only as a heuristic for starting reducers.
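
          To make the purely time-based idea concrete, a minimal sketch of such a trigger; the field and method names are made up for the example, and this is not the committed code.

          // Sketch of an unconditional, purely time-based preemption trigger.
          // A negative delay disables it; names are illustrative only.
          public class UnconditionalPreemptTrigger {
            private final long delayMs;            // how long maps may starve before we preempt
            private long mapsStarvedSinceMs = -1;  // -1 means no map is currently waiting

            UnconditionalPreemptTrigger(long delayMs) { this.delayMs = delayMs; }

            void onHeartbeat(int pendingMaps, long nowMs) {
              if (pendingMaps == 0) {
                mapsStarvedSinceMs = -1;           // nothing starving, reset the clock
              } else if (mapsStarvedSinceMs < 0) {
                mapsStarvedSinceMs = nowMs;        // first heartbeat that saw a starving map
              }
            }

            boolean shouldPreemptNow(long nowMs) {
              return delayMs >= 0                  // negative delay disables the feature
                  && mapsStarvedSinceMs >= 0
                  && nowMs - mapsStarvedSinceMs >= delayMs;
            }

            public static void main(String[] args) {
              UnconditionalPreemptTrigger t = new UnconditionalPreemptTrigger(5_000);
              t.onHeartbeat(1, 0);
              System.out.println(t.shouldPreemptNow(1_000)); // false, still within the delay
              System.out.println(t.shouldPreemptNow(6_000)); // true, maps have starved too long
            }
          }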

          kasha Karthik Kambatla added a comment -

          Preempting reducers to run mappers doesn't always lead to higher throughput. The reducer being preempted might have to spend more time to re-copy map outputs from every mapper than the mappers in question take to run. I understand that it will likely make sense for the vast majority of cases.

          I propose we do the following:

          1. In this JIRA, let us just fix starvation. Stick to the logic of preempting enough resources to run one mapper.
          2. In follow-up JIRA(s), let us improve this preemption to
            1. preempt reducers until we are able to meet the slowstart threshold
            2. prioritize preempting reducers that are still in SHUFFLE phase as Jason mentioned
            3. add an option to not preempt reducers that are past SHUFFLE phase irrespective of slowstart as long as one mapper can run
          jlowe Jason Lowe added a comment -

          Since the old code also doesn't preempt if there's room for one map, I'm OK with the current logic; I just didn't want a regression. As for SHUFFLE-phase awareness, I agree that's best left for a follow-up JIRA.

          kasha Karthik Kambatla added a comment -

          v5 patch addresses Jason's patch-specific comments, and has a little more code reuse.

          kasha Karthik Kambatla added a comment -

          Filed MAPREDUCE-6501 to track the follow-up work.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 31m 18s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 46s There were no new javac warning messages.
          +1 javadoc 10m 51s There were no new javadoc warning messages.
          -1 release audit 0m 15s The applied patch generated 1 release audit warnings.
          -1 checkstyle 2m 45s The applied patch generated 2 new checkstyle issues (total was 517, now 518).
          -1 checkstyle 3m 14s The applied patch generated 1 new checkstyle issues (total was 12, now 12).
          -1 whitespace 0m 3s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 38s mvn install still works.
          +1 eclipse:eclipse 0m 35s The patch built with eclipse:eclipse.
          +1 findbugs 4m 24s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 52s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 1m 51s Tests passed in hadoop-mapreduce-client-core.
          +1 yarn tests 59m 53s Tests passed in hadoop-yarn-server-resourcemanager.
              131m 45s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12764974/mr-6302-5.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 30e2f83
          Release Audit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6055/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          Should we make MR_JOB_REDUCER_UNCONDITIONAL_PREEMPT_DELAY_SEC and MR_JOB_REDUCER_PREEMPT_DELAY_SEC consistent in the way they treat negative values?
          Today MR_JOB_REDUCER_PREEMPT_DELAY_SEC treats negative values the same as zero, which does not allow you to turn it off, while the newly proposed MR_JOB_REDUCER_UNCONDITIONAL_PREEMPT_DELAY_SEC uses a negative value as a way to turn off preemption. The latter seems preferable, and since the default is zero and the doc does not mention negative values, I think it should be OK to change this behavior. Thoughts? (See the sketch below for the two interpretations.)

          It's better to reword
          // Duration to wait before forcibly preempting a reducer when there is room
          to
          // Duration to wait before forcibly preempting a reducer irrespective of whether there is room
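
          To illustrate the negative-value inconsistency discussed above, a small sketch; the property names are inferred from the constant names used in this thread, and the defaults and clamping shown are illustrative placeholders rather than the patch itself.

          import org.apache.hadoop.conf.Configuration;

          // Two interpretations of negative values, as discussed above. Keys are the
          // names used in this thread; defaults and clamping are illustrative.
          public class ReducerPreemptDelaySemantics {
            // Old config: a negative value is treated the same as zero (cannot disable).
            static long headroomPreemptDelaySec(Configuration conf) {
              return Math.max(conf.getLong("mapreduce.job.reducer.preempt.delay.sec", 0L), 0L);
            }

            // New config: any negative value turns unconditional preemption off.
            static boolean unconditionalPreemptEnabled(Configuration conf) {
              return conf.getLong("mapreduce.job.reducer.unconditional-preempt.delay.sec", -1L) >= 0;
            }
          }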

          kasha Karthik Kambatla added a comment -

          Should we make MR_JOB_REDUCER_UNCONDITIONAL_PREEMPT_DELAY_SEC and MR_JOB_REDUCER_PREEMPT_DELAY_SEC consistent in the way they treat negative values?

          Good point. Maybe I should rename the unconditional preemption config to use "timeout" instead of "delay". MR_JOB_REDUCER_PREEMPT_DELAY_SEC delays the preemption: a positive value means we wait that long before preempting. The config we are adding here is more of a timeout: if we don't get resources by this time, we preempt.

          That way, for both configs, a negative value would mean disable.

          kasha Karthik Kambatla added a comment -

          Maybe I should rename the unconditional preemption config to use "timeout" instead of "delay".

          I tried this out, but it doesn't read well. Sorry for vacillating on this, long day. Given the slight difference in nature of the two configs, I am fine with leaving the negative values inconsistent. I am willing to change the config name to something that better describes this. Suggestions are very welcome.

          kasha Karthik Kambatla added a comment -

          Patch that fixes a checkstyle issue and addresses the comment rewording Anubhav suggested.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 19m 15s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 56s There were no new javac warning messages.
          +1 javadoc 10m 19s There were no new javadoc warning messages.
          -1 release audit 0m 15s The applied patch generated 1 release audit warnings.
          -1 checkstyle 1m 34s The applied patch generated 2 new checkstyle issues (total was 517, now 518).
          -1 whitespace 0m 3s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 32s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 4m 1s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 42s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 1m 49s Tests passed in hadoop-mapreduce-client-core.
          +1 yarn tests 56m 13s Tests passed in hadoop-yarn-server-resourcemanager.
              113m 40s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12765298/mr-6302-6.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 1bca1bb
          Release Audit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6061/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          Looks like the whitespace error is genuine while the checkstyle and release audit can be ignored.

          MR_JOB_REDUCER_PREEMPT_DELAY_SEC delays the preemption: a positive value means we wait that long before preempting. The config we are adding here is more of a timeout: if we don't get resources by this time, we preempt.

          Don't both configs wait to get resources for the configured time, and skip preemption if resources arrive by then? Even MR_JOB_REDUCER_PREEMPT_DELAY_SEC will not preempt if the resources were obtained before the timeout. I am concerned we are introducing an inconsistency in this patch that will burden administrators. It would be good to at least update the doc comments in yarn-default to indicate the effect of negative values and zero for both configs.

          kasha Karthik Kambatla added a comment -

          I see your point.

          I agree that it would be nice for the two configs to be consistent; -1 could mean "disable the feature" for both. Unfortunately, one of the configs has already shipped in a release, and changing it would be backwards incompatible.

          Now, I see the following alternatives for the new config (a usage sketch for the first alternative follows below):

          1. what the latest patch does. -1 for disable, >=0 is the wait time before preemption
          2. the value provided is the wait time before preemption, and any negative values are interpreted as zero. If folks want to disable this, they will have to pass Long.MAX_VALUE.

          Personally, I find the first one simpler to use. I do see the inconsistency between two very similar but different configs.
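
          For what it's worth, under alternative 1 a per-job override could look roughly like this; a hypothetical usage example, where the key is the one named earlier in this thread and the values are only illustrative.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.mapreduce.Job;

          // Hypothetical per-job usage under alternative 1: -1 disables the feature,
          // any value >= 0 is the wait (in seconds) before unconditional preemption.
          public class JobWithUnconditionalPreemptTimeout {
            public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              conf.setLong("mapreduce.job.reducer.unconditional-preempt.delay.sec", 300); // preempt after 5 minutes
              // conf.setLong("mapreduce.job.reducer.unconditional-preempt.delay.sec", -1); // disable entirely
              Job job = Job.getInstance(conf, "unconditional-preempt-example");
              // ... configure mapper/reducer/input/output as usual, then job.waitForCompletion(true)
            }
          }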

          kasha Karthik Kambatla added a comment -

          Discussed with Anubhav offline. We decided to update the description for the old config here, and file a follow-up JIRA to make them consistent on trunk.

          kasha Karthik Kambatla added a comment -

          Filed MAPREDUCE-6506.

          adhoot Anubhav Dhoot added a comment -

          +1

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 19m 23s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 8m 3s There were no new javac warning messages.
          +1 javadoc 10m 20s There were no new javadoc warning messages.
          -1 release audit 0m 20s The applied patch generated 1 release audit warnings.
          -1 checkstyle 1m 33s The applied patch generated 2 new checkstyle issues (total was 517, now 518).
          -1 whitespace 0m 2s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 33s mvn install still works.
          +1 eclipse:eclipse 0m 35s The patch built with eclipse:eclipse.
          +1 findbugs 4m 5s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 43s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 1m 48s Tests passed in hadoop-mapreduce-client-core.
          +1 yarn tests 56m 20s Tests passed in hadoop-yarn-server-resourcemanager.
              114m 15s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12765709/mr-6302-7.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 8d22622
          Release Audit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6063/console

          This message was automatically generated.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8600 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8600/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          kasha Karthik Kambatla added a comment - - edited

          Just committed to trunk and branch-2. branch-2 had a conflict; the patch I committed there is attached. Verified that the trunk and branch-2 patches don't differ in code.

          Thanks, everyone, for providing input, and thanks to Jason and Anubhav for the reviews. Glad to resolve this long-standing issue.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #514 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/514/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2414 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2414/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2448 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2448/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1241 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1241/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #476 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/476/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #504 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/504/)
          MAPREDUCE-6302. Incorrect headroom can lead to a deadlock between map (kasha: rev 4aa9b3e75ca86917125e56e1b438668273a5d87f)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          sjlee0 Sangjin Lee added a comment -

          Karthik Kambatla, would this be a good candidate for 2.6.2 and 2.7.2? If so, do you mind committing this to branch-2.6 and branch-2.7 also?

          kasha Karthik Kambatla added a comment -

          I wasn't sure about that. Personally, I would like to let it bake through some tests before pulling it into maintenance releases.

          sjlee0 Sangjin Lee added a comment -

          Sure. I'm certain there will be subsequent releases in the 2.6.x and 2.7.x lines.

          Tagar Ruslan Dautkhanov added a comment -

          It would be great to have this backported to 2.6. We have seen many times that a single Hive job can self-deadlock because of this problem. Cloudera Support pointed us to MAPREDUCE-6302. Thanks!

          leftnoteasy Wangda Tan added a comment -

          +1 to backport this issue to 2.6.x and 2.7.x

          jooseong Jooseong Kim added a comment -

          I think this usually happens when the RM sends out an overestimated headroom.
          One thing we could do is to skip scheduleReduces() if we ended up preempting reducers through preemptReducesIfNeeded().
          Since the headroom is overestimated, scheduleReduces may schedule more reducers, which will need to be preempted again.
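
          A minimal sketch of the ordering being suggested, with a stand-in interface rather than the real RMContainerAllocator internals:

          // Sketch of the suggested ordering inside the allocator heartbeat:
          // if we just preempted reducers, skip ramping up more reducers this round.
          // All names here are illustrative, not the RMContainerAllocator internals.
          public class HeartbeatSketch {
            interface Allocator {
              boolean preemptReducesIfNeeded(); // returns true if any reducer was preempted
              void scheduleReduces();           // may ramp up reducers based on headroom
            }

            static void heartbeat(Allocator allocator) {
              boolean preempted = allocator.preemptReducesIfNeeded();
              if (!preempted) {
                // Only consider scheduling more reducers when nothing had to be preempted;
                // otherwise an overestimated headroom could re-schedule reducers we just killed.
                allocator.scheduleReduces();
              }
            }
          }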

          leftnoteasy Wangda Tan added a comment -

          Attached patches for branch-2.6/branch-2.7 for review, and added 2.6.3/2.7.3 to the target versions.

          kasha Karthik Kambatla added a comment -

          Thanks for posting the patches, Wangda. The updated patches appear to do what we do on branch-2. +1

          Tagar Ruslan Dautkhanov added a comment -

          Yep, +1 for the backport.

          Btw, we found that increasing mapreduce.job.reduce.slowstart.completedmaps to 0.9 (from the default of 0.8) decreases the chances of this bug showing up.
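
          For reference, that workaround is a per-job setting; setting it programmatically could look like this (values as reported above, shown only as an example):

          import org.apache.hadoop.conf.Configuration;

          // Example of the workaround mentioned above: raise the reduce slowstart
          // threshold so reducers launch later and are less likely to starve maps.
          public class RaiseSlowstartThreshold {
            public static void main(String[] args) {
              Configuration conf = new Configuration();
              conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.9f);
              System.out.println(conf.get("mapreduce.job.reduce.slowstart.completedmaps"));
            }
          }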

          leftnoteasy Wangda Tan added a comment -

          Update:
          We have been running jobs in a test cluster with this fix on branch-2.6/branch-2.7 for a few days and have not seen the deadlock issue come back. I will backport the patches to branch-2.6/branch-2.7 in a few days if there are no objections.

          Tagar Ruslan Dautkhanov added a comment -

          That would be great.

          memoryz Jason Wang added a comment -

          Was it backported?

          leftnoteasy Wangda Tan added a comment -

          Apologies, I forgot to backport the patches to the maintenance releases. Doing it now.

          leftnoteasy Wangda Tan added a comment -

          Done, committed to branch-2.6/2.7.

          memoryz Jason Wang added a comment -

          Thanks Wangda!

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of the 2.7.3 release.


            People

            • Assignee:
              kasha Karthik Kambatla
              Reporter:
              shurong.mai mai shurong
            • Votes:
              0
              Watchers:
              42
