Flink / FLINK-10868

Flink's JobCluster ResourceManager doesn't use maximum-failed-containers as a limit on resource acquisition


    Details

      Description

      Currently, YarnResourceManager does not use yarn.maximum-failed-containers as a limit on resource acquisition. In the worst case, when newly started containers consistently fail, YarnResourceManager enters an infinite resource acquisition loop without ever failing the job. Together with https://issues.apache.org/jira/browse/FLINK-10848, this can quickly occupy all resources of the YARN queue.
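
The missing check could look roughly like the sketch below. It is only an illustration of the idea, not Flink's actual code: the class `FailedContainerLimiter` and its methods are hypothetical, and both the "negative value means no limit" convention and the "fail once the count exceeds the limit" threshold are assumptions. The resource manager would count each failed container and stop requesting replacements (failing the job instead) once the configured yarn.maximum-failed-containers value is exceeded.

```java
/**
 * Hypothetical helper (not part of Flink's API) that tracks failed
 * containers against a limit such as yarn.maximum-failed-containers.
 */
final class FailedContainerLimiter {

    // Assumption: a negative limit means "no limit".
    private final int maxFailedContainers;
    private int failedContainers = 0;

    FailedContainerLimiter(int maxFailedContainers) {
        this.maxFailedContainers = maxFailedContainers;
    }

    /**
     * Records one failed container.
     *
     * @return true if a replacement container may still be requested,
     *         false if the limit is exceeded and the job should fail.
     */
    boolean onContainerFailed() {
        failedContainers++;
        return maxFailedContainers < 0 || failedContainers <= maxFailedContainers;
    }

    int getFailedContainers() {
        return failedContainers;
    }
}
```

With a limit of 2, the first two failures would still trigger a new container request, and the third would signal that the job should be failed rather than retried indefinitely.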


              People

              • Assignee:
                hpeter Zhenqiu Huang
                Reporter:
                ZhenqiuHuang Zhenqiu Huang
              • Votes:
                0
                Watchers:
                6


                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  0.5h