FLINK-5471: Properly inform JobClientActor about terminated Mesos framework

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Do
    • Affects Version/s: 1.2.0, 1.3.0
    • Fix Version/s: None
    • Component/s: Deployment / Mesos
    • Labels: None

    Description

      If the Mesos framework running Flink terminates (e.g. because it exceeded the allowed number of container restarts), the JobClientActor is not properly informed. As a consequence, the client terminates only after the JobClientActor detects that it has lost the connection to the JobManager (JobClientActorConnectionTimeoutException). The current default value for the timeout is 60s, which is a long time to wait before the connection loss is detected after a termination.

      I think it would be better to notify the JobClientActor explicitly, which would allow it to print a better message for the user and to react more quickly.
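
      To make the difference concrete, here is a minimal, self-contained sketch of the idea in plain Java. It is not Flink code and does not use Akka: the class, the `Outcome` enum, the `awaitTermination` method, and the use of a `CompletableFuture` as the "termination notice" are all hypothetical names chosen for illustration. It only shows a client that reacts immediately to an explicit notification and falls back to the connection timeout when no notification arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in Flink.
class TerminationNotificationSketch {

    /** How the client learned that the cluster is gone. */
    enum Outcome { FRAMEWORK_TERMINATED, CONNECTION_TIMEOUT }

    /**
     * Waits for an explicit termination notice, but no longer than the
     * connection timeout. Whichever happens first decides the outcome.
     */
    static Outcome awaitTermination(CompletableFuture<String> terminationNotice,
                                    long timeoutMillis) throws InterruptedException {
        try {
            // Returns as soon as the notice is completed by the notifying side.
            String reason = terminationNotice.get(timeoutMillis, TimeUnit.MILLISECONDS);
            System.out.println("Framework terminated: " + reason);
            return Outcome.FRAMEWORK_TERMINATED;
        } catch (TimeoutException e) {
            // No notice arrived: this mirrors the behaviour described in the issue,
            // where the client only learns about the failure via the timeout.
            System.out.println("Lost connection to JobManager after " + timeoutMillis + " ms");
            return Outcome.CONNECTION_TIMEOUT;
        } catch (ExecutionException e) {
            // The notice itself was completed exceptionally.
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Case 1: the client is notified, so it does not wait for the 60s timeout.
        CompletableFuture<String> notified = new CompletableFuture<>();
        notified.complete("exceeded number of container restarts");
        awaitTermination(notified, 60_000);

        // Case 2: nobody notifies the client, so it waits out the (shortened) timeout.
        CompletableFuture<String> silent = new CompletableFuture<>();
        awaitTermination(silent, 100);
    }
}
```

      In the first case the client can report the actual termination reason right away; in the second it can only report a generic connection loss once the timeout has elapsed.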

          People

            Assignee: Unassigned
            Reporter: Till Rohrmann (trohrmann)