  Flink / FLINK-14434

Dispatcher#createJobManagerRunner should not start JobManagerRunner

Details

    Description

      Consider an edge case in which

      1) the job finishes almost immediately, and
      2) the thread executing Dispatcher#startJobManagerRunner is suspended after jobManagerRunner.start(); but before return jobManagerRunner;

      This can go wrong because

      1) the future we put into jobManagerRunnerFutures completes only once #startJobManagerRunner has finished, and
      2) the creation of the JobManagerRunner doesn't happen in the MainThread.

      A possible execution order is

      1) the JobManagerRunner is created in an akka-dispatcher thread
      2) the same thread then applies Dispatcher#startJobManagerRunner
      3) it executes jobManagerRunner.start(); but has not yet reached return jobManagerRunner;
      4) this thread is suspended
      5) the job finishes and its callback is executed on the MainThread
      6) jobManagerRunnerFutures.get(jobID).getNow(null) returns null because the akka-dispatcher thread has not yet returned jobManagerRunner;
      7) the Dispatcher reports "There is a newer JobManagerRunner for the job" although there is none.
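
      The ordering above can be reproduced deterministically with plain CompletableFuture. This is only a minimal sketch, not the Dispatcher code: Runner, the "job" key, and the class and method names are hypothetical stand-ins, and the "job finished" callback is simulated by a lookup made inside the start step, before that step returns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class RunnerRaceSketch {
    /** Hypothetical stand-in for JobManagerRunner. */
    static final class Runner {}

    /** Returns true if the simulated completion callback observed null. */
    static boolean callbackSawNull() {
        // Stand-in for Dispatcher's jobManagerRunnerFutures map.
        Map<String, CompletableFuture<Runner>> jobManagerRunnerFutures = new HashMap<>();
        Runner[] seen = new Runner[1];

        CompletableFuture<Runner> created = new CompletableFuture<>();
        // The stored future is the result of the "start" step, as described in the issue.
        CompletableFuture<Runner> startedFuture = created.thenApply(runner -> {
            // runner.start() would run here. Assume the job finishes at once and its
            // callback looks up the future while this function has not returned yet.
            seen[0] = jobManagerRunnerFutures.get("job").getNow(null);
            return runner;
        });
        jobManagerRunnerFutures.put("job", startedFuture);

        // Completing the creation future runs the start step on this thread.
        created.complete(new Runner());
        return seen[0] == null;
    }

    public static void main(String[] args) {
        System.out.println("callback saw null: " + callbackSawNull());
    }
}
```

      Because the stored future only completes when the start step returns, any lookup made before that point sees null, which is what the callback then misreads as a newer JobManagerRunner.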

      *Solution*

      There are two options, and we can even apply both.

      1. return jobManagerRunnerFuture from #createJobManagerRunner and make #startJobManagerRunner a separate action
      2. once the JobManagerRunner is created, execute #startJobManagerRunner in the MainThread.
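
      A minimal sketch of how both options could be combined, assuming plain CompletableFuture rather than the actual Dispatcher code. Runner, the "job" key, and mainThreadExecutor (a same-thread executor standing in for the MainThread) are hypothetical names. The map holds the creation future, and starting is a separate action scheduled on the executor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

public class RunnerFixSketch {
    /** Hypothetical stand-in for JobManagerRunner. */
    static final class Runner {}

    /** Returns true if the simulated completion callback found its own runner. */
    static boolean callbackSawRunner() {
        Map<String, CompletableFuture<Runner>> jobManagerRunnerFutures = new HashMap<>();
        boolean[] matched = new boolean[1];
        // Assumption: a same-thread executor stands in for the MainThread executor.
        Executor mainThreadExecutor = Runnable::run;

        CompletableFuture<Runner> created = new CompletableFuture<>();
        // Option 1: register the creation future itself, not the result of starting.
        jobManagerRunnerFutures.put("job", created);
        // Option 2: run the start step as a separate action on the main thread executor.
        created.thenAcceptAsync(runner -> {
            // runner.start() would run here. Even if the job finishes at once,
            // the registered future is already complete when the callback looks it up.
            matched[0] = jobManagerRunnerFutures.get("job").getNow(null) == runner;
        }, mainThreadExecutor);

        created.complete(new Runner());
        return matched[0];
    }

    public static void main(String[] args) {
        System.out.println("callback saw its runner: " + callbackSawRunner());
    }
}
```

      With this wiring the registered future is completed before the start action runs, so the lookup in the callback no longer depends on when the start step returns.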

      CC trohrmann

      Attachments

        1. patch.diff
          2 kB
          Zili Chen

            People

              tison Zili Chen
              tison Zili Chen
              Votes: 0
              Watchers: 3

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m