Flink / FLINK-21053

Prevent potential RejectedExecutionExceptions in CheckpointCoordinator failing JM

Details

    Description

      In the past, multiple bugs were caused by throwing or handling RejectedExecutionException in CheckpointCoordinator (FLINK-18290, FLINK-20992).

       

      I think this is still possible, because there are many places where an executor is passed to CompletableFuture.xxxAsync calls even though it may already be shut down.
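
To illustrate the failure mode, here is a minimal standalone sketch using only the JDK. The class and method names (ShutDownExecutorDemo, isRejectedAfterShutdown) are made up for this example and are not Flink code. It shows that CompletableFuture.runAsync throws RejectedExecutionException directly in the calling thread when the executor uses the default abort policy and has already been shut down.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class, not part of Flink.
class ShutDownExecutorDemo {

    /** Returns true if submitting to an already shut-down executor is rejected. */
    static boolean isRejectedAfterShutdown() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown();
        try {
            // runAsync hands the task to executor.execute(...), which rejects it.
            CompletableFuture.runAsync(() -> {}, executor);
            return false;
        } catch (RejectedExecutionException e) {
            // The exception surfaces in the caller, not in the returned future.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + isRejectedAfterShutdown());
    }
}
```

If such a call sits on a code path that treats unexpected exceptions as fatal, a late callback after shutdown is enough to trigger the fatal handling.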

       

      In FLINK-20992 we discussed two approaches to fix this.

      One approach is to check the executor state inside a synchronized block every time the executor is used.
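
A rough sketch of this first approach, under the assumption that the component using the executor also controls its shutdown, so that both can share one lock. GuardedExecutor and tryExecute are hypothetical names chosen for illustration. They do not exist in Flink.

```java
import java.util.concurrent.ExecutorService;

// Hypothetical wrapper, not part of Flink.
class GuardedExecutor {
    private final Object lock = new Object();
    private final ExecutorService executor;

    GuardedExecutor(ExecutorService executor) {
        this.executor = executor;
    }

    /** Submits the task unless the executor is shut down; returns whether it was submitted. */
    boolean tryExecute(Runnable task) {
        synchronized (lock) {
            if (executor.isShutdown()) {
                return false; // skip instead of triggering a RejectedExecutionException
            }
            executor.execute(task);
            return true;
        }
    }

    /** Shuts down under the same lock, so the check and the submit above stay atomic. */
    void shutdown() {
        synchronized (lock) {
            executor.shutdown();
        }
    }
}
```

The guard only helps if every shutdown goes through the same lock. If the executor is owned and shut down elsewhere, the check and the submit can still race.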

      The second approach is to:

      1. Create the executors inside CheckpointCoordinator (both the I/O and timer thread pools).
      2. Check isShutdown() in their rejected-execution handlers: if the executor is shut down and the exception is a RejectedExecutionException, just log it; otherwise, delegate to FatalExitExceptionHandler.

      This would allow removing such RejectedExecutionException checks from the coordinator code.
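
The second approach could look roughly like the sketch below. It uses only JDK classes. ShutdownAwareRejectionHandler, newIoExecutor, newTimerExecutor and the counter are hypothetical names for this example. A plain Thread.UncaughtExceptionHandler stands in for Flink's FatalExitExceptionHandler, and System.err stands in for a logger. Pool sizes are arbitrary.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical handler, not part of Flink.
class ShutdownAwareRejectionHandler implements RejectedExecutionHandler {
    // Counts rejections that were tolerated because the pool was shut down.
    final AtomicInteger ignoredAfterShutdown = new AtomicInteger();
    // Stand-in for a fatal handler such as FatalExitExceptionHandler (assumption).
    private final Thread.UncaughtExceptionHandler fatalHandler;

    ShutdownAwareRejectionHandler(Thread.UncaughtExceptionHandler fatalHandler) {
        this.fatalHandler = fatalHandler;
    }

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (executor.isShutdown()) {
            // Expected during shutdown: log and drop the task.
            ignoredAfterShutdown.incrementAndGet();
            System.err.println("Ignoring task rejected after shutdown: " + task);
        } else {
            // Rejected by a running pool: treat as fatal.
            fatalHandler.uncaughtException(
                    Thread.currentThread(),
                    new RejectedExecutionException("Task rejected by running executor: " + task));
        }
    }

    /** I/O pool owned by the coordinator (size chosen arbitrarily for the sketch). */
    static ThreadPoolExecutor newIoExecutor(ShutdownAwareRejectionHandler handler) {
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), handler);
    }

    /** Timer pool owned by the coordinator, sharing the same rejection policy. */
    static ScheduledThreadPoolExecutor newTimerExecutor(ShutdownAwareRejectionHandler handler) {
        return new ScheduledThreadPoolExecutor(1, handler);
    }
}
```

One consequence worth keeping in mind: when the handler drops a task silently, a future returned by CompletableFuture.runAsync for that task never completes, so callers should not block on it after shutdown.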

       


            People

              Assignee: Unassigned
              Reporter: Roman Khachatryan
              Votes: 0
              Watchers: 1
