FLINK-5960

Make CheckpointCoordinator less blocking

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • 1.2.0, 1.3.0, 1.9.0, 1.10.0
    • None

    Description

      Currently, the CheckpointCoordinator performs its operations under a global lock. This includes writing checkpoint data out to a state storage. If that write blocks, all checkpointing stands still. I think we should rework the CheckpointCoordinator so that it makes fewer assumptions about external systems and tolerates write failures and timeouts. Furthermore, we should limit the scope of locking and avoid executing potentially blocking operations under the lock. This will improve the runtime behaviour of the CheckpointCoordinator.
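
The idea of limiting the lock scope can be illustrated with a minimal sketch. This is not Flink's CheckpointCoordinator code: the class NarrowLockSketch, the StateStore interface, and all method names are hypothetical and exist only for illustration. The sketch assumes the lock only needs to guard in-memory bookkeeping, so the potentially blocking write runs outside the lock and is bounded by a timeout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of narrow locking; not Flink's actual coordinator. */
final class NarrowLockSketch {

    /** Hypothetical stand-in for an external state storage that may block or fail. */
    interface StateStore {
        void write(long checkpointId, String data) throws Exception;
    }

    private final Object lock = new Object();
    private final Map<Long, String> pending = new HashMap<>();
    private final Map<Long, String> completed = new HashMap<>();
    private final StateStore store;
    private final long timeoutMillis;

    // Daemon threads so a stuck write cannot keep the JVM alive.
    private final ExecutorService ioExecutor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "checkpoint-io");
        t.setDaemon(true);
        return t;
    });

    NarrowLockSketch(StateStore store, long timeoutMillis) {
        this.store = store;
        this.timeoutMillis = timeoutMillis;
    }

    void addPending(long id, String data) {
        synchronized (lock) {
            pending.put(id, data);
        }
    }

    /** Returns true if the checkpoint was written and recorded as completed. */
    boolean complete(long id) {
        final String data;
        synchronized (lock) {
            // Only in-memory bookkeeping happens under the lock.
            data = pending.remove(id);
        }
        if (data == null) {
            return false; // unknown or already handled checkpoint
        }

        // The potentially blocking write runs outside the lock, with a timeout.
        Future<Object> write = ioExecutor.submit(() -> {
            store.write(id, data);
            return null;
        });
        try {
            write.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            write.cancel(true); // best effort; in this sketch the checkpoint is discarded
            return false;
        } catch (InterruptedException e) {
            write.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        }

        synchronized (lock) {
            completed.put(id, data);
        }
        return true;
    }

    boolean isCompleted(long id) {
        synchronized (lock) {
            return completed.containsKey(id);
        }
    }

    public static void main(String[] args) {
        // A store that blocks far longer than the 100 ms timeout.
        NarrowLockSketch slow = new NarrowLockSketch((id, data) -> Thread.sleep(5_000), 100);
        slow.addPending(1L, "state-1");
        System.out.println("completed: " + slow.complete(1L));
        // The lock was never held during the write, so this call does not stall.
        slow.addPending(2L, "state-2");
    }
}
```

In this sketch a slow or failing store only costs the caller the timeout; other threads that need the lock for bookkeeping are never blocked by the write. Whether a failed checkpoint should be discarded, retried, or kept pending is a design choice the issue does not specify.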

            People

              Assignee: Unassigned
              Reporter: Till Rohrmann (trohrmann)
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated:
                Resolved: