Details

      Description

      The checkpoint coordinator is only created if a checkpointing interval is configured. This means that no savepoints can be triggered if there is no checkpointing interval specified.

      Instead we should always create it and allow an interval of 0 for disabled periodic checkpoints.

        Issue Links

          Activity

          Hide
          uce Ufuk Celebi added a comment -

          Fixed in b420750 (release-1.1).

          Show
          uce Ufuk Celebi added a comment - Fixed in b420750 (release-1.1).
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/2453

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2453
          Hide
          uce Ufuk Celebi added a comment -

          Fixed in 398bd9b (master).

          Show
          uce Ufuk Celebi added a comment - Fixed in 398bd9b (master).
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user uce commented on the issue:

          https://github.com/apache/flink/pull/2453

          Thanks for the contribution. I'm going to merge this with slight modifications. We don't need the introduced checks with the current master as there is a check whether periodic checkpoints are activated.

          Show
          githubbot ASF GitHub Bot added a comment - Github user uce commented on the issue: https://github.com/apache/flink/pull/2453 Thanks for the contribution. I'm going to merge this with slight modifications. We don't need the introduced checks with the current master as there is a check whether periodic checkpoints are activated.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StefanRRichter commented on the issue:

          https://github.com/apache/flink/pull/2453

          Looks good to me now.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/2453 Looks good to me now.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user wuchong commented on the issue:

          https://github.com/apache/flink/pull/2453

          Hi @StefanRRichter , I fixed the conflicts, and add interval check in `triggerCheckpoint`. Could you have a look again ?

          Show
          githubbot ASF GitHub Bot added a comment - Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2453 Hi @StefanRRichter , I fixed the conflicts, and add interval check in `triggerCheckpoint`. Could you have a look again ?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StefanRRichter commented on the issue:

          https://github.com/apache/flink/pull/2453

          Thanks for the nice contribution.

          I see one minor detail that could cause problems. Even if we do not register the `CheckpointCoordinator` as `JobStatusListener`, it might still be possible that `triggerCheckpoint(long timestamp, CheckpointProperties props)` gets called. In this case it can happen that a periodic trigger is registered. You could introduce another check on the interval to make sure that no periodic trigger can be registered when the CheckpointCoordinator is disabled.

          Besides that, the changes look good to me.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/2453 Thanks for the nice contribution. I see one minor detail that could cause problems. Even if we do not register the `CheckpointCoordinator` as `JobStatusListener`, it might still be possible that `triggerCheckpoint(long timestamp, CheckpointProperties props)` gets called. In this case it can happen that a periodic trigger is registered. You could introduce another check on the interval to make sure that no periodic trigger can be registered when the CheckpointCoordinator is disabled. Besides that, the changes look good to me.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/2453

          Will have a look at this again soon.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2453 Will have a look at this again soon.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user wuchong commented on the issue:

          https://github.com/apache/flink/pull/2453

          Hi @StephanEwen , I have updated this PR according to your suggestion. In addition, I use an interval of `Long.MAX_VALUE` for disabled periodic checkpoint. In this way, we will not break interval sanity check and no changes will be attached to `CheckpointCoordinator` and `JobSnapshottingSettings`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2453 Hi @StephanEwen , I have updated this PR according to your suggestion. In addition, I use an interval of `Long.MAX_VALUE` for disabled periodic checkpoint. In this way, we will not break interval sanity check and no changes will be attached to `CheckpointCoordinator` and `JobSnapshottingSettings`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user wuchong commented on the issue:

          https://github.com/apache/flink/pull/2453

          Make sense. I will update this PR.

          Show
          githubbot ASF GitHub Bot added a comment - Github user wuchong commented on the issue: https://github.com/apache/flink/pull/2453 Make sense. I will update this PR.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/2453

          Thanks for looking into this.

          I wonder if we can do this simpler, without changes to the CheckpointCoordinator.
          The only thing that really needs to change is that without periodic checkpoints, the `startCheckpointScheduler()` method is not called, which means that the JobStatusListener that "CheckpointActivatorDeactivator" should not be created and started.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2453 Thanks for looking into this. I wonder if we can do this simpler, without changes to the CheckpointCoordinator. The only thing that really needs to change is that without periodic checkpoints, the `startCheckpointScheduler()` method is not called, which means that the JobStatusListener that "CheckpointActivatorDeactivator" should not be created and started.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user wuchong opened a pull request:

          https://github.com/apache/flink/pull/2453

          FLINK-4510 [checkpoint] Always create CheckpointCoordinator

          Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
          If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
          In addition to going through the list, please provide a meaningful description of your changes.

          • [x] General
          • The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
          • The pull request addresses only one issue
          • Each commit in the PR has a meaningful commit message (including the JIRA id)
          • [ ] Documentation
          • Documentation has been added for new functionality
          • Old documentation affected by the pull request has been updated
          • JavaDoc for public methods has been added
          • [x] Tests & Build
          • Functionality added by the pull request is covered by tests
          • `mvn clean verify` has been executed successfully locally or a Travis build has passed

          This pull request always create checkpoint coordinator even if checkpointing interval is not configured.

          I use a flag to indicate disabled periodic checkpoint, Instead of using an interval of 0. Because the interval is checked everywhere, too scared to change it...

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/wuchong/flink FLINK-4510-checkpoint

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/2453.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #2453


          commit 7ba16434e3d450e0e1e9a8176a1c575945400bcd
          Author: Jark Wu <wuchong.wc@alibaba-inc.com>
          Date: 2016-09-01T08:15:33Z

          FLINK-4510 [checkpoint] Always create CheckpointCoordinator


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user wuchong opened a pull request: https://github.com/apache/flink/pull/2453 FLINK-4510 [checkpoint] Always create CheckpointCoordinator Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide] ( http://flink.apache.org/how-to-contribute.html ). In addition to going through the list, please provide a meaningful description of your changes. [x] General The pull request references the related JIRA issue (" [FLINK-XXX] Jira title text") The pull request addresses only one issue Each commit in the PR has a meaningful commit message (including the JIRA id) [ ] Documentation Documentation has been added for new functionality Old documentation affected by the pull request has been updated JavaDoc for public methods has been added [x] Tests & Build Functionality added by the pull request is covered by tests `mvn clean verify` has been executed successfully locally or a Travis build has passed This pull request always create checkpoint coordinator even if checkpointing interval is not configured. I use a flag to indicate disabled periodic checkpoint, Instead of using an interval of 0. Because the interval is checked everywhere, too scared to change it... You can merge this pull request into a Git repository by running: $ git pull https://github.com/wuchong/flink FLINK-4510 -checkpoint Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2453.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2453 commit 7ba16434e3d450e0e1e9a8176a1c575945400bcd Author: Jark Wu <wuchong.wc@alibaba-inc.com> Date: 2016-09-01T08:15:33Z FLINK-4510 [checkpoint] Always create CheckpointCoordinator
          Hide
          StephanEwen Stephan Ewen added a comment -

          Jark Wu Thanks for helping out here. I assigned the issue to you.

          Show
          StephanEwen Stephan Ewen added a comment - Jark Wu Thanks for helping out here. I assigned the issue to you.
          Hide
          jark Jark Wu added a comment -

          Hi Ufuk Celebi , I would like to contribute this issue.

          Show
          jark Jark Wu added a comment - Hi Ufuk Celebi , I would like to contribute this issue.

            People

            • Assignee:
              jark Jark Wu
              Reporter:
              uce Ufuk Celebi
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development