Flink / FLINK-14979

TwoPhaseCommitSink fails when checkpoint overtakes a savepoint

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.4, 1.7.2, 1.8.2, 1.9.1
    • None
    • None

    Description

      As reported by a user on the user mailing list, TwoPhaseCommitSinkFunction#notifyCheckpointComplete can fail with the following exception:

      java.lang.RuntimeException: Error while confirming checkpoint
          at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1205)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalStateException: checkpoint completed, but no transaction pending
          at org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
          at org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.notifyCheckpointComplete(TwoPhaseCommitSinkFunction.java:267)
          at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyCheckpointComplete(AbstractUdfStreamOperator.java:130)
          at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:822)
          at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1200)
          ... 5 more
      

      This can happen in the following scenario:

      1. savepoint is triggered
      2. checkpoint is triggered
      3. checkpoint completes (but it doesn't subsume the savepoint, because checkpoints subsume only other checkpoints).
      4. savepoint completes

      In this case, TwoPhaseCommitSinkFunction first receives the notification that the later checkpoint completed, and it commits the pending transactions of both the savepoint and the checkpoint. When the savepoint's notifyCheckpointComplete arrives later, no pending transaction is left, so the checkState shown in the stack trace fails.

      A possible trivial fix is to remove the failing checkState.
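
      The ordering problem can be reproduced with a small stand-alone model. The sketch below is not Flink code: PendingCommitModel, its strict flag, and the "txn-N" strings are hypothetical names used only for illustration. It assumes the sink keeps one pending transaction per snapshot ID and that, on a completion notification, it commits every pending transaction with an ID less than or equal to the notified one. With strict = true it mimics the failing check; with strict = false it mimics the proposed fix of dropping the check.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified, hypothetical model of the pending-transaction bookkeeping; not Flink's implementation. */
class PendingCommitModel {
    // Pending transactions keyed by the ID of the checkpoint/savepoint that pre-committed them.
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();
    // strict = true mimics the failing state check; false mimics removing it.
    private final boolean strict;

    PendingCommitModel(boolean strict) {
        this.strict = strict;
    }

    /** Called when a checkpoint or savepoint is taken: its transaction becomes pending. */
    void snapshot(long checkpointId) {
        pending.put(checkpointId, "txn-" + checkpointId);
    }

    /** Commits every pending transaction whose ID is <= the completed ID. */
    void notifyCheckpointComplete(long checkpointId) {
        if (strict && pending.isEmpty()) {
            throw new IllegalStateException("checkpoint completed, but no transaction pending");
        }
        Iterator<Map.Entry<Long, String>> it =
                pending.headMap(checkpointId, true).entrySet().iterator();
        while (it.hasNext()) {
            committed.add(it.next().getValue());
            it.remove(); // removal through the view also removes from 'pending'
        }
    }

    List<String> committed() {
        return committed;
    }

    public static void main(String[] args) {
        PendingCommitModel sink = new PendingCommitModel(true);
        sink.snapshot(1L);                 // 1. savepoint is triggered
        sink.snapshot(2L);                 // 2. checkpoint is triggered
        sink.notifyCheckpointComplete(2L); // 3. checkpoint completes: commits txn-1 and txn-2
        try {
            sink.notifyCheckpointComplete(1L); // 4. savepoint completes: nothing left to commit
        } catch (IllegalStateException e) {
            System.out.println("late savepoint notification failed: " + e.getMessage());
        }
    }
}
```

      Constructing the model with strict = false lets the late savepoint notification pass as a no-op, which is the behavior the trivial fix would give. Whether silently ignoring such a notification is acceptable in every case is a design question this sketch does not answer.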

            People

              Assignee: Unassigned
              Reporter: Piotr Nowojski (pnowojski)