  1. Flink
  2. FLINK-25963 FLIP-212: Introduce Flink Kubernetes Operator
  3. FLINK-26577

Avoid state loss when switching to last-state upgrade mode

Details

    Description

      At the moment there are several corner cases that can lead to accidental state loss (or at least unexpected behaviour) when switching to last-state upgrade mode from another mode.

      Two cases that immediately come to mind:

      savepoint to last-state:
      When the new upgrade mode is last-state, the job deployment is simply deleted. If HA was not enabled previously, the most recent savepoint available for recovery might be very far back in time.

      stateless to last-state:
      If checkpointing and HA are not enabled, the deployment is simply killed as before, and we might start the job from empty state. Taking a savepoint and continuing from there may be the right approach in this case.

      When switching between modes, we should perhaps consider the previous mode as well as the target mode when deciding on the suspend strategy. We could also simply disallow switching to last-state if HA was not enabled previously, but that might be too restrictive.
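
      One way to picture the proposal is a small decision function that takes both the previous and the target upgrade mode into account. The following is only an illustrative sketch, not the operator's actual code: the names `UpgradeMode`, `SuspendAction` and `choose_suspend_action` are hypothetical, and it assumes that a job previously running in last-state mode already had HA enabled.

      ```python
      from enum import Enum


      class UpgradeMode(Enum):
          STATELESS = "stateless"
          SAVEPOINT = "savepoint"
          LAST_STATE = "last-state"


      class SuspendAction(Enum):
          # Hypothetical action names, chosen for illustration only.
          CANCEL = "cancel without preserving state"
          CANCEL_WITH_SAVEPOINT = "take a savepoint, then cancel"
          DELETE_DEPLOYMENT = "delete the deployment, keep HA metadata"


      def choose_suspend_action(previous_mode, target_mode, ha_was_enabled):
          """Pick a suspend strategy from the previous and the target mode.

          ha_was_enabled describes the currently running (previous) spec,
          not the new one that is about to be deployed.
          """
          if target_mode is UpgradeMode.STATELESS:
              return SuspendAction.CANCEL
          if target_mode is UpgradeMode.SAVEPOINT:
              return SuspendAction.CANCEL_WITH_SAVEPOINT

          # Target is last-state. Deleting the deployment is only safe when
          # the running job has been writing HA metadata. Assumption: a job
          # that was already in last-state mode had HA enabled.
          if previous_mode is UpgradeMode.LAST_STATE or ha_was_enabled:
              return SuspendAction.DELETE_DEPLOYMENT

          # Otherwise fall back to a fresh savepoint instead of risking an
          # old savepoint or an empty state.
          return SuspendAction.CANCEL_WITH_SAVEPOINT
      ```

      Under this sketch, both corner cases above (savepoint to last-state and stateless to last-state, each without HA) resolve to taking a savepoint first, which is less restrictive than rejecting the mode switch outright.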

            People

              wangyang0918 Yang Wang
              gyfora Gyula Fora