FLINK-8533

Support MasterTriggerRestoreHook state reinitialization

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.5.0
    • None

    Description

      MasterTriggerRestoreHook enables coordination with an external system when taking or restoring checkpoints. When execution is restarted from a checkpoint, restoreCheckpoint is called to restore or reinitialize the external system's state. There is an edge case in which the external state is not adequately reinitialized: when execution fails before the first checkpoint. In that case the hook is not invoked and has no opportunity to reset the external state to its initial conditions.

      The impact is a loss of exactly-once semantics in this case. For example, in the Pravega source function, the reader group state (e.g. stream position data) is stored externally. In the normal restore case, the reader group state is forcibly rewound to the checkpointed position. In the edge case where no checkpoint has yet succeeded, the reader group state is not rewound, so some amount of stream data that was read before the failure is not reprocessed.

      A possible fix would be to introduce an initializeState method on the hook interface. Similar to CheckpointedFunction::initializeState, this method would be invoked unconditionally upon hook initialization. The Pravega hook would, for example, use it to initialize or forcibly reinitialize the reader group state.
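
      The proposal can be illustrated with a minimal, self-contained sketch. It does not use Flink's or Pravega's actual classes: RestoreHook, ReaderGroupHook and RestartDemo are hypothetical names, the external reader group state is reduced to a single position counter, and calling initializeState before restoreCheckpoint on every restart is an assumed ordering, not something the issue specifies.

```java
import java.util.OptionalLong;

// Hypothetical stand-in for the hook interface; NOT Flink's actual API.
interface RestoreHook {
    // Proposed method: invoked unconditionally whenever the hook is (re)initialized.
    void initializeState();

    // Existing behavior: invoked only when a completed checkpoint is available.
    void restoreCheckpoint(long checkpointedPosition);
}

// Toy model of externally stored reader group state (a single stream position).
final class ReaderGroupHook implements RestoreHook {
    static final long INITIAL_POSITION = 0L;
    private long position = INITIAL_POSITION;

    // Simulates the source consuming events and advancing the external position.
    void read(long events) { position += events; }

    long position() { return position; }

    @Override
    public void initializeState() { position = INITIAL_POSITION; }

    @Override
    public void restoreCheckpoint(long checkpointedPosition) { position = checkpointedPosition; }
}

final class RestartDemo {
    // Models a restart: always reset first, then rewind to the checkpoint if one exists.
    // The ordering is an assumption made for this sketch.
    static void restart(RestoreHook hook, OptionalLong lastCheckpoint) {
        hook.initializeState();
        if (lastCheckpoint.isPresent()) {
            hook.restoreCheckpoint(lastCheckpoint.getAsLong());
        }
    }

    public static void main(String[] args) {
        ReaderGroupHook hook = new ReaderGroupHook();

        hook.read(42);                        // failure occurs before the first checkpoint
        restart(hook, OptionalLong.empty());
        System.out.println(hook.position());  // prints 0

        hook.read(100);
        restart(hook, OptionalLong.of(70));   // normal restore case
        System.out.println(hook.position());  // prints 70
    }
}
```

      Without the unconditional initializeState call, the first restart in this sketch would leave the position at 42, which corresponds to the stream data that is never reprocessed in the edge case described above.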

            People

              Assignee: Eron Wright (eronwright)
              Reporter: Eron Wright (eronwright)