FLINK-28843

Fail to find incremental handle when restoring from changelog checkpoint in claim mode

    Description

      1. When native checkpoints and incremental checkpointing are enabled in the RocksDB state backend, any state data larger than state.storage.fs.memory-threshold is written to a separate data file (FileStateHandle, RelativeFileStateHandle, etc.), for example base-path1/chk-1/file1, instead of being stored in the checkpoint metadata as a ByteStreamStateHandle.
      2. Restore the job from base-path1/chk-1 in claim mode, using the changelog state backend, with the checkpoint path set to base-path2. The new checkpoint is saved to base-path2/chk-2, but it still needs the previous checkpoint file (base-path1/chk-1/file1).
      3. Restore the job again from base-path2/chk-2 with the changelog state backend. Flink tries to read base-path2/chk-2/file1 instead of the actual file location base-path1/chk-1/file1, which causes a FileNotFoundException and the job fails.
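
      The path mismatch in step 3 can be illustrated with a small standalone sketch. It is only a model of the symptom, not Flink code and not a confirmed root cause: the class HandlePathSketch and the method resolveRelative are hypothetical names, and the sketch assumes a handle that stores only a file name and is resolved against the directory of whichever checkpoint it is loaded from.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical model for illustration only; these are not Flink's classes.
class HandlePathSketch {

    // Assumption: a "relative" handle keeps only the file name and is
    // resolved against the directory of the checkpoint being restored.
    static Path resolveRelative(Path checkpointDir, String fileName) {
        return checkpointDir.resolve(fileName);
    }

    public static void main(String[] args) {
        Path chk1 = Paths.get("base-path1", "chk-1"); // where file1 was written
        Path chk2 = Paths.get("base-path2", "chk-2"); // checkpoint restored in step 3

        // Location of the data file produced by the first job.
        Path actual = resolveRelative(chk1, "file1");
        // Location computed when the same file name is resolved against chk-2.
        Path lookedUp = resolveRelative(chk2, "file1");

        System.out.println("actual file:   " + actual);
        System.out.println("lookup target: " + lookedUp);
        System.out.println("same location: " + actual.equals(lookedUp));
    }
}
```

      Under this assumption, a file reference carried over from chk-1 into chk-2 only stays valid if it is recorded with its full original location rather than a name relative to the new checkpoint directory.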

       
      How to reproduce?

      1. Set state.storage.fs.memory-threshold to a small value, such as '20b'.
      2. Run org.apache.flink.test.checkpointing.ChangelogPeriodicMaterializationSwitchStateBackendITCase#testSwitchFromDisablingToEnablingInClaimMode.

            People

              frozen stone Lihe Ma