FLINK-26388: Release Testing: Repeatable Cleanup (FLINK-25433)


Details

    Description

      Repeatable cleanup was introduced with FLIP-194 but should be considered an independent feature of the JobResultStore (JRS) from a user's point of view.

      Repeatable cleanup is triggered when an error occurs during cleanup. One way to provoke this is to disable access to S3 before the job finishes, e.g.:

      • Set a reasonable checkpointing interval (checkpointing should be enabled so that there are artifacts in S3 to clean up)
      • Disable S3 (by removing permissions or shutting down the S3 server)
      • Stop the job with a savepoint

      Stopping the job should succeed, but the logs should show the cleanup failing and being retried repeatedly. Enabling S3 again should allow the cleanup to complete.
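
      The expected behavior can be illustrated with a minimal sketch. This is not Flink's implementation: `retry_cleanup`, `FakeBucket` and `StorageUnavailable` are hypothetical names, and the doubling backoff is an assumption chosen for illustration only. It shows a cleanup step that keeps failing and logging retries while storage is unreachable, and completes once access is restored.

```python
import time
from typing import Callable


class StorageUnavailable(Exception):
    """Raised by the hypothetical cleanup step while storage is unreachable."""


def retry_cleanup(
    cleanup: Callable[[], None],
    initial_backoff_s: float = 0.01,
    max_backoff_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call cleanup() until it succeeds and return the number of attempts."""
    attempts = 0
    backoff = initial_backoff_s
    while True:
        attempts += 1
        try:
            cleanup()
            return attempts
        except StorageUnavailable as err:
            # This is the kind of repeated failure message a tester would look for.
            print(f"cleanup attempt {attempts} failed: {err}; retrying in {backoff:.2f}s")
            sleep(backoff)
            backoff = min(backoff * 2, max_backoff_s)


class FakeBucket:
    """Stand-in for S3 that rejects the first `outage_calls` delete requests."""

    def __init__(self, outage_calls: int) -> None:
        self.outage_calls = outage_calls
        self.deleted = False

    def delete_artifacts(self) -> None:
        if self.outage_calls > 0:
            self.outage_calls -= 1
            raise StorageUnavailable("access denied")
        self.deleted = True


# Simulate an outage covering three attempts; the no-op sleep keeps the run instant.
bucket = FakeBucket(outage_calls=3)
attempts = retry_cleanup(bucket.delete_artifacts, sleep=lambda _: None)
```

      In this simulation the job has already "stopped" by the time cleanup starts, which mirrors the test: the stop succeeds, and only the cleanup keeps retrying until storage is reachable again.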

      Keep in mind that when testing this with HA, you should use a different bucket for the file-based JRS artifacts and only change permissions for the bucket that holds JRS-unrelated artifacts. Flink would fail fatally if the JRS is not able to access its backend storage.
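
      A small pre-flight check along these lines can help avoid revoking access to the JRS bucket by accident. It is only a sketch: it assumes the test setup uses the configuration keys `job-result-store.storage-path`, `state.checkpoints.dir` and `state.savepoints.dir`, and the bucket names and the helpers `bucket_of` and `jrs_is_isolated` are hypothetical.

```python
from urllib.parse import urlparse

# Keys assumed to point at JRS-unrelated artifacts in the test setup.
OTHER_ARTIFACT_KEYS = ("state.checkpoints.dir", "state.savepoints.dir")
JRS_KEY = "job-result-store.storage-path"


def bucket_of(uri: str) -> str:
    """Return the bucket name of an s3-style URI such as s3://bucket/path."""
    parsed = urlparse(uri)
    if not parsed.scheme.startswith("s3") or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc


def jrs_is_isolated(config: dict) -> bool:
    """True if the JRS bucket is not shared with checkpoint or savepoint storage."""
    jrs_bucket = bucket_of(config[JRS_KEY])
    other_buckets = {
        bucket_of(config[key]) for key in OTHER_ARTIFACT_KEYS if key in config
    }
    return jrs_bucket not in other_buckets


# Hypothetical test configuration: the JRS lives in its own bucket.
good_config = {
    JRS_KEY: "s3://flink-test-jrs/job-result-store",
    "state.checkpoints.dir": "s3://flink-test-artifacts/checkpoints",
    "state.savepoints.dir": "s3://flink-test-artifacts/savepoints",
}

# Hypothetical misconfiguration: revoking access here would also break the JRS.
bad_config = dict(good_config, **{JRS_KEY: "s3://flink-test-artifacts/job-result-store"})

isolated = jrs_is_isolated(good_config)
shared = jrs_is_isolated(bad_config)
```

      With the JRS in its own bucket, permissions can be removed from the artifact bucket alone.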

      Documentation and configuration are still being updated in FLINK-26296 and FLINK-26331.

          People

            dwysakowicz Dawid Wysakowicz
            mapohl Matthias Pohl