  Flink / FLINK-27687

Flink shouldn't assume temp folders keep existing when unused

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.14.4
    • Fix Version/s: None
    • Component/s: Runtime / Network
    • Labels: None

    Description

      SpanningWrapper.createSpillingChannel assumes that the folder in which it creates the file exists. However, that is not the case in the following scenario (which actually happened to us today):

      • The temp folders were created a while ago (I assume on startup of the task-manager) in the /tmp folder. They weren't used for a while, probably because we didn't have any record big enough to trigger it.
      • The cleanup cron for /tmp did its job and deleted those old folders in /tmp.
      • We deployed a new version of the job that actually needed the folders, and it crashed.

      => I'm not sure whether it should be SpanningWrapper's responsibility to create the folder if it no longer exists; I'm not familiar enough with Flink's internals to guess which class should do it. The problem occurred for us in SpanningWrapper, but it can probably happen in other places as well.
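
      To illustrate the kind of fix I have in mind, here is a minimal sketch that recreates the temp folder right before creating the file. This is not Flink's actual code: the class SpillFileSketch and the method openSpillChannel are hypothetical names, and whether this check belongs in SpanningWrapper or in another class is still an open question.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Flink.
public final class SpillFileSketch {

    private SpillFileSketch() {}

    /**
     * Opens a new file under tempDir for writing, recreating tempDir first
     * if an external cleanup (for example a /tmp cron job) removed it.
     */
    public static FileChannel openSpillChannel(Path tempDir, String fileName) throws IOException {
        // No-op when the directory already exists; otherwise recreates it,
        // including any missing parent directories.
        Files.createDirectories(tempDir);

        Path spillFile = tempDir.resolve(fileName);

        // CREATE_NEW fails if the file already exists, so an existing file
        // is never silently overwritten.
        return FileChannel.open(spillFile, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }
}
```

      With something like this, a purge of /tmp between task-manager startup and the first large record would no longer crash the job, at the cost of one extra filesystem call per file.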

      More generally, assuming that folders and files in /tmp will never get deleted doesn't seem correct to me. The documentation for io.tmp.dirs recommends that it shouldn't be purged, but we do need to clean up at some point. If purging really is unsupported, then the documentation should be updated to state that this is mandatory rather than a recommendation, and that purges will break jobs (not just trigger a recovery).

          People

            Assignee: Unassigned
            Reporter: Gaël Renoux