Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-28614

Empty local state folders not cleanup on retrieving local state

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Resolved
    • 1.15.0, 1.15.1, 1.16.0
    • 1.16.0
    • Runtime / Coordination
    • None

    Description

      It would create a checkpoint directory when trying to load TaskStateSnapshot from the disk. The local checkpoint directory is not deleted on exit tryLoadTaskStateSnapshotFromDisk() }}even though {{TaskStateSnapshot doesn't exist. 

       

      File getTaskStateSnapshotFile(long checkpointId) {
          final File checkpointDirectory =
                  localRecoveryConfig
                          .getLocalStateDirectoryProvider()
                          .orElseThrow(
                                  () -> new IllegalStateException("Local recovery must be enabled."))
                          .subtaskSpecificCheckpointDirectory(checkpointId);
      
          if (!checkpointDirectory.exists() && !checkpointDirectory.mkdirs()) {
              throw new FlinkRuntimeException(
                      String.format(
                              "Could not create the checkpoint directory '%s'", checkpointDirectory));
          }
      
          return new File(checkpointDirectory, TASK_STATE_SNAPSHOT_FILENAME);
      } 

       

       

      This will cause the folder in /localState to remain after failover. Here is an example: 

      41854 [flink-akka.actor.default-dispatcher-8] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job 35644df535ca04613d6a6116dcfcfd59 from Checkpoint 2 @ 1658292943408 for 35644df535ca04613d6a6116dcfcfd59 located at file:/var/folders/4n/q3r37vws2f910rt_f469kwg00000gn/T/junit1426665332205293555/junit63847204117629783/35644df535ca04613d6a6116dcfcfd59/chk-2.
      
      _______________________________________
      directory of localState
      _______________________________________ 
      tm_2
          │   ├── blobStorage
          │   ├── localState
          │   │   └── aid_6df21e53ca06ea69ee0643d25d27dbee
          │   │       └── jid_35644df535ca04613d6a6116dcfcfd59
          │   │           └── vtx_0a448493b4782967b150582570326227_sti_1
          │   │               ├── chk_2
          │   │               └── chk_5
          │   │                   ├── _task_state_snapshot
          │   │                   ├── edab98058083464a9ca29b6d7a950c68
          │   │                   │   ├── 000014.sst
          │   │                   │   ├── 000015.sst
          │   │                   │   ├── 000022.sst
          │   │                   │   ├── 000023.sst
          │   │                   │   ├── CURRENT
          │   │                   │   ├── MANIFEST-000018
          │   │                   │   └── OPTIONS-000021
          │   │                   └── f3724ae6-fd24-4e9a-80a8-02aa34bca0f0 

      cc: trohrmann , masteryhx 

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              Yanfei Lei Yanfei Lei
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: