FLINK-28473

JobManager restart/failover doesn't trigger local recovery on TaskManagers

Details

    Description

      Hi! While experimenting with the local recovery feature (Flink 1.15.1), I noticed that if the JobManager is restarted, the TaskManagers always recover from remote state (IncrementalRemoteKeyedStateHandle). If I restart the TaskManagers instead, local recovery is triggered.
       
      Setup:

      • HA setup with ZooKeeper and S3 remote storage.
      • JobManager runs as a StatefulSet with a PersistentVolume. Both process.jobmanager.working-dir and jobmanager.resource-id are correctly configured.
      • TaskManagers run as StatefulSets with PersistentVolumes. Both process.taskmanager.working-dir and taskmanager.resource-id are correctly configured.
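
      The setup above could be sanity-checked with a small script like the following. This is only a sketch: the option names are the ones mentioned in this issue plus state.backend.local-recovery (the Flink option that enables local recovery), while the paths, resource IDs, and the helper functions missing_local_recovery_keys and render_flink_conf are hypothetical and not taken from the reporter's environment. It also assumes the local recovery flag is set in both the JobManager and the TaskManager configuration.

```python
# Option names come from the issue text or Flink's configuration reference;
# every value below is a hypothetical placeholder, not the reporter's setup.
REQUIRED_KEYS = {
    "jobmanager": ("process.jobmanager.working-dir", "jobmanager.resource-id"),
    "taskmanager": ("process.taskmanager.working-dir", "taskmanager.resource-id"),
}
LOCAL_RECOVERY_KEY = "state.backend.local-recovery"


def missing_local_recovery_keys(role, conf):
    """Return option names that are unset (or not 'true') for the given role."""
    missing = [key for key in REQUIRED_KEYS[role] if not conf.get(key)]
    if str(conf.get(LOCAL_RECOVERY_KEY, "false")).lower() != "true":
        missing.append(LOCAL_RECOVERY_KEY)
    return missing


def render_flink_conf(conf):
    """Render a flat dict as flink-conf.yaml style 'key: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(conf.items()))


jobmanager_conf = {
    LOCAL_RECOVERY_KEY: "true",
    # Hypothetical PersistentVolume mount path.
    "process.jobmanager.working-dir": "/pv/flink-working-dir",
    # Hypothetical stable ID, e.g. derived from the StatefulSet pod name.
    "jobmanager.resource-id": "jobmanager-0",
}

taskmanager_conf = {
    LOCAL_RECOVERY_KEY: "true",
    "process.taskmanager.working-dir": "/pv/flink-working-dir",
    "taskmanager.resource-id": "taskmanager-0",
}

if __name__ == "__main__":
    for role, conf in (("jobmanager", jobmanager_conf),
                       ("taskmanager", taskmanager_conf)):
        print(f"# {role}")
        print(render_flink_conf(conf))
        print("missing:", missing_local_recovery_keys(role, conf))
```

      A stable resource ID combined with a working directory on a PersistentVolume is what lets a restarted process find its previous local state, so an empty "missing" list for both roles is a precondition for reproducing the behaviour described here, not an explanation of it.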

          People

            Assignee: Unassigned
            Reporter: Levani Kokhreidze (lkokhreidze)