  1. Flink
  2. FLINK-25817 FLIP-201: Persist local state in working directory
  3. FLINK-25849

Differentiate TaskManager sessions when registering at the JobMaster

Details

    Description

      With the introduction of configurable ResourceIDs for TaskManager processes, a restarted TaskManager process can come back with the same ResourceID as its predecessor. When it then tries to register at the JobMaster, the JobMaster does not recognize it as a new instance because it only compares the ResourceID. As a consequence, the JobMaster thinks that this is a duplicate registration and ignores it.

      It would be better if the TaskManager sent a session id with the registration. The JobMaster could then use it to decide whether a new instance is trying to register, in which case the old one needs to be disconnected, or whether the registration attempt is a duplicate. This would speed up cluster reconciliation because we would not have to wait for the heartbeat timeout to occur.
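
      The decision described above can be sketched as follows. This is only an illustration of the idea, not Flink's actual implementation: the names TaskManagerRegistry, RegistrationDecision and register are hypothetical, the ResourceID is modeled as a plain String, and the session id is assumed to be a UUID generated once per TaskManager process start.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical outcome of a registration attempt; not part of Flink's API. */
enum RegistrationDecision {
    NEW_REGISTRATION,      // ResourceID was not known before
    DUPLICATE_IGNORED,     // same ResourceID and same session id
    REPLACED_OLD_SESSION   // same ResourceID, different session id
}

/** Hypothetical sketch of the JobMaster-side bookkeeping. */
class TaskManagerRegistry {
    // Last accepted session id per ResourceID (ResourceID modeled as a String).
    private final Map<String, UUID> sessionsByResourceId = new HashMap<>();

    RegistrationDecision register(String resourceId, UUID sessionId) {
        UUID knownSession = sessionsByResourceId.get(resourceId);

        if (knownSession == null) {
            // First time this ResourceID registers.
            sessionsByResourceId.put(resourceId, sessionId);
            return RegistrationDecision.NEW_REGISTRATION;
        }

        if (knownSession.equals(sessionId)) {
            // Same process retrying its registration: nothing to do.
            return RegistrationDecision.DUPLICATE_IGNORED;
        }

        // Same ResourceID but a different session: the process was restarted.
        // A real JobMaster would disconnect the old instance here before
        // accepting the new one, instead of waiting for a heartbeat timeout.
        sessionsByResourceId.put(resourceId, sessionId);
        return RegistrationDecision.REPLACED_OLD_SESSION;
    }
}
```

      Comparing only the ResourceID would collapse the second and third cases into one, which is the behavior this issue describes.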

            People

              trohrmann Till Rohrmann