
Details

    Description

      For active deployments (native Kubernetes, YARN, Mesos), after a JobManager (JM) failover, workers from the previous attempt should register with the new JM. Depending on the order in which slot requests and TaskManager (TM) registrations arrive at the ResourceManager (RM), the RM may allocate unnecessary new resources even though recovered resources are available for reuse.

      A potential improvement is to add recovered workers to the pending resources, so that the RM knows which resources are expected to become available soon and can decide accordingly whether to allocate new ones.
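
      The accounting idea can be sketched roughly as follows. This is a minimal illustration, not Flink's actual ResourceManager code: the class `PendingWorkerAccounting` and all of its methods are hypothetical names, and it assumes one slot per worker. Recovered workers are counted as pending from the start, so a slot request only triggers a new allocation when demand exceeds everything that is already expected to arrive.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not Flink's ResourceManager API.
// Assumes one slot per worker to keep the accounting simple.
class PendingWorkerAccounting {
    // Workers recovered from the previous attempt that have not registered yet.
    private final Set<String> pendingRecovered = new HashSet<>();
    // Newly requested workers that have not registered yet.
    private int pendingNew = 0;
    // Registered workers whose slot is still free.
    private int freeRegistered = 0;
    // Slot requests waiting for a pending worker to register.
    private int waitingRequests = 0;
    private int totalNewRequested = 0;

    PendingWorkerAccounting(Collection<String> recoveredWorkerIds) {
        pendingRecovered.addAll(recoveredWorkerIds);
    }

    /** Handles one slot request; returns true if a new worker had to be requested. */
    boolean onSlotRequest() {
        if (freeRegistered > 0) {
            freeRegistered--;
            return false;
        }
        waitingRequests++;
        return requestNewWorkerIfShort();
    }

    /** A recovered or newly requested worker has registered. */
    void onWorkerRegistered(String workerId) {
        if (!pendingRecovered.remove(workerId) && pendingNew > 0) {
            pendingNew--;
        }
        if (waitingRequests > 0) {
            waitingRequests--;
        } else {
            freeRegistered++;
        }
    }

    /** A recovered worker is given up on (e.g. after a timeout); may trigger a replacement. */
    boolean onRecoveredWorkerLost(String workerId) {
        return pendingRecovered.remove(workerId) && requestNewWorkerIfShort();
    }

    // Requests a new worker only if waiting demand exceeds all pending workers.
    private boolean requestNewWorkerIfShort() {
        if (waitingRequests <= pendingRecovered.size() + pendingNew) {
            return false;
        }
        pendingNew++;
        totalNewRequested++;
        return true;
    }

    int newWorkersRequested() {
        return totalNewRequested;
    }
}
```

      With two recovered workers and three slot requests arriving before any registration, this sketch requests only one new worker instead of three. The `onRecoveredWorkerLost` path is an added assumption: some mechanism is needed for recovered workers that never re-register, otherwise requests would wait on them indefinitely.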

      See also the discussion in FLINK-20249.


            People

              xtsong Xintong Song