Mesos / MESOS-4050

Change task reconciliation to not omit unknown tasks

      Description

      If the master fails over and a framework tries to do an explicit reconciliation for a task running on an agent that has not reregistered yet (and agent_reregister_timeout has not been exceeded), the master will not send a reconciliation response for that task.

      The missing response is confusing for framework authors. It seems better for the master to explicitly announce all the information it has: e.g., to return "task X is in an unknown state" rather than returning nothing. Then, as more information arrives (e.g., the agent reregisters or the task definitively dies), the task state would transition appropriately. We might want to do this via a new task state, e.g., TASK_REREGISTER_PENDING.
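To make the difference concrete, here is a minimal sketch of the two behaviors in Python. It is a simplified model, not the master's actual code: the function `reconcile_explicit`, its parameters, and the `announce_pending` flag are hypothetical names introduced for illustration, and the fallback to TASK_LOST for tasks the master cannot place is an assumption of this model. TASK_REREGISTER_PENDING is the state name proposed above.

```python
def reconcile_explicit(requests, known_tasks, pending_agents, announce_pending=False):
    """Model of explicit reconciliation after a master failover.

    requests:        list of (task_id, agent_id) pairs sent by the framework
    known_tasks:     dict of task_id -> latest state, for tasks on agents
                     that have already reregistered
    pending_agents:  set of agent ids that have not reregistered yet and
                     whose reregistration timeout has not been exceeded
    announce_pending: hypothetical switch between current and proposed behavior
    """
    updates = []
    for task_id, agent_id in requests:
        if task_id in known_tasks:
            # The master knows the task: report its latest state.
            updates.append((task_id, known_tasks[task_id]))
        elif agent_id in pending_agents:
            if announce_pending:
                # Proposed: say explicitly that the state is not yet known.
                updates.append((task_id, "TASK_REREGISTER_PENDING"))
            # Current: send nothing for this task.
        else:
            # Assumption of this sketch: anything else is reported as lost.
            updates.append((task_id, "TASK_LOST"))
    return updates


requests = [("t1", "a1"), ("t2", "a2")]
known_tasks = {"t1": "TASK_RUNNING"}   # a1 has reregistered
pending_agents = {"a2"}                # a2 has not reregistered yet

current = reconcile_explicit(requests, known_tasks, pending_agents)
proposed = reconcile_explicit(requests, known_tasks, pending_agents,
                              announce_pending=True)
```

With the current behavior the framework gets no entry at all for `t2`, so it cannot tell a dropped request from a task whose agent is still expected to reregister. With the proposed behavior every requested task yields exactly one response.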

      This might be consistent with changing the task states so that we capture "task is partitioned" as an explicit task state (TASK_UNKNOWN or TASK_WANDERING) – see MESOS-4049.

              People

              • Assignee: Unassigned
              • Reporter: Neil Conway (neilc)
              • Votes: 0
              • Watchers: 7
