Details
- Improvement
- Status: Accepted
- Major
- Resolution: Unresolved
- 1.5.0
- None
Description
Currently, an Operation contains only the FrameworkID of its originating framework, not the full FrameworkInfo. This is problematic in master failover scenarios, where a master might learn about an operation triggered by a framework it does not know. Given how the master implementation is structured, we would like to create tracking structures for that framework (e.g., to sync with the allocator down the line), but cannot do so because the master only learns the FrameworkInfo when either the framework reregisters or an agent running tasks of that framework reconciles with the master. We also cannot use made-up dummy information until we learn the true FrameworkInfo, since some required fields in FrameworkInfo (namely FrameworkInfo.user) cannot be updated; see MESOS-703.
We should introduce a channel for agents to learn the full FrameworkInfo for all frameworks executing operations on their resources. For simplicity and symmetry with RunTaskMessage, adding an explicit FrameworkInfo field to Operation seems sufficient; for example, it would allow atomic information transfer when operations are sent to the agent or on reconciliation with newly elected masters.
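To illustrate the proposal, here is a minimal C++ sketch of how a newly elected master could recover framework tracking state from an operation that carries its FrameworkInfo. This is not Mesos code: the structs are simplified stand-ins for the protobuf messages, and the names `has_framework_info`, `framework_info`, `Master::addOperation`, and `orphaned` are assumptions made for this example only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the FrameworkInfo protobuf message.
struct FrameworkInfo {
  std::string id;
  std::string user;  // Required and not updatable later (see MESOS-703).
  std::string name;
};

// Hypothetical stand-in for Operation. Today only `framework_id` exists;
// `framework_info` models the field proposed in this ticket.
struct Operation {
  std::string uuid;
  std::string framework_id;
  bool has_framework_info = false;  // Mimics a protobuf optional field.
  FrameworkInfo framework_info;
};

// Per-framework tracking structure kept by the master (assumed shape).
struct Framework {
  FrameworkInfo info;
  std::vector<std::string> operations;
};

class Master {
public:
  // Tracks an operation reported by an agent, e.g. during reconciliation
  // after a master failover. Returns false if the framework is unknown and
  // the operation carries no FrameworkInfo to recover it from.
  bool addOperation(const Operation& op) {
    auto it = frameworks.find(op.framework_id);
    if (it == frameworks.end()) {
      if (!op.has_framework_info) {
        // Current situation: only the FrameworkID is known, so no
        // tracking structure can be created for the framework.
        orphaned.push_back(op.uuid);
        return false;
      }
      // The embedded info must describe the originating framework.
      assert(op.framework_info.id == op.framework_id);
      Framework framework;
      framework.info = op.framework_info;
      it = frameworks.emplace(op.framework_id, framework).first;
    }
    it->second.operations.push_back(op.uuid);
    return true;
  }

  std::map<std::string, Framework> frameworks;
  std::vector<std::string> orphaned;
};
```

In this sketch, an operation without the embedded info for an unknown framework ends up in `orphaned`, which mirrors the problem described above. An operation that carries the info lets the master create the tracking structure in the same step in which it learns about the operation.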
Issue Links
- blocks: MESOS-9649 Recover frameworks from reregistered agents with operations (Open)
- incorporates: MESOS-8536 Pending offer operations on resource provider resources not properly accounted for in allocator (Resolved)
- is blocked by: MESOS-9957 Sequence all operations on the agent (Open)
- is related to: MESOS-703 master fails to respect updated FrameworkInfo when the framework scheduler restarts (Accepted)