  1. Mesos
  2. MESOS-9000

Operator API event stream can miss task status updates.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: HTTP API
    • Labels:
    • Target Version/s:
    • Sprint:
      Mesosphere Sprint 2018-27
    • Story Points:
      3

      Description

      Currently, the master sends TaskUpdated messages to subscribers only when the latest known task state on the agent has changed:

        // src/master/master.cpp
        if (!protobuf::isTerminalState(task->state())) {
          if (status.state() != task->state()) {
            sendSubscribersUpdate = true;
          }
      
          task->set_state(latestState.getOrElse(status.state()));
        }
      

      The latest state is set like this:

      // src/messages/messages.proto
      message StatusUpdate {
        [...]
        // This corresponds to the latest state of the task according to the
        // agent. Note that this state might be different than the state in
        // 'status' because task status update manager queues updates. In
        // other words, 'status' corresponds to the update at top of the
        // queue and 'latest_state' corresponds to the update at bottom of
        // the queue.
        optional TaskState latest_state = 7;
      }
      

      However, the `TaskStatus` message included in a `TaskUpdated` event is the update that was at the top of the queue when the update was sent, i.e. the 'status' field described above, not the latest state.

      So we can easily get into a situation where, e.g., the first TaskUpdated has .status.state == TASK_STARTING and .state == TASK_RUNNING, and the second update, with .status.state == TASK_RUNNING and .state == TASK_RUNNING, is never delivered because the latest known state did not change.
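
      The scenario above can be reproduced with a small standalone model. This is only a sketch, not Mesos code: the types `StatusUpdate`, `TaskUpdatedEvent` and `Master`, and the function `simulate`, are hypothetical stand-ins, and the model assumes the agent has queued TASK_STARTING followed by TASK_RUNNING and that each forwarded update is acknowledged immediately. The terminal-state check is omitted because none of the modeled states is terminal.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Simplified stand-in for the TaskState enum (hypothetical subset).
enum class TaskState { STAGING, STARTING, RUNNING };

// Hypothetical model of the StatusUpdate message sent by the agent.
struct StatusUpdate {
  TaskState status_state;  // update at the top (front) of the agent's queue
  TaskState latest_state;  // update at the bottom (back) of the agent's queue
};

// Hypothetical model of what a subscriber sees in a TaskUpdated event.
struct TaskUpdatedEvent {
  TaskState status_state;  // .status.state
  TaskState state;         // .state
};

// Hypothetical model of the master's bookkeeping for a single task.
struct Master {
  TaskState task_state = TaskState::STAGING;
  std::vector<TaskUpdatedEvent> events;  // events delivered to subscribers

  void handle(const StatusUpdate& update) {
    // Same comparison as in the quoted snippet: notify subscribers only
    // if the forwarded status differs from the stored task state.
    bool sendSubscribersUpdate = update.status_state != task_state;

    // The stored state jumps straight to the agent's latest state.
    task_state = update.latest_state;

    if (sendSubscribersUpdate) {
      events.push_back({update.status_state, task_state});
    }
  }
};

// Forwards the queued updates one by one, as the agent would.
std::vector<TaskUpdatedEvent> simulate() {
  std::deque<TaskState> queue = {TaskState::STARTING, TaskState::RUNNING};
  assert(!queue.empty());

  Master master;
  while (!queue.empty()) {
    master.handle({queue.front(), queue.back()});
    queue.pop_front();  // assume the update is acknowledged right away
  }
  return master.events;
}
```

      In this model `simulate()` returns a single event carrying the TASK_STARTING status together with state TASK_RUNNING. The update whose status is TASK_RUNNING is handled by the master but never reaches subscribers, because by then the stored state already equals TASK_RUNNING.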

      This implies that schedulers cannot reliably wait for the status information corresponding to a specific task state, since there is no guarantee that subscribers are notified while that status update is the one included in the status field.

              People

              • Assignee:
                bennoe Benno Evers
                Reporter:
                bennoe Benno Evers
                Shepherd:
                Alex R