Mesos / MESOS-10085

Operator API events are silently dropped on transient authorization failures.

Details

    • Type: Bug
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved

    Description

      One of the purposes of the operator V1 API events is to allow subscribers to maintain an up-to-date view of the master's state: in response to a SUBSCRIBE call, the event subscriber first receives an initial view of the master's state and then receives updates to that view in the form of `Event`s.

      The parts of the state, and the updates to them, that the subscriber's principal is not authorized to see are filtered out by the objectApprover::approve() method.

      If the authorization check itself fails (as opposed to denying access), `approve()` returns an Error.
      Currently, the event filtering code handles `false` (i.e. not authorized) and Error in the same way: the event is dropped.
      (See https://github.com/apache/mesos/blob/f8a3dd334934094ec44e07fa350f958d218bc78f/src/common/http.hpp#L414 and, for example, https://github.com/apache/mesos/blob/f8a3dd334934094ec44e07fa350f958d218bc78f/src/master/master.cpp#L12257 )
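The conflation described above can be modeled in a few lines. This is a simplified Python sketch, not the Mesos C++ code: `AuthzError`, `ApprovalResult` and `filter_events_current` are hypothetical names introduced only for illustration, and the three-valued result (authorized, not authorized, error) stands in for what `approve()` returns.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class AuthzError:
    """Hypothetical stand-in for the Error case of the approver's result."""
    message: str


# True = authorized, False = not authorized, AuthzError = the check itself failed.
ApprovalResult = Union[bool, AuthzError]


def filter_events_current(
    events: List[str],
    approve: Callable[[str], ApprovalResult],
) -> List[str]:
    """Models the reported behavior: False and Error both drop the event."""
    delivered = []
    for event in events:
        result = approve(event)
        if result is True:
            delivered.append(event)
        # Both False and AuthzError fall through here, so the subscriber
        # cannot tell "filtered out by policy" from "authorization failed".
    return delivered


denied = filter_events_current(["TASK_ADDED"], lambda e: False)
failed = filter_events_current(
    ["TASK_ADDED"], lambda e: AuthzError("authorizer temporarily unavailable"))
```

In this model, `denied` and `failed` are identical empty lists, which is exactly why the subscriber cannot react to a transient failure.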

      In the presence of transient authorization failures, this can lead to inconsistencies in the event stream. The simplest example would be receiving a TASK_UPDATED event without ever receiving TASK_ADDED for the task in question.
      Such inconsistencies may leave the subscriber unable to maintain a correct view of the master's state.
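The TASK_UPDATED-without-TASK_ADDED scenario can be reproduced with a toy model. The sketch below is illustrative Python, not Mesos code: `FlakyApprover`, `deliver` and `apply_events` are hypothetical helpers, the task ID and states are made up, and `None` is used here to represent the Error result.

```python
from typing import Dict, List, Optional, Tuple

# (event type, task id, task state); a deliberately simplified event shape.
Event = Tuple[str, str, str]


class FlakyApprover:
    """Hypothetical approver whose first call fails transiently, then approves."""

    def __init__(self) -> None:
        self.calls = 0

    def approve(self, event: Event) -> Optional[bool]:
        self.calls += 1
        # None models the Error result; True models "authorized".
        return None if self.calls == 1 else True


def deliver(events: List[Event], approver: FlakyApprover) -> List[Event]:
    # "Not authorized" and "error" are handled the same way: the event is dropped.
    return [e for e in events if approver.approve(e) is True]


def apply_events(events: List[Event]) -> Tuple[Dict[str, str], List[str]]:
    """Subscriber-side view: task id -> state, plus updates for unknown tasks."""
    view: Dict[str, str] = {}
    orphans: List[str] = []
    for kind, task_id, state in events:
        if kind == "TASK_ADDED":
            view[task_id] = state
        elif kind == "TASK_UPDATED":
            if task_id in view:
                view[task_id] = state
            else:
                orphans.append(task_id)  # update for a task never seen as added
    return view, orphans


stream = [
    ("TASK_ADDED", "task-1", "TASK_STAGING"),
    ("TASK_UPDATED", "task-1", "TASK_RUNNING"),
]
received = deliver(stream, FlakyApprover())
view, orphans = apply_events(received)
```

Here the single transient failure swallows TASK_ADDED, so the subscriber's `view` stays empty while `orphans` records an update for a task it never learned about.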

      One option for fixing this issue is to disconnect the subscriber when an authorization failure occurs, so that it receives the master's full state again when it resubscribes.
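A rough sketch of that option, again as a Python model rather than the actual Mesos implementation: `Subscriber` and `send_or_disconnect` are hypothetical names, and `None` stands in for the Error result. The point is only that the error case gets its own branch instead of sharing the "drop" path.

```python
from typing import Callable, List, Optional


class Subscriber:
    """Hypothetical stand-in for an event subscriber's connection."""

    def __init__(self) -> None:
        self.received: List[str] = []
        self.connected = True

    def send(self, event: str) -> None:
        self.received.append(event)

    def close(self) -> None:
        self.connected = False


def send_or_disconnect(
    subscriber: Subscriber,
    events: List[str],
    approve: Callable[[str], Optional[bool]],
) -> None:
    """Deliver approved events; on an authorization error, disconnect."""
    for event in events:
        result = approve(event)
        if result is None:
            # Error: the stream can no longer be trusted to be complete, so
            # close it and let the subscriber resubscribe for a fresh snapshot.
            subscriber.close()
            return
        if result:
            subscriber.send(event)
        # result is False: legitimately filtered out, keep the stream open.


flaky = Subscriber()
send_or_disconnect(flaky, ["a", "b", "c"], lambda e: None if e == "b" else True)

denied = Subscriber()
send_or_disconnect(denied, ["a", "b", "c"], lambda e: e != "b")
```

In this model a policy denial still silently skips the event, while an error ends the stream; the trade-off is extra resubscriptions when the authorizer is flaky, in exchange for a view that is never silently stale.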

      Note that this issue also existed before the introduction of synchronous authorization (in Mesos 1.9 and earlier); back then, the transient errors occurred in the `Authorizer::getObjectApprover()` method, which was called per event (as opposed to per subscriber, as it has been since synchronous authz was introduced).

      A similar issue is present in the processing of Operator API calls, including the SUBSCRIBE call: objects are silently dropped on transient authorization failures (see MESOS-10099).

            People

              dzhu Dong Zhu
              asekretenko Andrei Sekretenko
              Andrei Sekretenko Andrei Sekretenko