  1. Mesos
  2. MESOS-9537

SLRP sends inconsistent status updates for dropped operations.

Details

    Description

      The bug manifests in the following scenario:
      1. Upon receiving profile updates, the storage local resource provider (SLRP) sends an UPDATE_STATE to the agent with a new resource version.
      2. At the same time, the agent sends an APPLY_OPERATION to the SLRP with the original resource version.
      3. Because of the resource version mismatch, the SLRP asks the status update manager (SUM) to send an OPERATION_DROPPED to the framework. This status update must be acknowledged. The SLRP then simply discards the operation (i.e., it keeps no bookkeeping for it).
      4. The agent finds a missing operation in the UPDATE_STATE so it sends a RECONCILE_OPERATIONS.
      5. The SLRP asks the SUM to reply with an OPERATION_DROPPED to the agent (without a framework ID set) because it no longer knows about the operation.
      6. The SUM returns an error because the second OPERATION_DROPPED has no framework ID and is therefore inconsistent with the first one, which did.
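
      The sequence in steps 3 to 6 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Mesos implementation: `ToyStatusUpdateManager`, `StatusUpdate`, `replay_scenario`, and the IDs `op-1` and `framework-A` are all hypothetical names, and the consistency check is assumed to compare framework IDs of updates for the same operation while the first update is still unacknowledged.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class StatusUpdate:
    operation_uuid: str
    state: str
    framework_id: Optional[str]  # None models "no framework ID set"


class ToyStatusUpdateManager:
    """Hypothetical stand-in for the SUM; not the Mesos implementation."""

    def __init__(self) -> None:
        # Updates still waiting for an acknowledgement, keyed by operation UUID.
        self.pending: Dict[str, StatusUpdate] = {}

    def update(self, new: StatusUpdate) -> None:
        old = self.pending.get(new.operation_uuid)
        # Assumed check: updates for one operation must agree on the framework ID.
        if old is not None and old.framework_id != new.framework_id:
            raise ValueError(
                f"inconsistent framework ID for operation {new.operation_uuid}: "
                f"{old.framework_id!r} vs {new.framework_id!r}"
            )
        self.pending[new.operation_uuid] = new


def replay_scenario() -> str:
    sum_ = ToyStatusUpdateManager()
    # Step 3: dropped because of the resource version mismatch; framework ID known.
    sum_.update(StatusUpdate("op-1", "OPERATION_DROPPED", "framework-A"))
    # Step 5: reconciliation; the SLRP kept no bookkeeping, so no framework ID.
    try:
        sum_.update(StatusUpdate("op-1", "OPERATION_DROPPED", None))
    except ValueError as error:
        # Step 6: the second update is rejected as inconsistent.
        return f"error: {error}"
    return "ok"
```

      In this model, calling `replay_scenario()` takes the error path, because the first update is still pending when the second one arrives without a framework ID. If the SLRP had kept a record of the dropped operation, it could have sent the second update with the same framework ID and the check would pass.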

          People

            chhsia0 Chun-Hung Hsiao
            chhsia0 Chun-Hung Hsiao
            Benjamin Bannier Benjamin Bannier