Mesos / MESOS-8536

Pending offer operations on resource provider resources not properly accounted for in allocator


Details

    • Mesosphere Sprint 74, Mesosphere Sprint 75, Mesosphere Sprint 76
    • 2

    Description

      On master failover, the master currently does not accumulate the resources used by offer operations. We create a data structure to hold this information, but we never update it.

      hashmap<FrameworkID, Resources> usedByOperations;
      
      if (provider.newOperations.isSome()) {
        foreachpair (const id::UUID& uuid,
                     const Operation& operation,
                     provider.newOperations.get()) {
          // Update to bookkeeping of operations.
          CHECK(!slave->operations.contains(uuid))
            << "New operation " << uuid.toString() << " is already known";
      
          Framework* framework = nullptr;
          if (operation.has_framework_id()) {
            framework = getFramework(operation.framework_id());
          }
      
          addOperation(framework, slave, new Operation(operation));
        }
      }
      
      allocator->addResourceProvider(
          slaveId,
          provider.newTotal.get(),
          usedByOperations);
      

      Here usedByOperations is never updated inside the loop, so it is still empty when it is passed to allocator->addResourceProvider.
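
      One possible shape for the missing accumulation is sketched below. It is a self-contained model, not the actual Mesos patch: FrameworkId, Resources, Operation, and accumulateUsedByOperations are hypothetical stand-ins for the Mesos types, resources are reduced to a single integer amount, and skipping terminal operations and operations without a framework is an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Mesos types.
using FrameworkId = std::string;
using Resources = long;  // A single scalar amount, e.g. MB of disk.

struct Operation {
  std::string uuid;
  FrameworkId frameworkId;  // Empty if the operation has no framework.
  Resources consumed;       // Resources the operation uses while pending.
  bool terminal;            // True once the operation has finished.
};

// Sums the resources used by pending operations, keyed by framework.
// Assumption: terminal operations and operations without a framework
// do not count as "used".
std::map<FrameworkId, Resources> accumulateUsedByOperations(
    const std::vector<Operation>& operations)
{
  std::map<FrameworkId, Resources> usedByOperations;

  for (const Operation& operation : operations) {
    if (operation.terminal || operation.frameworkId.empty()) {
      continue;
    }

    // operator[] value-initializes a missing entry to 0 before adding.
    usedByOperations[operation.frameworkId] += operation.consumed;
  }

  return usedByOperations;
}
```

      In this model, the resulting map is what would be handed to the allocator instead of an empty one.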

      This causes problems when an operation becomes terminal and we try to recover the resources it used, since the framework sorter inside the hierarchical allocator might not know about them.
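
      The failure mode can be illustrated with a toy model. ToySorter below is a hypothetical, heavily simplified stand-in for a framework sorter, not the Mesos class; it only tracks one integer amount per framework and reports an inconsistent recovery by returning false. How the actual allocator reacts in this situation is not modeled here.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a framework sorter.
// It tracks a single allocated amount per framework.
class ToySorter {
public:
  // Records that `amount` is now allocated to `framework`.
  void allocated(const std::string& framework, long amount) {
    allocations_[framework] += amount;
  }

  // Tries to give back `amount` previously allocated to `framework`.
  // Returns false if the sorter has no matching allocation on record,
  // which is the inconsistent state described above.
  bool unallocated(const std::string& framework, long amount) {
    auto it = allocations_.find(framework);
    if (it == allocations_.end() || it->second < amount) {
      return false;
    }

    it->second -= amount;
    if (it->second == 0) {
      allocations_.erase(it);
    }
    return true;
  }

private:
  std::map<std::string, long> allocations_;
};
```

      If the resources used by a pending operation were never reported via allocated(), a later unallocated() call for that operation fails in this model; reporting them first makes the same call succeed.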

            People

              bbannier Benjamin Bannier
              bbannier Benjamin Bannier
              Greg Mann Greg Mann