MESOS-9889

Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.

      Description

      The code in question is at https://github.com/apache/mesos/blob/9932550e9632e7fbb9a45b217793c7f508f57001/src/master/master.cpp#L7707-L7708:

      void Master::__reregisterSlave(
      ...
          foreachkey (FrameworkID frameworkId,
                     slaves.unreachableTasks.at(slaveInfo.id())) {
              ...
              foreach (TaskID taskId,
                       slaves.unreachableTasks.at(slaveInfo.id()).get(frameworkId)) {
      

      In our case, when the network flaps and 3 to 4 agents reregister, the master's CPU is saturated and it cannot process any requests during that period.

      After making this change:

      -    foreachkey (FrameworkID frameworkId,
      -               slaves.unreachableTasks.at(slaveInfo.id())) {
      +    foreach (FrameworkID frameworkId,
      +               slaves.unreachableTasks.at(slaveInfo.id()).keys()) {
      

      The problem is gone.
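
      One possible explanation, stated here as an assumption and not confirmed in this report: if the per-agent task map is a multimap-like container, iterating over its entries yields each framework ID once per task instead of once per framework, so the nested loop does quadratic work. The sketch below uses std::unordered_multimap as a stand-in; `TaskMap`, `visitsPerEntry` and `visitsPerUniqueKey` are hypothetical names and are not Mesos or stout code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the per-agent map of framework ID -> task ID.
// This is an assumption for illustration, not the actual Mesos type.
using TaskMap = std::unordered_multimap<std::string, std::string>;

// Walks every entry (so a key with N tasks is seen N times) and, for each
// one, walks all tasks stored under that key. Returns the number of
// inner-loop iterations performed.
std::size_t visitsPerEntry(const TaskMap& tasks) {
  std::size_t visits = 0;
  for (const auto& entry : tasks) {
    auto range = tasks.equal_range(entry.first);
    for (auto it = range.first; it != range.second; ++it) {
      ++visits;
    }
  }
  return visits;
}

// Collects the distinct keys first, then walks each key's tasks once.
// Returns the number of inner-loop iterations performed.
std::size_t visitsPerUniqueKey(const TaskMap& tasks) {
  std::unordered_set<std::string> keys;
  for (const auto& entry : tasks) {
    keys.insert(entry.first);
  }

  std::size_t visits = 0;
  for (const auto& key : keys) {
    auto range = tasks.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
      ++visits;
    }
  }
  return visits;
}
```

      With 1000 tasks under a single framework, `visitsPerEntry` performs 1,000,000 inner iterations while `visitsPerUniqueKey` performs 1,000, which would be consistent with the CPU spike described above when agents with many tasks reregister.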

              People

              • Assignee: Benjamin Mahler (bmahler)
              • Reporter: haosdent (haosdent@gmail.com)