Mesos / MESOS-2507

Performance issue in the master when a large number of slaves are registering.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: master
    • Sprint: Twitter Q2 Sprint 3 - 5/11
    • Story Points: 5

    Description

      In large clusters, when many slaves register at the same time, the master becomes backlogged processing registration requests. Profiling the master with perf revealed the following:

      Events: 14K cycles
       25.44%  libmesos-0.22.0-x.so  [.] mesos::internal::master::Master::registerSlave(process::UPID const&, mesos::SlaveInfo const&, std::vector<mesos::Resource, std::allocator<mesos::Resource> > cons
       11.18%  libmesos-0.22.0-x.so  [.] pipecb
        5.88%  libc-2.5.so             [.] malloc_consolidate
        5.33%  libc-2.5.so             [.] _int_free
        5.25%  libc-2.5.so             [.] malloc
        5.23%  libc-2.5.so             [.] _int_malloc
        4.11%  libstdc++.so.6.0.8      [.] std::string::assign(std::string const&)
        3.22%  libmesos-0.22.0-x.so  [.] mesos::Resource::SharedDtor()
        3.10%  [kernel]                [k] _raw_spin_lock
        1.97%  libmesos-0.22.0-x.so  [.] mesos::Attribute::SharedDtor()
        1.28%  libc-2.5.so             [.] memcmp
        1.08%  libc-2.5.so             [.] free
      

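      This is likely because we loop over all registered slaves for each registration request, so each request costs time linear in the number of registered slaves and a burst of N registrations costs on the order of N^2 comparisons: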

      void Master::registerSlave(
          const UPID& from,
          const SlaveInfo& slaveInfo,
          const vector<Resource>& checkpointedResources,
          const string& version)
      {
        // ...
      
        // Check if this slave is already registered (because it retries).
        foreachvalue (Slave* slave, slaves.registered) {
          if (slave->pid == from) {
            // ...
          }
        }
        // ...
      }
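
      One way to avoid the per-request scan is to keep a secondary index keyed by PID, so the retry check becomes a hash lookup. The following is only a minimal sketch of that idea, not the change that resolved this issue: SlaveRegistry, findByPid, add, remove and the string-based Pid alias are hypothetical names standing in for the master's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for process::UPID; the real type is not a string.
using Pid = std::string;

// Hypothetical, heavily reduced stand-in for the master's Slave struct.
struct Slave {
  std::string id;
  Pid pid;
};

// Hypothetical registry that maintains a PID -> slave id index next to the
// primary id -> slave map, so a retried registration can be detected
// without iterating over every registered slave.
class SlaveRegistry {
public:
  // Returns the registered slave with this PID, or nullptr if there is none.
  // Average O(1), compared with a linear scan over all slaves.
  const Slave* findByPid(const Pid& pid) const {
    auto it = byPid_.find(pid);
    if (it == byPid_.end()) {
      return nullptr;
    }
    // Invariant: the index never points at a slave that was removed.
    assert(slaves_.count(it->second) == 1);
    return &slaves_.at(it->second);
  }

  // Registers a slave. Returns false if its PID or id is already known,
  // which is how a retried registration would show up here.
  bool add(const Slave& slave) {
    if (byPid_.count(slave.pid) > 0 || slaves_.count(slave.id) > 0) {
      return false;
    }
    slaves_.emplace(slave.id, slave);
    byPid_.emplace(slave.pid, slave.id);
    return true;
  }

  // Removes a slave and keeps the PID index in sync with the primary map.
  void remove(const std::string& id) {
    auto it = slaves_.find(id);
    if (it == slaves_.end()) {
      return;
    }
    byPid_.erase(it->second.pid);
    slaves_.erase(it);
  }

  std::size_t size() const { return slaves_.size(); }

private:
  std::unordered_map<std::string, Slave> slaves_;  // slave id -> slave
  std::unordered_map<Pid, std::string> byPid_;     // PID -> slave id
};
```

      With such an index, the retry check in a registration handler would reduce to a single findByPid(from) call. The trade-off is that every code path that adds or removes a slave must also update the index.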
      

          People

            Assignee: Benjamin Mahler (bmahler)
            Reporter: Benjamin Mahler (bmahler)