Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-3051

performance issues with port ranges comparison

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.1
    • Fix Version/s: 0.24.2, 0.25.0
    • Component/s: allocation
    • Labels:
    • Target Version/s:
    • Sprint:
      Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19
    • Story Points:
      8

      Description

      Testing in an environment with lots of frameworks (>200), where the frameworks permanently decline resources they don't need. The allocator ends up spending a lot of time figuring out whether offers are refused (the code path through HierarchicalAllocatorProcess::isFiltered().

      In profiling a synthetic benchmark, it turns out that comparing port ranges is very expensive, involving many temporary allocations. 61% of Resources::contains() run time is in operator -= (Resource). 35% of Resources::contains() run time is in Resources::_contains().

      The heaviest call chain through Resources::_contains is:

      Running Time          Self (ms)         Symbol Name
      7237.0ms   35.5%          4.0            mesos::Resources::_contains(mesos::Resource const&) const
      7200.0ms   35.3%          1.0             mesos::contains(mesos::Resource const&, mesos::Resource const&)
      7133.0ms   35.0%        121.0              mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
      6319.0ms   31.0%          7.0               mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Ranges const&)
      6240.0ms   30.6%        161.0                mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
      1867.0ms    9.1%         25.0                 mesos::Value_Ranges::add_range()
      1694.0ms    8.3%          4.0                 mesos::Value_Ranges::~Value_Ranges()
      1495.0ms    7.3%         16.0                 mesos::Value_Ranges::operator=(mesos::Value_Ranges const&)
       445.0ms    2.1%         94.0                 mesos::Value_Range::MergeFrom(mesos::Value_Range const&)
       154.0ms    0.7%         24.0                 mesos::Value_Ranges::range(int) const
       103.0ms    0.5%         24.0                 mesos::Value_Ranges::range_size() const
        95.0ms    0.4%          2.0                 mesos::Value_Range::Value_Range(mesos::Value_Range const&)
        59.0ms    0.2%          4.0                 mesos::Value_Ranges::Value_Ranges()
        50.0ms    0.2%         50.0                 mesos::Value_Range::begin() const
        28.0ms    0.1%         28.0                 mesos::Value_Range::end() const
        26.0ms    0.1%          0.0                 mesos::Value_Range::~Value_Range()
      

      mesos::coalesce(Value_Ranges) gets done a lot and ends up being really expensive. The heaviest parts of the inverted call chain are:

      Running Time	Self (ms)		Symbol Name
      3209.0ms   15.7%	3209.0	 	mesos::Value_Range::~Value_Range()
      3209.0ms   15.7%	0.0	 	 google::protobuf::internal::GenericTypeHandler<mesos::Value_Range>::Delete(mesos::Value_Range*)
      3209.0ms   15.7%	0.0	 	  void google::protobuf::internal::RepeatedPtrFieldBase::Destroy<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>()
      3209.0ms   15.7%	0.0	 	   google::protobuf::RepeatedPtrField<mesos::Value_Range>::~RepeatedPtrField()
      3209.0ms   15.7%	0.0	 	    google::protobuf::RepeatedPtrField<mesos::Value_Range>::~RepeatedPtrField()
      3209.0ms   15.7%	0.0	 	     mesos::Value_Ranges::~Value_Ranges()
      3209.0ms   15.7%	0.0	 	      mesos::Value_Ranges::~Value_Ranges()
      2441.0ms   11.9%	0.0	 	       mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
       452.0ms    2.2%	0.0	 	       mesos::remove(mesos::Value_Ranges*, mesos::Value_Range const&)
       169.0ms    0.8%	0.0	 	       mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
        82.0ms    0.4%	0.0	 	       mesos::operator-=(mesos::Value_Ranges&, mesos::Value_Ranges const&)
        65.0ms    0.3%	0.0	 	       mesos::Value_Ranges::~Value_Ranges()
      
      2541.0ms   12.4%	2541.0	 	google::protobuf::internal::GenericTypeHandler<mesos::Value_Range>::New()
      2541.0ms   12.4%	0.0	 	 google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler::Type* google::protobuf::internal::RepeatedPtrFieldBase::Add<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>()
      2305.0ms   11.3%	0.0	 	  google::protobuf::RepeatedPtrField<mesos::Value_Range>::Add()
      2305.0ms   11.3%	0.0	 	   mesos::Value_Ranges::add_range()
      1962.0ms    9.6%	0.0	 	    mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
       343.0ms    1.6%	0.0	 	    mesos::ranges::add(mesos::Value_Ranges*, long long, long long)
      
      236.0ms    1.1%	0.0	 	  void google::protobuf::internal::RepeatedPtrFieldBase::MergeFrom<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>(google::protobuf::internal::RepeatedPtrFieldBase const&)
      1471.0ms    7.2%	1471.0	 	google::protobuf::internal::RepeatedPtrFieldBase::Reserve(int)
      1333.0ms    6.5%	0.0	 	 google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler::Type* google::protobuf::internal::RepeatedPtrFieldBase::Add<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>()
      1333.0ms    6.5%	0.0	 	  google::protobuf::RepeatedPtrField<mesos::Value_Range>::Add()
      1333.0ms    6.5%	0.0	 	   mesos::Value_Ranges::add_range()
      1086.0ms    5.3%	0.0	 	    mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
       247.0ms    1.2%	0.0	 	    mesos::ranges::add(mesos::Value_Ranges*, long long, long long)
      
      107.0ms    0.5%	0.0	 	 void google::protobuf::internal::RepeatedPtrFieldBase::MergeFrom<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>(google::protobuf::internal::RepeatedPtrFieldBase const&)
      107.0ms    0.5%	0.0	 	  google::protobuf::RepeatedPtrField<mesos::Value_Range>::MergeFrom(google::protobuf::RepeatedPtrField<mesos::Value_Range> const&)
      107.0ms    0.5%	0.0	 	   mesos::Value_Ranges::MergeFrom(mesos::Value_Ranges const&)
      105.0ms    0.5%	0.0	 	    mesos::Value_Ranges::CopyFrom(mesos::Value_Ranges const&)
      105.0ms    0.5%	0.0	 	     mesos::Value_Ranges::operator=(mesos::Value_Ranges const&)
      104.0ms    0.5%	0.0	 	      mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
        1.0ms    0.0%	0.0	 	      mesos::remove(mesos::Value_Ranges*, mesos::Value_Range const&)
        2.0ms    0.0%	0.0	 	    mesos::Resource::MergeFrom(mesos::Resource const&)
        2.0ms    0.0%	0.0	 	     google::protobuf::internal::GenericTypeHandler<mesos::Resource>::Merge(mesos::Resource const&, mesos::Resource*)
        2.0ms    0.0%	0.0	 	      void google::protobuf::internal::RepeatedPtrFieldBase::MergeFrom<google::protobuf::RepeatedPtrField<mesos::Resource>::TypeHandler>(google::protobuf::internal::RepeatedPtrFieldBase const&)
       29.0ms    0.1%	0.0	 	 void google::protobuf::internal::RepeatedPtrFieldBase::MergeFrom<google::protobuf::RepeatedPtrField<mesos::Resource>::TypeHandler>(google::protobuf::internal::RepeatedPtrFieldBase const&)
      
      898.0ms    4.4%	898.0	 	google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler::Type* google::protobuf::internal::RepeatedPtrFieldBase::Add<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>()
      517.0ms    2.5%	0.0	 	 google::protobuf::RepeatedPtrField<mesos::Value_Range>::Add()
      517.0ms    2.5%	0.0	 	  mesos::Value_Ranges::add_range()
      429.0ms    2.1%	0.0	 	   mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
       88.0ms    0.4%	0.0	 	   mesos::ranges::add(mesos::Value_Ranges*, long long, long long)
      379.0ms    1.8%	0.0	 	 void google::protobuf::internal::RepeatedPtrFieldBase::MergeFrom<google::protobuf::RepeatedPtrField<mesos::Value_Range>::TypeHandler>(google::protobuf::internal::RepeatedPtrFieldBase const&)
      

        Issue Links

          Activity

          Hide
          kaysoky Joseph Wu added a comment -

          Filed MESOS-5425 to follow up on further performance improvements.

          Show
          kaysoky Joseph Wu added a comment - Filed MESOS-5425 to follow up on further performance improvements.
          Hide
          yanyanhu Yanyan Hu added a comment -

          Hi, guys, I'm now trying to use Mesos to manage a container cluster in large scale. And I'm using Mesos-0.25.0 with Marathon stays upon it. But when I made test using this environment, I found we still suffered from this issue when Marathon allocated port resource randomly.

          In my test, three Mesos-slaves were activated with each one has available port resource of [31000-37000]. Then I tried to created more than 3000 tasks in three slave nodes. I found when task amount reached 3000, it cost nearly 800 milisecond to finish a calculation of "Resources available = slaves[slaveId].total - slaves[slaveId].allocated
          " which is performed in HierarchicalAllocatorProcess::allocate() function:
          https://github.com/apache/mesos/blob/0.25.0/src/master/allocator/mesos/hierarchical.hpp#L1284

          Since I have three Mesos-slaves, the total time consumption of each invoking for allocate() function is more than 2 seconds which make the performance of Mesos-master very terrible.

          So I tried to made a simple test to evaluate the performance of "Ranges" value calculation. I found the performance of subtraction operation is still not good:

          e.g. res1 = [1-6000], res2 = [1-1, 3-3, 5-5, ...]
          I changed the range_size of res2 and recorded the execution time for "res1 -= res", the result is as followed:
          (Test was done in a x86 VM which has 4 process cores and 16GB memory)

          res2 range_size execution time(second)
          1 0.003 (0.002 in kernel mode)
          100 0.011
          200 0.031
          400 0.121
          800 0.533
          1600 2.157

          By comparison, the performance of addition and comparison operations are much better. So looks like the current fix haven't completely resolved this problem. Based on our test, the Mesos-master's performance seriously suffered from this issue when task amount is more than 10000 with 20 activated Mesos-slave nodes.

          I haven't tried latest Mesos release, but after checking the code of src/common/values.cpp in master branch, I found the implementation of "Ranges" data type is almost the same as in 0.25.0 release:
          https://github.com/apache/mesos/blob/master/src/common/values.cpp
          https://github.com/apache/mesos/blob/0.25.0/src/common/values.cpp

          So I guess the problem is still there? So is there any way we can further optimize the implementation of "Ranges" data type so we can avoid this performance bottleneck? Thanks.

          Show
          yanyanhu Yanyan Hu added a comment - Hi, guys, I'm now trying to use Mesos to manage a container cluster in large scale. And I'm using Mesos-0.25.0 with Marathon stays upon it. But when I made test using this environment, I found we still suffered from this issue when Marathon allocated port resource randomly. In my test, three Mesos-slaves were activated with each one has available port resource of [31000-37000] . Then I tried to created more than 3000 tasks in three slave nodes. I found when task amount reached 3000, it cost nearly 800 milisecond to finish a calculation of "Resources available = slaves [slaveId] .total - slaves [slaveId] .allocated " which is performed in HierarchicalAllocatorProcess::allocate() function: https://github.com/apache/mesos/blob/0.25.0/src/master/allocator/mesos/hierarchical.hpp#L1284 Since I have three Mesos-slaves, the total time consumption of each invoking for allocate() function is more than 2 seconds which make the performance of Mesos-master very terrible. So I tried to made a simple test to evaluate the performance of "Ranges" value calculation. I found the performance of subtraction operation is still not good: e.g. res1 = [1-6000] , res2 = [1-1, 3-3, 5-5, ...] I changed the range_size of res2 and recorded the execution time for "res1 -= res", the result is as followed: (Test was done in a x86 VM which has 4 process cores and 16GB memory) res2 range_size execution time(second) 1 0.003 (0.002 in kernel mode) 100 0.011 200 0.031 400 0.121 800 0.533 1600 2.157 By comparison, the performance of addition and comparison operations are much better. So looks like the current fix haven't completely resolved this problem. Based on our test, the Mesos-master's performance seriously suffered from this issue when task amount is more than 10000 with 20 activated Mesos-slave nodes. I haven't tried latest Mesos release, but after checking the code of src/common/values.cpp in master branch, I found the implementation of "Ranges" data type is almost the same as in 0.25.0 release: https://github.com/apache/mesos/blob/master/src/common/values.cpp https://github.com/apache/mesos/blob/0.25.0/src/common/values.cpp So I guess the problem is still there? So is there any way we can further optimize the implementation of "Ranges" data type so we can avoid this performance bottleneck? Thanks.
          Hide
          anandmazumdar Anand Mazumdar added a comment -

          Jörg Schad, Can we file a followup JIRA based on comments by Jie Yu to consider using IntervalSet in the near future ? Ignore my comments if one already exists.

          Show
          anandmazumdar Anand Mazumdar added a comment - Jörg Schad , Can we file a followup JIRA based on comments by Jie Yu to consider using IntervalSet in the near future ? Ignore my comments if one already exists.
          Hide
          mcypark Michael Park added a comment -
          commit 4f9dd54cad125f9059cb0646d29f3e8ea680128f
          Author: Joerg Schad <joerg@mesosphere.io>
          Date:   Wed Sep 23 09:42:16 2015 -0700
          
              Refactored coalesce() for Value::Ranges.
          
              The goal of this refactoring was to reuse the Ranges objects as much as
              possible, as prior there was substantial time spend in
              allocation/destruction (MESOS-3051).
          
              Review: https://reviews.apache.org/r/38158
          
          Show
          mcypark Michael Park added a comment - commit 4f9dd54cad125f9059cb0646d29f3e8ea680128f Author: Joerg Schad <joerg@mesosphere.io> Date: Wed Sep 23 09:42:16 2015 -0700 Refactored coalesce() for Value::Ranges. The goal of this refactoring was to reuse the Ranges objects as much as possible, as prior there was substantial time spend in allocation/destruction (MESOS-3051). Review: https://reviews.apache.org/r/38158
          Hide
          nnielsen Niklas Quarfot Nielsen added a comment -

          Till Toenshoff Jörg Schad - Will you guys have this landed by tomorrow EOD?

          Show
          nnielsen Niklas Quarfot Nielsen added a comment - Till Toenshoff Jörg Schad - Will you guys have this landed by tomorrow EOD?
          Hide
          jieyu Jie Yu added a comment -

          Joris, totally! I have no intention to block you guys at all. Just thought it'll be great if this part of the code base can be cleaned up later using IntervalSet.

          Show
          jieyu Jie Yu added a comment - Joris, totally! I have no intention to block you guys at all. Just thought it'll be great if this part of the code base can be cleaned up later using IntervalSet.
          Hide
          jvanremoortere Joris Van Remoortere added a comment -

          Hi Jie,
          At the moment I think switching to IntervalSet would be a significant refactor in its own right; though I agree we should start a discussion around it.
          In the mean time I'd like to help Joerg clean up the current implementation and make sure that it scales for the multiple Mesos adopters that have started using the 100+ framework model.

          We would appreciate it if you had some insights as to the trade-offs between the 2 models to seed the longer term discussion.

          Thanks!

          Show
          jvanremoortere Joris Van Remoortere added a comment - Hi Jie, At the moment I think switching to IntervalSet would be a significant refactor in its own right; though I agree we should start a discussion around it. In the mean time I'd like to help Joerg clean up the current implementation and make sure that it scales for the multiple Mesos adopters that have started using the 100+ framework model. We would appreciate it if you had some insights as to the trade-offs between the 2 models to seed the longer term discussion. Thanks!
          Hide
          jieyu Jie Yu added a comment -

          Have you guys considered leveraging IntervalSet in stout (https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp) and get rid of those manual coalesce functions?

          Show
          jieyu Jie Yu added a comment - Have you guys considered leveraging IntervalSet in stout ( https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp ) and get rid of those manual coalesce functions?
          Hide
          js84 Jörg Schad added a comment - - edited

          The proposed approach is to reduce the number of new Ranges objects and try to reuse existing objects as much as possible to avoid the allocations/destructions seen above.
          We add an optimized

          coalesce(Value::Ranges* ranges)

          function.
          Furthermore we try to add invariants to Ranges (or least the coalesce function(s)).
          All coalesce functions now have the preconditions that their input range collection is sorted in ascending order of the begin value of each range. We feel this assumption is plausible as the coalesce is only called from values.cpp.

          void coalesce(Value::Ranges* ranges)
          The algorithm is the following: For each range we try to merge it with the following ranges (due to the sort precondition we don't
          have to check only until the first non-matching range). If ranges are overlapping/neighboring, we merge them and remove the merged ranges. If ranges are disjoint,
          we stop and continue checking from the next range.

          void coalesce(Value::Ranges* ranges, const Value::Range& range)
          The algorithm is the following: Recall that 'ranges' We try to to find the first range overlapping/neighboring with the added 'range' and merge both.
          If this extends the range (i.e. new end) we check whether further ranges need to be merged.

          Show
          js84 Jörg Schad added a comment - - edited The proposed approach is to reduce the number of new Ranges objects and try to reuse existing objects as much as possible to avoid the allocations/destructions seen above. We add an optimized coalesce(Value::Ranges* ranges) function. Furthermore we try to add invariants to Ranges (or least the coalesce function(s)). All coalesce functions now have the preconditions that their input range collection is sorted in ascending order of the begin value of each range. We feel this assumption is plausible as the coalesce is only called from values.cpp. void coalesce(Value::Ranges* ranges) The algorithm is the following: For each range we try to merge it with the following ranges (due to the sort precondition we don't have to check only until the first non-matching range). If ranges are overlapping/neighboring, we merge them and remove the merged ranges. If ranges are disjoint, we stop and continue checking from the next range. void coalesce(Value::Ranges* ranges, const Value::Range& range) The algorithm is the following: Recall that 'ranges' We try to to find the first range overlapping/neighboring with the added 'range' and merge both. If this extends the range (i.e. new end) we check whether further ranges need to be merged.
          Show
          js84 Jörg Schad added a comment - https://reviews.apache.org/r/38158/

            People

            • Assignee:
              js84 Jörg Schad
              Reporter:
              jamespeach James Peach
              Shepherd:
              Till Toenshoff
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development

                  Agile